Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Allows running on multiple VMs to speed up CI

- [x] User manual content
- [x] News
- [x] Test
- [x] Add to cargo-mutants own tests
- [x] Test sharding is applied before shuffling
Showing 11 changed files with 319 additions and 23 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,47 @@ | ||
# Parallelism | ||
|
||
After the initial test of the unmutated tree, cargo-mutants can test multiple | ||
mutants in parallel. This can give significant performance improvements, | ||
depending on the tree under test and the hardware resources available. | ||
After the initial test of the unmutated tree, cargo-mutants can run multiple | ||
builds and tests of the tree in parallel on a single machine. Separately, you can | ||
[shard](shards.md) work across multiple machines. | ||
|
||
**Caution:** `cargo build` and `cargo test` internally spawn many threads and processes and can be very resource hungry. Don't set `--jobs` too high, or your machine may thrash, run out of memory, or overheat. | ||
|
||
## Background | ||
|
||
Even though cargo builds, rustc, and Rust's test framework launch multiple | ||
processes or threads, they typically can't use all available CPU cores all the | ||
time, and many `cargo test` runs will end up using only one core waiting for the | ||
last task to complete. Running multiple jobs in parallel makes use of resources | ||
that would otherwise be idle. | ||
processes or threads, they typically spend some time waiting for straggler tasks, during which time some CPU cores are idle. For example, a cargo build commonly ends up waiting for a single-threaded linker for several seconds. | ||
|
||
Running one or more build or test tasks in parallel can use up this otherwise wasted capacity. | ||
This can give significant performance improvements, depending on the tree under test and the hardware resources available. | ||
|
||
## Timeouts | ||
|
||
Because tests may be slower with high parallelism, or may exhibit more variability in execution time, you may see some spurious timeouts, and you may need to set `--timeout` manually to allow enough safety margin. (User feedback on this is welcome.) | ||
|
||
## Non-hermetic tests | ||
|
||
By default, only one job is run at a time. | ||
If your test suite is non-hermetic -- for example, if it talks to an external database -- then running multiple jobs in parallel may cause test flakes. `cargo-mutants` is just running multiple copies of `cargo test` simultaneously: if that doesn't work in your tree, then you can't use this option. | ||
|
||
To run more, use the `--jobs` or `-j` option, or set the `CARGO_MUTANTS_JOBS` | ||
environment variable. | ||
## Choosing a job count | ||
|
||
Setting this higher than the number of CPU cores is unlikely to be helpful. | ||
You should set the number of jobs very conservatively, starting at `-j2` or `-j3`. | ||
|
||
Higher settings are only likely to be helpful on very large machines, perhaps with >100 cores and >256GB RAM. | ||
|
||
Unlike with `make`, setting `-j` proportionally to the number of cores is unlikely to work out well, because the Rust build and test tools already parallelize very aggressively. | ||
|
||
The best setting will depend on many factors including the behavior of your | ||
program's test suite, the amount of memory on your system, and your system's | ||
behavior under high thermal load. | ||
behavior under high load. Ultimately you'll need to experiment to find the best setting. | ||
|
||
To tune the number of jobs, you can watch `htop` or some similar program while the tests are running, to see whether cores are fully utilized or whether the system is running out of memory. On laptop or desktop machines you might also want to watch the temperature of the CPU. | ||
|
||
As well as using more CPU and RAM, higher `-j` settings will also use more disk space in your temporary directory: Rust `target` directories can commonly be 2GB or more, and there will be one per parallel job, plus whatever temp files your test suite might create. | ||
|
||
## Interaction with `--test-threads` | ||
|
||
The Rust test framework exposes a `--test-threads` option controlling how many threads run inside a test binary. cargo-mutants doesn't set this, but you can set it from the command line, along with other parameters to the test binary. You might need to set this if your test suite is non-hermetic with regard to global process state. | ||
|
||
`-j 4` may be a good starting point. Start there and watch memory and CPU usage, | ||
and tune towards a setting where all cores are fully utilized without apparent | ||
thrashing, memory exhaustion, or thermal issues. | ||
Limiting the number of threads inside a single test binary would tend to make that binary less resource-hungry, and so _might_ allow you to set a higher `-j` option. | ||
|
||
Because tests may be slower with high parallelism, you may see some spurious | ||
timeouts, and you may need to set `--timeout` manually to allow enough safety | ||
margin. | ||
Reducing the number of test threads to increase `-j` seems unlikely to help performance in most trees. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
# Sharding | ||
|
||
In addition to [running multiple jobs locally](parallelism.md), cargo-mutants can also run jobs on multiple machines, to complete the overall job faster. | ||
|
||
Each job tests a subset of mutants, selected by a shard. Shards are described as `k/n`, where `n` is the number of shards and `k` is the index of the shard, from 0 to `n-1`. | ||
|
||
There is no runtime coordination between shards: they each independently discover the available mutants and then select a subset based on the `--shard` option. | ||
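
As a rough illustration of why no coordination is needed, the sketch below assigns mutants to shards by their position in the discovered list, using the same modulo rule as `Shard::select` in this commit. Each shard can therefore compute its own subset independently, provided every shard discovers the same list in the same order. The helper name `select_shard` is hypothetical and exists only for this example.

```rust
/// Return the items whose position in the list is congruent to `k` modulo `n`.
///
/// `select_shard` is a hypothetical helper for illustration only; it is not
/// part of the cargo-mutants command-line interface.
fn select_shard<T: Clone>(items: &[T], k: usize, n: usize) -> Vec<T> {
    assert!(k < n, "shard index k must be less than shard count n");
    items
        .iter()
        .enumerate()
        // Keep only the positions that belong to shard k.
        .filter(|(i, _)| i % n == k)
        .map(|(_, item)| item.clone())
        .collect()
}
```

Calling this for every `k` from 0 to `n-1` over the same list yields disjoint subsets that together cover every item exactly once.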
|
||
If any shard fails then that would indicate that some mutants were missed, or there was some other problem. | ||
|
||
## Consistency across shards | ||
|
||
**CAUTION:** | ||
All shards must be run with the same arguments and the same shard count `n`, or the results will be meaningless, as the shards won't agree on how to divide the work. | ||
|
||
Sharding can be combined with filters or shuffling, as long as the filters are set consistently in all shards. Sharding can also combine with `--in-diff`, again as long as all shards see the same diff. | ||
|
||
## Setting up sharding | ||
|
||
Your CI system or other tooling is responsible for launching multiple shards, and for collecting the results. You're responsible for choosing the number of shards (see below). | ||
|
||
For example, in GitHub Actions, you could use a matrix job to run multiple shards: | ||
|
||
```yaml | ||
cargo-mutants: | ||
  runs-on: ubuntu-latest | ||
  # needs: [build, incremental-mutants] | ||
  strategy: | ||
    matrix: | ||
      shard: [0, 1, 2, 3, 4, 5, 6, 7] | ||
  steps: | ||
    - uses: actions/checkout@v3 | ||
    - uses: dtolnay/rust-toolchain@master | ||
      with: | ||
        toolchain: beta | ||
    - uses: Swatinem/rust-cache@v2 | ||
    - run: cargo install cargo-mutants | ||
    - name: Mutants | ||
      run: | | ||
        cargo mutants --no-shuffle -vV --shard ${{ matrix.shard }}/8 | ||
    - name: Archive mutants.out | ||
      uses: actions/upload-artifact@v3 | ||
      if: always() | ||
      with: | ||
        name: mutants.out | ||
        path: mutants.out | ||
``` | ||
Note that the number of shards is set to match the `/8` in the `--shard` argument. | ||
|
||
## Performance of sharding | ||
|
||
Each shard does some constant upfront work: | ||
|
||
* Any CI setup including starting the machine, getting a checkout, installing a Rust toolchain, and installing cargo-mutants | ||
* An initial clean build of the code under test | ||
* A baseline run of the unmutated code | ||
|
||
Then, for each mutant in its shard, it does an incremental build and runs all the tests. | ||
|
||
Each shard runs the same number of mutants, +/-1. Typically this will mean they each take roughly the same amount of time, although it's possible that some shards are unlucky in drawing mutants that happen to take longer to test. | ||
|
||
A rough model for the overall execution time for all of the shards, allowing for this work occurring in parallel (where `N_SHARDS` is the number of shards, `n`), is | ||
|
||
```raw | ||
SHARD_STARTUP + (CLEAN_BUILD + TEST) + (N_MUTANTS/N_SHARDS) * (INCREMENTAL_BUILD + TEST) | ||
``` | ||
|
||
The total cost in CPU seconds can be modelled as: | ||
|
||
```raw | ||
N_SHARDS * (SHARD_STARTUP + CLEAN_BUILD + TEST) + N_MUTANTS * (INCREMENTAL_BUILD + TEST) | ||
``` | ||
|
||
As a result, with a very large number of shards `n`, the cost of the initial setup work will dominate the total cost, but overall time to solution will be minimized. | ||
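
To make the trade-off concrete, here is a minimal sketch of the two formulas above as Rust functions. The struct and method names are hypothetical, the `shards` argument is the number of shards, and all inputs are in seconds. Any numbers fed into it are assumptions rather than measurements.

```rust
/// Inputs to the rough model, all in seconds. The field names follow the
/// terms in the formulas above; any values used with them are made up.
struct CostModel {
    shard_startup: f64,
    clean_build: f64,
    test: f64,
    incremental_build: f64,
    n_mutants: f64,
}

impl CostModel {
    /// Estimated wall-clock time when all shards run in parallel.
    fn elapsed(&self, shards: f64) -> f64 {
        self.shard_startup
            + (self.clean_build + self.test)
            + (self.n_mutants / shards) * (self.incremental_build + self.test)
    }

    /// Estimated total CPU seconds summed over all shards.
    fn total_cost(&self, shards: f64) -> f64 {
        shards * (self.shard_startup + self.clean_build + self.test)
            + self.n_mutants * (self.incremental_build + self.test)
    }
}
```

With made-up inputs of 60 s startup, 120 s clean build, 30 s test, 10 s incremental build and 800 mutants, the model gives 4210 s elapsed and 33680 CPU seconds at 8 shards, versus 1210 s elapsed and 38720 CPU seconds at 32 shards: elapsed time falls while total cost rises.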
|
||
## Choosing a number of shards | ||
|
||
Because there's some constant overhead for every shard, there will be diminishing returns and increasing inefficiency if you use too many shards. (In the extreme case where there are more shards than mutants, some of them will do the setup work, then find they have nothing to do and immediately exit.) | ||
|
||
As a rule of thumb, you should probably choose `n` such that each worker runs at least 10 mutants, and possibly many more. 8 to 32 shards might be a good place to start. | ||
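
The rule of thumb can be turned into a small upper bound on the shard count. This is only a sketch: `max_useful_shards` is a hypothetical helper, and the minimum of 10 mutants per shard is the rough guideline from the text, not a limit enforced by cargo-mutants.

```rust
/// Largest shard count that still leaves every shard with at least
/// `min_per_shard` mutants, never returning fewer than one shard.
///
/// Hypothetical helper for illustration; not part of cargo-mutants.
fn max_useful_shards(n_mutants: usize, min_per_shard: usize) -> usize {
    assert!(min_per_shard > 0, "min_per_shard must be positive");
    // Integer division rounds down, so no shard drops below the minimum.
    (n_mutants / min_per_shard).max(1)
}
```

For example, a tree with 800 mutants and a minimum of 10 per shard gives an upper bound of 80 shards; cost and CI capacity may well argue for fewer.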
|
||
The optimal setting probably depends on how long your tree takes to build from zero and incrementally, how long the tests take to run, and the performance of your CI system. | ||
|
||
If your CI system offers a choice of VM sizes you might experiment with using smaller or larger VMs and more or fewer shards: the optimal setting probably also depends on your tree's ability to exploit larger machines. | ||
|
||
You should also think about cost and capacity constraints in your CI system, and the risk of starving out other users. | ||
|
||
cargo-mutants has no internal scaling constraints to prevent you from setting `n` very large, if cost, efficiency and CI capacity are not a concern. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
// Copyright 2023 Martin Pool | ||
|
||
//! Sharding parameters. | ||
use std::str::FromStr; | ||
|
||
use anyhow::{anyhow, ensure, Context, Error}; | ||
|
||
/// Select mutants for a particular shard of the total list. | ||
#[derive(Debug, Clone, Copy, Eq, PartialEq)] | ||
pub struct Shard { | ||
    /// Index modulo n. | ||
    pub k: usize, | ||
    /// Modulus of sharding. | ||
    pub n: usize, | ||
} | ||
|
||
impl Shard { | ||
    /// Select the mutants that should be run for this shard. | ||
    pub fn select<M, I: IntoIterator<Item = M>>(&self, mutants: I) -> Vec<M> { | ||
        mutants | ||
            .into_iter() | ||
            .enumerate() | ||
            .filter_map(|(i, m)| if i % self.n == self.k { Some(m) } else { None }) | ||
            .collect() | ||
    } | ||
} | ||
|
||
impl FromStr for Shard { | ||
    type Err = Error; | ||
|
||
    fn from_str(s: &str) -> Result<Self, Self::Err> { | ||
        let parts = s.split_once('/').ok_or(anyhow!("shard must be k/n"))?; | ||
        let k = parts.0.parse().context("shard k")?; | ||
        let n = parts.1.parse().context("shard n")?; | ||
        ensure!(k < n, "shard k must be less than n"); // implies n>0 | ||
        Ok(Shard { k, n }) | ||
    } | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
    use std::str::FromStr; | ||
|
||
    use super::*; | ||
|
||
    #[test] | ||
    fn shard_from_str_valid_input() { | ||
        let shard = Shard::from_str("2/5").unwrap(); | ||
        assert_eq!(shard.k, 2); | ||
        assert_eq!(shard.n, 5); | ||
        assert_eq!(shard, Shard { k: 2, n: 5 }); | ||
    } | ||
|
||
    #[test] | ||
    fn shard_from_str_invalid_input() { | ||
        assert_eq!( | ||
            Shard::from_str("").unwrap_err().to_string(), | ||
            "shard must be k/n" | ||
        ); | ||
|
||
        assert_eq!( | ||
            Shard::from_str("2").unwrap_err().to_string(), | ||
            "shard must be k/n" | ||
        ); | ||
|
||
        assert_eq!( | ||
            Shard::from_str("2/0").unwrap_err().to_string(), | ||
            "shard k must be less than n" | ||
        ); | ||
|
||
        assert_eq!( | ||
            Shard::from_str("5/2").unwrap_err().to_string(), | ||
            "shard k must be less than n" | ||
        ); | ||
    } | ||
|
||
    #[test] | ||
    fn shard_select() { | ||
        assert_eq!( | ||
            Shard::from_str("1/4").unwrap().select(0..10).as_slice(), | ||
            &[1, 5, 9] | ||
        ); | ||
    } | ||
} |