Benchmarks

Nextest’s execution model generally leads to faster test runs than Cargo. How much faster depends on the specifics, but here are some general guidelines:

  • Larger workspaces see a greater benefit. Larger workspaces have more crates, more test binaries, and therefore more potential bottlenecks.
  • Test bottlenecks. Nextest excels when slow tests are spread across multiple test binaries: cargo test runs test binaries one at a time, while nextest schedules individual tests from all binaries in parallel.
  • Build caching. Test runs are only one component of end-to-end execution time. Speeding up the build with sccache, the Rust Cache GitHub Action, or similar makes test run time a proportionally greater share of the overall time (see the sketch after this list).
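
For example, one common way to add build caching on a Linux or CI machine is to route rustc invocations through sccache via the RUSTC_WRAPPER environment variable. A minimal sketch, assuming sccache is already installed and on PATH:

```sh
# Route compilation through sccache so repeated runs rebuild as little as possible;
# with the build cached, most of the wall-clock time goes to actually running tests.
export RUSTC_WRAPPER=sccache

# Build and run tests as usual: rustc invocations are now served from the cache
# when possible.
cargo nextest run --workspace

# Optionally, check how well the cache is doing.
sccache --show-stats
```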

Even if nextest doesn’t result in faster test runs, you may find doing occasional nextest runs useful for identifying test bottlenecks, for its user interface, or for its other features such as test retries.
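
To make the test-bottleneck scenario above concrete, here is a hypothetical two-crate workspace in which each crate has an integration-test suite that takes about 30 seconds (the crate names and timings are made up for illustration):

```sh
# Hypothetical example: crate-a and crate-b each have a ~30 s integration suite.
# cargo test runs one test binary at a time, so the suites execute back to back:
cargo test -p crate-a -p crate-b --tests      # roughly 60 s in total

# nextest schedules individual tests from both binaries onto one run, so the two
# slow suites overlap:
cargo nextest run -p crate-a -p crate-b       # roughly 30 s in total
```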

Results

| Project | Revision | Test count | cargo test (s) | nextest (s) | Difference |
| --- | --- | --- | --- | --- | --- |
| cargo-guppy | c135447a | 252 | 34.70 | 22.14 | -36.2% |
| diem[^diem1] | 6025888b | 1476 | 1058.46 | 400.53 | -62.1% |
| penumbra | 44ab43f6 | 32 | 54.66 | 24.90 | -54.4% |
| ring | c14c355f | 179 | 17.64 | 11.60 | -34.2% |
| rust-analyzer | 4449a336 | 3746 | 6.76 | 5.23 | -22.6% |
| tokio | e7a0da60 | 1014 | 27.16 | 11.72 | -56.8% |

[^diem1]: Diem ships its own in-tree tool on top of nextest-runner, so the commands were slightly different:

  • the command for cargo test is cargo xtest --unit
  • the command for running nextest is cargo nextest --unit

Specifications

All measurements were done on:

  • Processor: AMD Ryzen 9 3900X (x86_64), 12 cores/24 threads
  • Operating system: Pop!_OS 21.04, running Linux kernel 5.15.15
  • RAM: 64GB
  • Rust: version 1.58.1

Lines of code were measured by loc, while the number of tests was recorded by nextest.

The commands run were:

  • cargo test: cargo test --workspace --bins --lib --tests (to exclude doctests since they’re not supported by nextest)
  • nextest: cargo nextest run --workspace

The measurements do not include the time taken to build the tests. To ensure this, each command was run 5 times in succession; the recorded measurement is the minimum of runs 3, 4, and 5.
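
In sketch form, the protocol for a single project looks roughly like this (the loop and the shell’s time builtin are assumptions; only the two commands listed above come from this page):

```sh
# Run each command 5 times in succession. The early runs absorb compilation and
# cache warmup; the reported number is the minimum wall-clock time of runs 3-5.
for i in 1 2 3 4 5; do
    echo "cargo test, run $i:"
    time cargo test --workspace --bins --lib --tests
done

for i in 1 2 3 4 5; do
    echo "nextest, run $i:"
    time cargo nextest run --workspace
done
```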