Description
Since incremental compilation supports being used in conjunction with ThinLTO, the runtime performance of incrementally built artifacts is (presumably) roughly on par with non-incrementally built code. At the same time, building things incrementally is often significantly faster (1.4-5x according to perf.rlo). As a consequence, it might be a good idea to make Cargo default to incremental compilation for release builds.
Possible caveats that need to be resolved:
- The initial build is slightly slower with incremental compilation, usually around 10%. We need to decide if this is a worthwhile tradeoff. For `debug` and `check` builds everybody seems to be fine with this already.
- Some crates, like `style-servo`, are always slower to compile with incr. comp., even if there is just a small change. In the case of `style-servo` that is 62 seconds versus 64-69 seconds on perf.rlo. It is unlikely that this would improve before we make incr. comp. the default. We need to decide if this is a justifiable price to pay for improvements in other projects.
- Even if incremental compilation becomes the default, one can still always opt out of it via the `CARGO_INCREMENTAL` flag or a local Cargo config. However, this might not be common knowledge, the same as it isn't common knowledge that one can improve runtime performance by forcing the compiler to use just one codegen unit.
- It still needs to be verified that runtime performance of compiled artifacts does not suffer too much from switching to incremental compilation (see below).
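For reference, the opt-out mentioned above can be sketched as a profile setting; this uses Cargo's `incremental` profile key and is only an illustrative fragment:

```toml
# Cargo.toml — disable incremental compilation for release builds
[profile.release]
incremental = false
```

For a single invocation, the environment variable achieves the same: `CARGO_INCREMENTAL=0 cargo build --release`.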
Data on runtime performance of incrementally compiled release artifacts
Apart from anecdotal evidence that runtime performance is "roughly the same", there have been two attempts to measure this in a more reliable way:
- PR [experiment] Benchmark incremental ThinLTO'd compiler. #56678 did an experiment where we compiled the compiler itself incrementally and then tested how the compiler's runtime performance was affected by this. The results are twofold:
  - In general performance drops by 1-2% (compare results for `clean` builds).
  - For two of the small test cases (`helloworld`, `unify-linearly`) performance drops by 30%. It is known that these test cases are very sensitive to LLVM making the right inlining decisions, which we already saw when switching from single-CGU to non-incremental ThinLTO. This is indicative that microbenchmarks may see performance drops unless the author of the benchmark takes care of marking bottleneck functions with `#[inline]`.
- For a limited period of time we made incremental compilation the default in Cargo (Make incremental compilation the default for all profiles. cargo#6564) in order to see how this affected measurements on lolbench.rs. It is not yet clear if the experiment succeeded and how much useful data it collected, since we had to cut it short because of a regression (Nightly regression: Can't perform LTO when compiling incrementally #57947). The initial data looks promising: only a handful of the ~600 benchmarks showed performance losses (see https://lolbench.rs/#nightly-2019-01-27). But we need further investigation on how reliable the results are. We might also want to re-run the experiment since the regression can easily be avoided.
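The `#[inline]` mitigation mentioned above can be sketched as follows; the function itself is hypothetical, but the attribute is what makes the body available for inlining into every codegen unit rather than only the one that happens to contain it:

```rust
// A tiny, hot helper of the kind microbenchmarks hammer on. Under
// incremental ThinLTO the caller and callee can end up in different
// codegen units; without `#[inline]` the optimizer may then fail to
// inline this call in a hot loop, costing far more than the usual 1-2%.
#[inline]
pub fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}
```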
One more experiment we should do is compiling Firefox because it is a large Rust codebase with an excellent benchmarking infrastructure (cc @nnethercote).
cc @rust-lang/core @rust-lang/cargo @rust-lang/compiler
Activity
joshtriplett commented on Jan 29, 2019
michaelwoerister commented on Jan 29, 2019
I'm not sure. The current default already has a quite significant runtime performance cost because it's using ThinLTO instead of `-Ccodegen-units=1`.

alexcrichton commented on Jan 29, 2019
We've had a ton of discussions before about compile time and runtime tradeoffs, see #45320 and #44941 for just a smattering. We are very intentionally not enabling the fastest compilation mode with `cargo build --release` by default today, and an issue like this is a continuation of that.

joshtriplett commented on Jan 30, 2019
@alexcrichton To avoid ambiguity, what do you mean by "fastest compilation mode" here?
I certainly think we don't need to worry about compiling as fast as possible, but I don't think our default compile should pay a runtime performance penalty like this.
alexcrichton commented on Jan 30, 2019
Ah, by that I mean producing the fastest code possible. Producing the fastest code by default for `--release` would mean things like LTO, enabling PGO, customizing the LLVM pass manager to just rerun itself to either a fixed point or until some amount of time lapses, etc.

Lokathor commented on Feb 2, 2019
So if `release` is a "best effort at being fast while still finishing the build sometime today", can we just add a different profile for "really the fastest but it'll take a day to build"?

CryZe commented on Feb 2, 2019
Yeah, I'm honestly thinking that it may be time for a profile between debug and release, such that there are these use cases:
At the moment I'm seeing lots of people either sacrifice the debug profile for that "Development" use case (bumping optimization levels, but reducing the debuggability of the project) or sacrifice the release profile by reducing optimizations; both are kind of suboptimal.
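A middle profile like the one suggested can be sketched with Cargo profile settings; the profile names `dev-opt` and `dist` are made up, and custom profiles with `inherits` only landed in Cargo well after this thread, so this is a forward-looking illustration rather than something that worked at the time:

```toml
# Hypothetical Cargo.toml profiles illustrating the three-tier idea.
[profile.dev-opt]      # development: quick builds, tolerable runtime speed
inherits = "dev"
opt-level = 1

[profile.dist]         # "really the fastest, but it'll take a day to build"
inherits = "release"
lto = "fat"            # full cross-crate LTO
codegen-units = 1      # single CGU for maximum inlining opportunities
incremental = false
```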