packaging: use release builds and LTO by default #12097

Merged (1 commit) on Jan 2, 2025


ConnorBaker commented Dec 21, 2024

Motivation

Let Meson handle optimization arguments for us.

Removed the explicit debug and optimization settings from the project defaults; these should be handled by Meson's buildtype, not by us.

LTO is enabled when the build type is release or minsize and disabled otherwise.

Added links to relevant references describing Meson build types and the default build type as set by Nixpkgs' setup hook for Meson.

Testing

I'm not sure what the state-of-the-art is for benchmarking Nix. Any pointers?

EDIT: I settled on gross rigging around nix-functional-tests to try to get a sense of the performance impact.

Step 1: Apply this patch so Meson doesn't randomly perturb malloc and runs tests sequentially: ConnorBaker@567a2b1.

I've done that in the two branches I use for benchmarking: https://github.com/ConnorBaker/nix/tree/feat/det-bench for the baseline and https://github.com/ConnorBaker/nix/tree/feat/meson-O3-LTO-det-bench for this PR.

Of course, re-using the functional test suite as "micro-benchmarks" comes with its own problems...

Step 2: Decide on a tool for comparing benchmark numbers.

I decided on benchstat because it has the functionality I'm looking for, it is available in Nixpkgs, and the format for input data is relatively close to what we get as output from Meson (https://go.googlesource.com/proposal/+/master/design/14313-benchmark-format.md).

Step 3: Generate and transform the data.

Here's an AWK script which matches on the lines of the build output of nix-functional-tests which include the results of each script and transforms them into something which benchstat can work with. I named it nix-functional-tests-to-gotest-format.awk:

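# Match only the lines reporting a passing (OK) nix-functional-tests script.
/.+ nix-functional-tests:.+ OK .+/ {
  # Field 3 carries the suite name behind a "nix-functional-tests:" prefix.
  suite=$3
  sub(/nix-functional-tests:/, "", suite)
  # Field 5 is the script name; field 7 is the elapsed time with a trailing "s".
  script=$5
  time=$7
  sub(/s/, "", time)
  # Emit one Go benchmark-format line: name, iteration count, value, unit.
  print "BenchmarkTest" "/suite=" suite "/script=" script, "1", time, "s/op"
}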
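
To sanity-check the script, it can be run against a single log line. The sample line below is an assumption about what `nix build -L` prints for a passing Meson test (the `1/200` counter and the spacing are made up; real logs will differ), and the script body is inlined so the snippet stands on its own:

```shell
# Hypothetical log line; the layout is an assumption, not copied from a real build log.
sample='nix-functional-tests> 1/200 nix-functional-tests:main / test-infra OK 0.09s'

# Same AWK program as above, inlined so no extra file is needed.
out=$(printf '%s\n' "$sample" | awk '
/.+ nix-functional-tests:.+ OK .+/ {
  suite=$3
  sub(/nix-functional-tests:/, "", suite)
  script=$5
  time=$7
  sub(/s/, "", time)
  print "BenchmarkTest" "/suite=" suite "/script=" script, "1", time, "s/op"
}')

echo "$out"   # prints: BenchmarkTest/suite=main/script=test-infra 1 0.09 s/op
```

Each emitted line has the benchmark name, an iteration count of 1, the elapsed time, and the `s/op` unit, which is the shape benchstat reads.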

And here's a bash script which runs the builds over and over, transforming the build logs and collecting them in output files. I named it nix-functional-tests-gotest-bench.sh:

#!/usr/bin/env bash
set -euo pipefail

# Old:
# https://github.com/ConnorBaker/nix/commits/feat/det-bench
# 567a2b1699c5e5ecfd69fc75636dd4fedffcd123

# New:
# https://github.com/ConnorBaker/nix/commits/feat/meson-O3-LTO-det-bench
# 10d5051bf93ea0cb732d8d41149b4f41fd049bdb

main() {
  local -A named_commits=(
    [old]="567a2b1699c5e5ecfd69fc75636dd4fedffcd123"
    [new]="10d5051bf93ea0cb732d8d41149b4f41fd049bdb"
  )
  local name=""
  local commit=""
  local flake_ref=""

  for name in "${!named_commits[@]}"; do
    commit="${named_commits[$name]}"
    flake_ref="github:ConnorBaker/nix/$commit"
    echo "Building the $name nix-functional-tests so dependencies are cached..."
    nix build -L --builders '' "$flake_ref#nix-functional-tests" --no-link
  done

  echo "Starting benchmarking..."

  for i in {1..20}; do
    echo "Running iteration $i/20..."
    for name in "${!named_commits[@]}"; do
      commit="${named_commits[$name]}"
      flake_ref="github:ConnorBaker/nix/$commit"
      echo "Running $name nix-functional-tests..."
      nix build -L --builders '' "$flake_ref#nix-functional-tests" --no-link --rebuild |& awk --file nix-functional-tests-to-gotest-format.awk >> "nix-functional-tests-$name.txt"
    done
  done
}

main
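
Before handing the collected files to benchstat, it may be worth confirming that every benchmark ended up with the expected number of samples (20 per file with the loop above). A minimal sketch, using a temporary file with made-up values as a stand-in for nix-functional-tests-old.txt:

```shell
# Stand-in for nix-functional-tests-old.txt; the sample values are invented for illustration.
results=$(mktemp)
cat > "$results" <<'EOF'
BenchmarkTest/suite=main/script=gc 1 0.29 s/op
BenchmarkTest/suite=main/script=gc 1 0.28 s/op
BenchmarkTest/suite=main/script=lang 1 3.94 s/op
BenchmarkTest/suite=main/script=lang 1 3.95 s/op
EOF

# Count the samples per benchmark name, then sort by name for stable output.
counts=$(awk '{ n[$1]++ } END { for (name in n) print n[name], name }' "$results" | sort -k2)

echo "$counts"
rm -f "$results"
```

Run against the real output files, every line should report 20; a lower count would point at a test that did not report OK in some iteration, since the AWK script only matches OK lines.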

Step 4: Summarize the data.

If I'm reading this right, I'm seeing roughly 5-10% improvements across most of the functional tests (geomean -5.52%). This is on a NixOS 25.05 (nixos-unstable) system with an i9-13900K; the table is the output of benchstat nix-functional-tests-old.txt nix-functional-tests-new.txt:

                                                               │ nix-functional-tests-old.txt │     nix-functional-tests-new.txt     │
                                                               │             s/op             │     s/op      vs base                │
Test/suite=main/script=test-infra                                                90.00m ± 11%   90.00m ±  0%    0.00% (p=0.008 n=20)
Test/suite=main/script=gc                                                        290.0m ±  3%   280.0m ±  4%   -3.45% (p=0.001 n=20)
Test/suite=main/script=nix-collect-garbage-d                                     340.0m ±  3%   325.0m ±  5%   -4.41% (p=0.000 n=20)
Test/suite=main/script=remote-store                                               1.785 ±  1%    1.630 ±  2%   -8.68% (p=0.000 n=20)
Test/suite=main/script=legacy-ssh-store                                          100.0m ±  0%   100.0m ± 10%    0.00% (p=0.003 n=20)
Test/suite=main/script=lang                                                       3.945 ±  1%    3.690 ±  1%   -6.46% (p=0.000 n=20)
Test/suite=main/script=characterisation-test-infra                               50.00m ±  0%   50.00m ±  0%        ~ (p=1.000 n=20)
Test/suite=main/script=experimental-features                                     345.0m ±  1%   330.0m ±  3%   -4.35% (p=0.000 n=20)
Test/suite=main/script=fetchMercurial                                             4.740 ±  0%    4.710 ±  0%   -0.63% (p=0.003 n=20)
Test/suite=main/script=gc-auto                                                    10.17 ±  0%    10.15 ±  0%   -0.20% (p=0.004 n=20)
Test/suite=main/script=user-envs                                                  1.360 ±  1%    1.250 ±  2%   -8.09% (p=0.000 n=20)
Test/suite=main/script=binary-cache                                               1.240 ±  2%    1.155 ±  2%   -6.85% (p=0.000 n=20)
Test/suite=main/script=multiple-outputs                                          495.0m ±  1%   460.0m ±  2%   -7.07% (p=0.000 n=20)
Test/suite=main/script=nix-build                                                 340.0m ±  3%   310.0m ±  3%   -8.82% (p=0.000 n=20)
Test/suite=main/script=gc-concurrent                                             240.0m ±  4%   240.0m ±  4%    0.00% (p=0.000 n=20)
Test/suite=main/script=repair                                                    505.0m ±  1%   470.0m ±  2%   -6.93% (p=0.000 n=20)
Test/suite=main/script=fixed                                                      7.410 ±  0%    7.380 ±  0%   -0.40% (p=0.000 n=20)
Test/suite=main/script=export-graph                                              500.0m ±  2%   450.0m ±  2%  -10.00% (p=0.000 n=20)
Test/suite=main/script=timeout                                                    6.200 ±  0%    6.190 ±  0%   -0.16% (p=0.000 n=20)
Test/suite=main/script=fetchGitRefs                                              935.0m ±  3%   865.0m ±  2%   -7.49% (p=0.000 n=20)
Test/suite=main/script=gc-runtime                                                 2.140 ±  0%    2.130 ±  0%   -0.47% (p=0.000 n=20)
Test/suite=main/script=tarball                                                  1020.0m ±  2%   955.0m ±  3%   -6.37% (p=0.000 n=20)
Test/suite=main/script=fetchGit                                                   1.275 ±  2%    1.210 ±  2%   -5.10% (p=0.000 n=20)
Test/suite=main/script=fetchurl                                                  340.0m ±  3%   320.0m ±  3%   -5.88% (p=0.000 n=20)
Test/suite=main/script=fetchPath                                                 60.00m ± 17%   60.00m ±  0%    0.00% (p=0.031 n=20)
Test/suite=main/script=fetchTree-file                                            150.0m ±  7%   140.0m ±  0%   -6.67% (p=0.000 n=20)
Test/suite=main/script=simple                                                    260.0m ±  4%   240.0m ±  4%   -7.69% (p=0.000 n=20)
Test/suite=main/script=referrers                                                 150.0m ±  7%   150.0m ±  7%        ~ (p=1.000 n=20)
Test/suite=main/script=optimise-store                                            170.0m ±  6%   170.0m ±  6%    0.00% (p=0.001 n=20)
Test/suite=main/script=substitute-with-invalid-ca                                150.0m ±  0%   140.0m ±  0%   -6.67% (p=0.000 n=20)
Test/suite=main/script=signing                                                   645.0m ±  4%   585.0m ±  1%   -9.30% (p=0.000 n=20)
Test/suite=main/script=hash-convert                                               1.340 ±  2%    1.165 ±  1%  -13.06% (p=0.000 n=20)
Test/suite=main/script=hash-path                                                1035.0m ±  2%   900.0m ±  2%  -13.04% (p=0.000 n=20)
Test/suite=main/script=gc-non-blocking                                            3.130 ±  2%    3.100 ±  3%   -0.96% (p=0.008 n=20)
Test/suite=main/script=check                                                     495.0m ±  3%   465.0m ±  1%   -6.06% (p=0.000 n=20)
Test/suite=main/script=nix-shell                                                 865.0m ±  3%   795.0m ±  2%   -8.09% (p=0.000 n=20)
Test/suite=main/script=check-refs                                                495.0m ±  1%   455.0m ±  1%   -8.08% (p=0.000 n=20)
Test/suite=main/script=build-remote-input-addressed                              645.0m ±  2%   590.0m ±  2%   -8.53% (p=0.000 n=20)
Test/suite=main/script=secure-drv-outputs                                        370.0m ±  3%   350.0m ±  3%   -5.41% (p=0.000 n=20)
Test/suite=main/script=restricted                                                485.0m ±  3%   450.0m ±  4%   -7.22% (p=0.000 n=20)
Test/suite=main/script=fetchGitSubmodules                                        700.0m ±  1%   670.0m ±  1%   -4.29% (p=0.000 n=20)
Test/suite=main/script=readfile-context                                          110.0m ±  0%   100.0m ±  0%   -9.09% (p=0.000 n=20)
Test/suite=main/script=nix-channel                                               470.0m ±  2%   430.0m ±  2%   -8.51% (p=0.000 n=20)
Test/suite=main/script=recursive                                                 310.0m ±  3%   280.0m ±  0%   -9.68% (p=0.000 n=20)
Test/suite=main/script=dependencies                                              330.0m ±  3%   310.0m ±  3%   -6.06% (p=0.000 n=20)
Test/suite=main/script=check-reqs                                                300.0m ±  3%   280.0m ±  4%   -6.67% (p=0.000 n=20)
Test/suite=main/script=build-remote-content-addressed-fixed                      340.0m ±  3%   310.0m ±  0%   -8.82% (p=0.000 n=20)
Test/suite=main/script=build-remote-content-addressed-floating                   340.0m ±  3%   320.0m ±  3%   -5.88% (p=0.000 n=20)
Test/suite=main/script=build-remote-trustless-should-pass-0                      180.0m ±  6%   170.0m ±  0%   -5.56% (p=0.000 n=20)
Test/suite=main/script=build-remote-trustless-should-pass-1                      180.0m ±  0%   170.0m ±  0%   -5.56% (p=0.000 n=20)
Test/suite=main/script=build-remote-trustless-should-pass-2                       5.250 ±  0%    5.240 ±  0%   -0.19% (p=0.000 n=20)
Test/suite=main/script=build-remote-trustless-should-pass-3                      190.0m ±  0%   175.0m ±  3%   -7.89% (p=0.000 n=20)
Test/suite=main/script=build-remote-trustless-should-fail-0                       5.160 ±  0%    5.150 ±  0%   -0.19% (p=0.000 n=20)
Test/suite=main/script=build-remote-with-mounted-ssh-ng                          120.0m ±  0%   120.0m ±  8%    0.00% (p=0.034 n=20)
Test/suite=main/script=nar-access                                                290.0m ±  3%   270.0m ±  4%   -6.90% (p=0.000 n=20)
Test/suite=main/script=impure-eval                                               180.0m ±  0%   160.0m ±  6%  -11.11% (p=0.000 n=20)
Test/suite=main/script=pure-eval                                                 250.0m ±  4%   230.0m ±  4%   -8.00% (p=0.000 n=20)
Test/suite=main/script=eval                                                      400.0m ±  3%   375.0m ±  4%   -6.25% (p=0.000 n=20)
Test/suite=main/script=repl                                                       2.000 ±  0%    1.950 ±  1%   -2.50% (p=0.000 n=20)
Test/suite=main/script=binary-cache-build-remote                                 230.0m ±  4%   220.0m ±  5%   -4.35% (p=0.000 n=20)
Test/suite=main/script=search                                                    295.0m ±  2%   280.0m ±  4%   -5.08% (p=0.000 n=20)
Test/suite=main/script=logging                                                   250.0m ±  4%   240.0m ±  4%   -4.00% (p=0.001 n=20)
Test/suite=main/script=export                                                    220.0m ±  0%   210.0m ±  5%   -4.55% (p=0.000 n=20)
Test/suite=main/script=config                                                    250.0m ±  4%   230.0m ±  4%   -8.00% (p=0.000 n=20)
Test/suite=main/script=add                                                       250.0m ±  4%   230.0m ±  4%   -8.00% (p=0.000 n=20)
Test/suite=main/script=chroot-store                                              150.0m ±  7%   140.0m ±  0%   -6.67% (p=0.000 n=20)
Test/suite=main/script=filter-source                                             90.00m ± 11%   80.00m ±  0%  -11.11% (p=0.025 n=20)
Test/suite=main/script=misc                                                      230.0m ±  4%   220.0m ±  5%   -4.35% (p=0.000 n=20)
Test/suite=main/script=dump-db                                                   160.0m ±  6%   145.0m ±  3%   -9.37% (p=0.000 n=20)
Test/suite=main/script=linux-sandbox                                             540.0m ±  2%   505.0m ±  1%   -6.48% (p=0.000 n=20)
Test/suite=main/script=supplementary-groups                                      180.0m ±  6%   180.0m ±  0%    0.00% (p=0.006 n=20)
Test/suite=main/script=build-dry                                                 260.0m ±  4%   240.0m ±  4%   -7.69% (p=0.000 n=20)
Test/suite=main/script=structured-attrs                                          270.0m ±  0%   250.0m ±  4%   -7.41% (p=0.000 n=20)
Test/suite=main/script=shell                                                     360.0m ±  0%   340.0m ±  3%   -5.56% (p=0.000 n=20)
Test/suite=main/script=brotli                                                    150.0m ±  0%   140.0m ±  0%   -6.67% (p=0.000 n=20)
Test/suite=main/script=zstd                                                      150.0m ±  0%   140.0m ±  0%   -6.67% (p=0.000 n=20)
Test/suite=main/script=compression-levels                                        125.0m ±  4%   120.0m ±  0%   -4.00% (p=0.002 n=20)
Test/suite=main/script=nix-copy-ssh                                              200.0m ±  5%   190.0m ±  5%   -5.00% (p=0.000 n=20)
Test/suite=main/script=nix-copy-ssh-ng                                           340.0m ±  3%   310.0m ±  3%   -8.82% (p=0.000 n=20)
Test/suite=main/script=post-hook                                                 280.0m ±  4%   260.0m ±  4%   -7.14% (p=0.000 n=20)
Test/suite=main/script=function-trace                                            130.0m ±  8%   120.0m ±  0%   -7.69% (p=0.000 n=20)
Test/suite=main/script=fmt                                                       150.0m ±  0%   140.0m ±  0%   -6.67% (p=0.000 n=20)
Test/suite=main/script=eval-store                                                230.0m ±  4%   210.0m ±  0%   -8.70% (p=0.000 n=20)
Test/suite=main/script=why-depends                                               180.0m ±  6%   170.0m ±  6%   -5.56% (p=0.000 n=20)
Test/suite=main/script=derivation-json                                           90.00m ±  0%   90.00m ±  0%    0.00% (p=0.023 n=20)
Test/suite=main/script=derivation-advanced-attributes                            70.00m ±  0%   70.00m ± 14%    0.00% (p=0.007 n=20)
Test/suite=main/script=import-from-derivation                                    290.0m ±  3%   260.0m ±  4%  -10.34% (p=0.000 n=20)
Test/suite=main/script=nix_path                                                  360.0m ±  3%   330.0m ±  3%   -8.33% (p=0.000 n=20)
Test/suite=main/script=nars                                                      470.0m ±  2%   430.0m ±  2%   -8.51% (p=0.000 n=20)
Test/suite=main/script=placeholders                                              70.00m ±  0%   70.00m ±  0%        ~ (p=1.000 n=20)
Test/suite=main/script=ssh-relay                                                100.00m ± 10%   90.00m ± 11%  -10.00% (p=0.000 n=20)
Test/suite=main/script=build                                                     610.0m ±  2%   570.0m ±  2%   -6.56% (p=0.000 n=20)
Test/suite=main/script=build-delete                                              380.0m ±  3%   360.0m ±  3%   -5.26% (p=0.000 n=20)
Test/suite=main/script=output-normalization                                      80.00m ± 12%   70.00m ±  0%  -12.50% (p=0.004 n=20)
Test/suite=main/script=selfref-gc                                                90.00m ±  0%   80.00m ±  0%  -11.11% (p=0.001 n=20)
Test/suite=main/script=bash-profile                                              40.00m ± 25%   40.00m ±  0%        ~ (p=0.176 n=20)
Test/suite=main/script=pass-as-file                                              70.00m ±  0%   70.00m ±  0%        ~ (p=0.184 n=20)
Test/suite=main/script=nix-profile                                                1.260 ±  2%    1.155 ±  3%   -8.33% (p=0.000 n=20)
Test/suite=main/script=suggestions                                               100.0m ±  0%   100.0m ±  0%        ~ (p=0.106 n=20)
Test/suite=main/script=store-info                                                70.00m ±  0%   65.00m ±  8%   -7.14% (p=0.002 n=20)
Test/suite=main/script=fetchClosure                                              310.0m ±  3%   290.0m ±  3%   -6.45% (p=0.000 n=20)
Test/suite=main/script=completions                                               310.0m ±  3%   290.0m ±  3%   -6.45% (p=0.000 n=20)
Test/suite=main/script=impure-derivations                                        430.0m ±  2%   400.0m ±  3%   -6.98% (p=0.000 n=20)
Test/suite=main/script=path-from-hash-part                                       80.00m ±  0%   80.00m ± 12%    0.00% (p=0.004 n=20)
Test/suite=main/script=path-info                                                 110.0m ±  0%   100.0m ± 10%   -9.09% (p=0.000 n=20)
Test/suite=main/script=toString-path                                             65.00m ±  8%   60.00m ±  0%   -7.69% (p=0.041 n=20)
Test/suite=main/script=read-only-store                                           200.0m ±  5%   180.0m ±  0%  -10.00% (p=0.000 n=20)
Test/suite=main/script=nested-sandboxing                                         755.0m ± 32%   880.0m ± 24%        ~ (p=0.784 n=20)
Test/suite=main/script=impure-env                                                820.0m ±  1%   810.0m ±  0%   -1.22% (p=0.030 n=20)
Test/suite=main/script=debugger                                                  80.00m ±  0%   80.00m ± 12%        ~ (p=0.052 n=20)
Test/suite=libstoreconsumer/script=test-libstoreconsumer                         80.00m ± 12%   80.00m ±  0%    0.00% (p=0.001 n=20)
Test/suite=plugins/script=plugins                                                50.00m ± 20%   50.00m ±  0%    0.00% (p=0.020 n=20)
Test/suite=ca/script=build-with-garbage-path                                     180.0m ±  6%   170.0m ±  6%   -5.56% (p=0.000 n=20)
Test/suite=ca/script=build                                                       565.0m ±  1%   520.0m ±  2%   -7.96% (p=0.000 n=20)
Test/suite=ca/script=build-cache                                                 460.0m ±  2%   430.0m ±  2%   -6.52% (p=0.000 n=20)
Test/suite=ca/script=concurrent-builds                                            10.07 ± 50%    10.07 ± 50%        ~ (p=0.413 n=20)
Test/suite=ca/script=derivation-json                                             150.0m ±  0%   140.0m ±  7%   -6.67% (p=0.001 n=20)
Test/suite=ca/script=duplicate-realisation-in-closure                             2.165 ±  0%    2.155 ±  0%   -0.46% (p=0.000 n=20)
Test/suite=ca/script=eval-store                                                  240.0m ±  0%   230.0m ±  4%   -4.17% (p=0.000 n=20)
Test/suite=ca/script=gc                                                          300.0m ±  3%   280.0m ±  4%   -6.67% (p=0.000 n=20)
Test/suite=ca/script=import-from-derivation                                      140.0m ±  0%   130.0m ±  0%   -7.14% (p=0.000 n=20)
Test/suite=ca/script=new-build-cmd                                               630.0m ±  3%   580.0m ±  2%   -7.94% (p=0.000 n=20)
Test/suite=ca/script=nix-copy                                                    580.0m ±  2%   550.0m ±  2%   -5.17% (p=0.000 n=20)
Test/suite=ca/script=nix-run                                                     80.00m ± 12%   80.00m ±  0%        ~ (p=0.235 n=20)
Test/suite=ca/script=nix-shell                                                   905.0m ±  2%   840.0m ±  2%   -7.18% (p=0.000 n=20)
Test/suite=ca/script=post-hook                                                   320.0m ±  0%   300.0m ±  3%   -6.25% (p=0.000 n=20)
Test/suite=ca/script=recursive                                                   310.0m ±  3%   280.0m ±  4%   -9.68% (p=0.000 n=20)
Test/suite=ca/script=repl                                                         2.015 ±  1%    1.965 ±  1%   -2.48% (p=0.000 n=20)
Test/suite=ca/script=selfref-gc                                                  90.00m ± 11%   90.00m ±  0%        ~ (p=0.171 n=20)
Test/suite=ca/script=signatures                                                  620.0m ±  3%   595.0m ±  1%   -4.03% (p=0.000 n=20)
Test/suite=ca/script=substitute                                                  640.0m ±  3%   610.0m ±  2%   -4.69% (p=0.000 n=20)
Test/suite=ca/script=why-depends                                                 190.0m ±  0%   180.0m ±  0%   -5.26% (p=0.000 n=20)
Test/suite=dyn-drv/script=text-hashed-output                                     150.0m ±  7%   140.0m ±  7%   -6.67% (p=0.000 n=20)
Test/suite=dyn-drv/script=recursive-mod-json                                     160.0m ±  0%   150.0m ±  0%   -6.25% (p=0.000 n=20)
Test/suite=dyn-drv/script=build-built-drv                                        105.0m ±  5%   100.0m ± 10%   -4.76% (p=0.000 n=20)
Test/suite=dyn-drv/script=eval-outputOf                                          160.0m ±  6%   140.0m ±  0%  -12.50% (p=0.000 n=20)
Test/suite=dyn-drv/script=dep-built-drv                                          90.00m ± 11%   85.00m ±  6%   -5.56% (p=0.000 n=20)
Test/suite=flakes/script=flakes                                                   2.135 ±  2%    1.955 ±  1%   -8.43% (p=0.000 n=20)
Test/suite=flakes/script=develop                                                 340.0m ±  3%   320.0m ±  3%   -5.88% (p=0.000 n=20)
Test/suite=flakes/script=edit                                                    80.00m ±  0%   70.00m ± 14%  -12.50% (p=0.000 n=20)
Test/suite=flakes/script=run                                                     160.0m ±  6%   150.0m ±  0%   -6.25% (p=0.000 n=20)
Test/suite=flakes/script=mercurial                                                4.420 ±  1%    4.390 ±  0%   -0.68% (p=0.028 n=20)
Test/suite=flakes/script=circular                                                145.0m ±  3%   140.0m ±  7%   -3.45% (p=0.005 n=20)
Test/suite=flakes/script=init                                                    330.0m ±  3%   310.0m ±  3%   -6.06% (p=0.000 n=20)
Test/suite=flakes/script=inputs                                                  110.0m ±  0%   100.0m ± 10%        ~ (p=0.052 n=20)
Test/suite=flakes/script=follow-paths                                            40.00m ±  0%   40.00m ±  0%        ~ (p=0.661 n=20)
Test/suite=flakes/script=bundle                                                  150.0m ±  7%   135.0m ±  4%  -10.00% (p=0.000 n=20)
Test/suite=flakes/script=check                                                   220.0m ±  0%   200.0m ±  5%   -9.09% (p=0.000 n=20)
Test/suite=flakes/script=unlocked-override                                       110.0m ±  9%   100.0m ±  0%   -9.09% (p=0.002 n=20)
Test/suite=flakes/script=absolute-paths                                          60.00m ± 17%   60.00m ±  0%    0.00% (p=0.008 n=20)
Test/suite=flakes/script=absolute-attr-paths                                     80.00m ± 12%   80.00m ±  0%    0.00% (p=0.006 n=20)
Test/suite=flakes/script=build-paths                                             290.0m ±  0%   270.0m ±  0%   -6.90% (p=0.000 n=20)
Test/suite=flakes/script=flake-in-submodule                                      200.0m ±  5%   190.0m ±  5%   -5.00% (p=0.000 n=20)
Test/suite=flakes/script=prefetch                                                60.00m ±  0%   50.00m ± 20%  -16.67% (p=0.001 n=20)
Test/suite=flakes/script=eval-cache                                              130.0m ±  8%   130.0m ±  8%    0.00% (p=0.017 n=20)
Test/suite=flakes/script=search-root                                             355.0m ±  1%   325.0m ±  5%   -8.45% (p=0.000 n=20)
Test/suite=flakes/script=config                                                  210.0m ±  5%   200.0m ±  0%   -4.76% (p=0.000 n=20)
Test/suite=flakes/script=show                                                    185.0m ±  3%   170.0m ±  0%   -8.11% (p=0.000 n=20)
Test/suite=flakes/script=dubious-query                                           160.0m ±  0%   150.0m ±  7%   -6.25% (p=0.011 n=20)
Test/suite=flakes/script=shebang                                                 220.0m ±  5%   205.0m ±  2%   -6.82% (p=0.000 n=20)
Test/suite=flakes/script=commit-lock-file-summary                                120.0m ±  0%   120.0m ±  8%    0.00% (p=0.001 n=20)
Test/suite=flakes/script=non-flake-inputs                                        390.0m ±  3%   365.0m ±  1%   -6.41% (p=0.000 n=20)
Test/suite=git-hashing/script=simple                                             285.0m ±  2%   270.0m ±  0%   -5.26% (p=0.000 n=20)
Test/suite=local-overlay-store/script=check-post-init                            320.0m ±  3%   300.0m ±  3%   -6.25% (p=0.000 n=20)
Test/suite=local-overlay-store/script=redundant-add                              170.0m ±  6%   150.0m ±  7%  -11.76% (p=0.000 n=20)
Test/suite=local-overlay-store/script=build                                      190.0m ±  5%   170.0m ±  0%  -10.53% (p=0.000 n=20)
Test/suite=local-overlay-store/script=bad-uris                                   70.00m ±  0%   70.00m ±  0%        ~ (p=0.342 n=20)
Test/suite=local-overlay-store/script=add-lower                                  160.0m ±  6%   150.0m ±  0%   -6.25% (p=0.000 n=20)
Test/suite=local-overlay-store/script=delete-refs                                340.0m ±  3%   330.0m ±  3%   -2.94% (p=0.000 n=20)
Test/suite=local-overlay-store/script=delete-duplicate                           170.0m ±  6%   170.0m ±  6%    0.00% (p=0.000 n=20)
Test/suite=local-overlay-store/script=gc                                         310.0m ±  3%   290.0m ±  3%   -6.45% (p=0.000 n=20)
Test/suite=local-overlay-store/script=verify                                     240.0m ±  4%   220.0m ±  5%   -8.33% (p=0.000 n=20)
Test/suite=local-overlay-store/script=optimise                                   180.0m ±  6%   170.0m ±  6%   -5.56% (p=0.000 n=20)
Test/suite=local-overlay-store/script=stale-file-handle                          460.0m ±  2%   435.0m ±  1%   -5.43% (p=0.000 n=20)
geomean                                                                          293.5m         277.2m         -5.52%
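
As a reading aid for the table above: the `vs base` column is the relative change of the new value against the old one, so a negative number means the new build is faster. A quick check using the remote-store row (1.785 s to 1.630 s); the helper name `pct_change` is made up for this example:

```shell
# pct_change OLD NEW: relative change of NEW against OLD, as a signed percentage.
pct_change() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%+.2f%%\n", (new - old) / old * 100 }'
}

delta=$(pct_change 1.785 1.630)
echo "$delta"   # prints -8.68%
```

Rows marked `~` are those where benchstat did not find a statistically significant difference between the two sample sets, so no percentage is reported for them.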

On my MacBook Pro (16-inch Nov 2023, Apple M3 Max, macOS Version 15.3 Beta (24D5034f)) I see a much smaller performance improvement of about 1-2%:

                                                         │ nix-functional-tests-old.txt │    nix-functional-tests-new.txt     │
                                                         │             s/op             │     s/op      vs base               │
Test/suite=main/script=test-infra                                          200.0m ±  5%   200.0m ±  5%       ~ (p=0.363 n=20)
Test/suite=main/script=gc                                                   1.270 ±  1%    1.250 ±  2%  -1.57% (p=0.005 n=20)
Test/suite=main/script=nix-collect-garbage-d                                1.745 ±  2%    1.720 ±  2%  -1.43% (p=0.007 n=20)
Test/suite=main/script=remote-store                                         6.945 ±  1%    6.880 ±  1%       ~ (p=0.066 n=20)
Test/suite=main/script=legacy-ssh-store                                    250.0m ±  0%   245.0m ±  2%       ~ (p=0.153 n=20)
Test/suite=main/script=lang                                                 8.915 ±  2%    8.695 ±  2%  -2.47% (p=0.002 n=20)
Test/suite=main/script=characterisation-test-infra                         130.0m ±  0%   130.0m ±  0%       ~ (p=0.890 n=20)
Test/suite=main/script=experimental-features                               710.0m ±  1%   690.0m ±  3%  -2.82% (p=0.003 n=20)
Test/suite=main/script=fetchMercurial                                       5.505 ±  2%    5.475 ±  2%       ~ (p=0.394 n=20)
Test/suite=main/script=gc-auto                                              10.84 ±  4%    10.79 ±  2%       ~ (p=0.804 n=20)
Test/suite=main/script=user-envs                                            5.530 ±  2%    5.445 ±  2%  -1.54% (p=0.005 n=20)
Test/suite=main/script=binary-cache                                         5.480 ±  2%    5.420 ±  1%  -1.09% (p=0.027 n=20)
Test/suite=main/script=multiple-outputs                                     1.720 ±  2%    1.710 ±  1%       ~ (p=0.053 n=20)
Test/suite=main/script=nix-build                                            2.080 ±  1%    2.070 ±  1%       ~ (p=0.188 n=20)
Test/suite=main/script=gc-concurrent                                        1.170 ±  2%    1.160 ±  3%       ~ (p=0.073 n=20)
Test/suite=main/script=repair                                               2.130 ±  1%    2.105 ±  1%       ~ (p=0.056 n=20)
Test/suite=main/script=fixed                                                9.400 ±  0%    9.405 ±  0%       ~ (p=0.604 n=20)
Test/suite=main/script=export-graph                                         2.135 ±  1%    2.130 ±  2%       ~ (p=0.232 n=20)
Test/suite=main/script=timeout                                              8.260 ±  0%    8.255 ±  0%       ~ (p=0.780 n=20)
Test/suite=main/script=fetchGitRefs                                         2.175 ±  1%    2.140 ±  4%  -1.61% (p=0.014 n=20)
Test/suite=main/script=tarball                                              2.755 ±  2%    2.710 ±  4%  -1.63% (p=0.038 n=20)
Test/suite=main/script=fetchGit                                             3.400 ±  1%    3.365 ±  2%  -1.03% (p=0.019 n=20)
Test/suite=main/script=fetchurl                                             1.495 ±  2%    1.480 ±  2%  -1.00% (p=0.016 n=20)
Test/suite=main/script=fetchPath                                           160.0m ±  6%   150.0m ±  7%       ~ (p=0.111 n=20)
Test/suite=main/script=fetchTree-file                                      360.0m ±  3%   355.0m ±  1%  -1.39% (p=0.005 n=20)
Test/suite=main/script=simple                                              490.0m ±  2%   480.0m ±  2%       ~ (p=0.155 n=20)
Test/suite=main/script=referrers                                           320.0m ±  3%   310.0m ±  3%       ~ (p=0.238 n=20)
Test/suite=main/script=optimise-store                                      750.0m ±  3%   740.0m ±  1%  -1.33% (p=0.018 n=20)
Test/suite=main/script=substitute-with-invalid-ca                          310.0m ±  0%   305.0m ±  2%  -1.61% (p=0.021 n=20)
Test/suite=main/script=signing                                              2.250 ±  2%    2.210 ±  2%  -1.78% (p=0.004 n=20)
Test/suite=main/script=hash-convert                                         3.095 ±  1%    3.010 ±  1%  -2.75% (p=0.001 n=20)
Test/suite=main/script=hash-path                                            2.380 ±  1%    2.320 ±  2%  -2.52% (p=0.001 n=20)
Test/suite=main/script=gc-non-blocking                                      3.745 ±  1%    3.730 ±  1%       ~ (p=0.753 n=20)
Test/suite=main/script=check                                                3.755 ±  2%    3.730 ±  2%       ~ (p=0.197 n=20)
Test/suite=main/script=nix-shell                                            4.370 ±  2%    4.340 ±  1%       ~ (p=0.095 n=20)
Test/suite=main/script=check-refs                                           2.910 ±  1%    2.850 ±  2%  -2.06% (p=0.014 n=20)
Test/suite=main/script=secure-drv-outputs                                  925.0m ±  5%   860.0m ±  7%       ~ (p=0.079 n=20)
Test/suite=main/script=restricted                                           1.190 ±  1%    1.170 ±  1%  -1.68% (p=0.013 n=20)
Test/suite=main/script=fetchGitSubmodules                                   1.965 ±  1%    1.940 ±  2%       ~ (p=0.160 n=20)
Test/suite=main/script=readfile-context                                    620.0m ±  3%   610.0m ±  5%       ~ (p=0.434 n=20)
Test/suite=main/script=nix-channel                                          2.340 ±  2%    2.305 ±  2%  -1.50% (p=0.046 n=20)
Test/suite=main/script=recursive                                           960.0m ± 14%   970.0m ± 11%       ~ (p=0.532 n=20)
Test/suite=main/script=dependencies                                         1.330 ±  1%    1.310 ±  4%  -1.50% (p=0.044 n=20)
Test/suite=main/script=check-reqs                                           1.990 ±  1%    1.965 ±  2%  -1.26% (p=0.047 n=20)
Test/suite=main/script=nar-access                                          740.0m ±  1%   730.0m ±  1%  -1.35% (p=0.026 n=20)
Test/suite=main/script=impure-eval                                         390.0m ±  3%   390.0m ±  3%       ~ (p=0.134 n=20)
Test/suite=main/script=pure-eval                                           550.0m ±  0%   540.0m ±  2%  -1.82% (p=0.002 n=20)
Test/suite=main/script=eval                                                860.0m ±  1%   845.0m ±  2%  -1.74% (p=0.006 n=20)
Test/suite=main/script=repl                                                 3.760 ±  1%    3.715 ±  2%       ~ (p=0.097 n=20)
Test/suite=main/script=binary-cache-build-remote                            11.13 ±  0%    11.12 ±  0%       ~ (p=0.145 n=20)
Test/suite=main/script=search                                              640.0m ±  2%   630.0m ±  2%  -1.56% (p=0.004 n=20)
Test/suite=main/script=logging                                              1.980 ±  2%    1.970 ±  2%       ~ (p=0.051 n=20)
Test/suite=main/script=export                                               1.100 ±  1%    1.090 ±  4%       ~ (p=0.190 n=20)
Test/suite=main/script=config                                              610.0m ±  2%   600.0m ±  2%  -1.64% (p=0.044 n=20)
Test/suite=main/script=add                                                 570.0m ±  2%   560.0m ±  5%       ~ (p=0.128 n=20)
Test/suite=main/script=chroot-store                                        250.0m ±  0%   250.0m ±  0%       ~ (p=0.452 n=20)
Test/suite=main/script=filter-source                                       320.0m ±  3%   320.0m ±  3%       ~ (p=0.979 n=20)
Test/suite=main/script=misc                                                510.0m ±  2%   500.0m ±  2%  -1.96% (p=0.025 n=20)
Test/suite=main/script=dump-db                                             955.0m ±  1%   945.0m ±  5%       ~ (p=0.470 n=20)
Test/suite=main/script=build-dry                                            1.800 ±  1%    1.790 ±  2%       ~ (p=0.358 n=20)
Test/suite=main/script=structured-attrs                                     1.175 ±  1%    1.160 ±  3%       ~ (p=0.440 n=20)
Test/suite=main/script=brotli                                              970.0m ±  1%   960.0m ±  2%       ~ (p=0.303 n=20)
Test/suite=main/script=zstd                                                950.0m ±  1%   940.0m ±  4%       ~ (p=0.685 n=20)
Test/suite=main/script=compression-levels                                  870.0m ±  1%   870.0m ±  2%       ~ (p=0.472 n=20)
Test/suite=main/script=nix-copy-ssh                                         1.085 ±  1%    1.080 ±  4%       ~ (p=0.200 n=20)
Test/suite=main/script=nix-copy-ssh-ng                                      2.005 ±  2%    1.990 ±  4%       ~ (p=0.072 n=20)
Test/suite=main/script=post-hook                                            1.690 ±  1%    1.680 ±  2%  -0.59% (p=0.016 n=20)
Test/suite=main/script=function-trace                                      300.0m ±  3%   290.0m ±  3%       ~ (p=0.119 n=20)
Test/suite=main/script=fmt                                                 610.0m ±  2%   600.0m ±  2%       ~ (p=0.299 n=20)
Test/suite=main/script=eval-store                                           2.455 ±  2%    2.440 ±  4%       ~ (p=0.662 n=20)
Test/suite=main/script=why-depends                                          1.020 ±  3%    1.000 ±  1%  -1.96% (p=0.026 n=20)
Test/suite=main/script=derivation-json                                     220.0m ±  0%   220.0m ±  5%   0.00% (p=0.009 n=20)
Test/suite=main/script=derivation-advanced-attributes                      180.0m ±  6%   170.0m ±  6%       ~ (p=0.143 n=20)
Test/suite=main/script=import-from-derivation                              980.0m ±  1%   965.0m ±  2%  -1.53% (p=0.037 n=20)
Test/suite=main/script=nix_path                                            810.0m ±  1%   800.0m ±  4%  -1.23% (p=0.046 n=20)
Test/suite=main/script=nars                                                 1.130 ±  1%    1.120 ±  3%       ~ (p=0.137 n=20)
Test/suite=main/script=placeholders                                        300.0m ±  0%   300.0m ±  3%       ~ (p=0.905 n=20)
Test/suite=main/script=ssh-relay                                           240.0m ±  0%   240.0m ±  0%       ~ (p=0.237 n=20)
Test/suite=main/script=build                                                2.630 ±  1%    2.595 ±  3%       ~ (p=0.079 n=20)
Test/suite=main/script=build-delete                                         1.780 ±  3%    1.760 ±  3%       ~ (p=0.098 n=20)
Test/suite=main/script=output-normalization                                300.0m ±  0%   300.0m ±  3%       ~ (p=0.614 n=20)
Test/suite=main/script=selfref-gc                                          450.0m ±  2%   450.0m ±  2%       ~ (p=0.868 n=20)
Test/suite=main/script=bash-profile                                        100.0m ± 10%   100.0m ± 10%       ~ (p=1.000 n=20)
Test/suite=main/script=pass-as-file                                        300.0m ±  3%   300.0m ±  3%       ~ (p=0.315 n=20)
Test/suite=main/script=nix-profile                                          5.340 ±  1%    5.265 ±  2%  -1.40% (p=0.026 n=20)
Test/suite=main/script=suggestions                                         250.0m ±  4%   240.0m ±  4%  -4.00% (p=0.029 n=20)
Test/suite=main/script=store-info                                          160.0m ±  0%   160.0m ±  6%       ~ (p=0.065 n=20)
Test/suite=main/script=fetchClosure                                         1.310 ±  2%    1.305 ±  2%       ~ (p=0.111 n=20)
Test/suite=main/script=completions                                         705.0m ±  2%   690.0m ±  1%  -2.13% (p=0.017 n=20)
Test/suite=main/script=impure-derivations                                   2.270 ±  1%    2.260 ±  2%       ~ (p=0.342 n=20)
Test/suite=main/script=path-from-hash-part                                 310.0m ±  3%   310.0m ±  3%       ~ (p=0.925 n=20)
Test/suite=main/script=path-info                                           250.0m ±  0%   240.0m ±  4%       ~ (p=0.073 n=20)
Test/suite=main/script=toString-path                                       160.0m ±  0%   155.0m ±  3%  -3.12% (p=0.038 n=20)
Test/suite=main/script=read-only-store                                     460.0m ±  2%   450.0m ±  2%  -2.17% (p=0.040 n=20)
Test/suite=main/script=impure-env                                           2.185 ± 12%    2.320 ± 11%       ~ (p=0.794 n=20)
Test/suite=main/script=debugger                                            200.0m ±  5%   200.0m ±  5%       ~ (p=0.811 n=20)
Test/suite=main/script=extra-sandbox-profile                               550.0m ±  2%   540.0m ±  2%       ~ (p=0.141 n=20)
Test/suite=libstoreconsumer/script=test-libstoreconsumer                   460.0m ±  2%   460.0m ±  2%       ~ (p=0.679 n=20)
Test/suite=plugins/script=plugins                                          260.0m ±  0%   260.0m ±  4%       ~ (p=0.430 n=20)
Test/suite=ca/script=build-with-garbage-path                               890.0m ±  3%   885.0m ±  2%       ~ (p=0.300 n=20)
Test/suite=ca/script=build                                                  3.790 ±  1%    3.745 ±  2%       ~ (p=0.154 n=20)
Test/suite=ca/script=build-cache                                            2.480 ±  1%    2.455 ±  3%       ~ (p=0.073 n=20)
Test/suite=ca/script=concurrent-builds                                      3.505 ± 49%    3.500 ±  0%       ~ (p=0.197 n=20)
Test/suite=ca/script=derivation-json                                       350.0m ±  3%   340.0m ±  0%       ~ (p=0.076 n=20)
Test/suite=ca/script=duplicate-realisation-in-closure                       3.270 ±  0%    3.260 ±  1%       ~ (p=0.536 n=20)
Test/suite=ca/script=eval-store                                             2.460 ±  2%    2.450 ±  3%       ~ (p=0.614 n=20)
Test/suite=ca/script=gc                                                     1.245 ±  1%    1.230 ±  1%       ~ (p=0.103 n=20)
Test/suite=ca/script=import-from-derivation                                790.0m ±  1%   790.0m ±  4%       ~ (p=0.459 n=20)
Test/suite=ca/script=new-build-cmd                                          2.625 ±  2%    2.615 ±  1%       ~ (p=0.089 n=20)
Test/suite=ca/script=nix-copy                                               3.110 ±  2%    3.080 ±  3%       ~ (p=0.261 n=20)
Test/suite=ca/script=nix-run                                               680.0m ±  1%   680.0m ±  3%       ~ (p=0.746 n=20)
Test/suite=ca/script=nix-shell                                              4.960 ±  1%    4.920 ±  2%       ~ (p=0.227 n=20)
Test/suite=ca/script=post-hook                                              1.700 ±  1%    1.690 ±  1%       ~ (p=0.240 n=20)
Test/suite=ca/script=recursive                                             940.0m ±  4%   990.0m ± 10%       ~ (p=0.315 n=20)
Test/suite=ca/script=repl                                                   3.790 ±  1%    3.750 ±  1%  -1.06% (p=0.015 n=20)
Test/suite=ca/script=selfref-gc                                            460.0m ±  2%   450.0m ±  2%       ~ (p=0.182 n=20)
Test/suite=ca/script=signatures                                             3.150 ±  1%    3.150 ±  1%       ~ (p=0.995 n=20)
Test/suite=ca/script=substitute                                             4.325 ±  1%    4.300 ±  2%       ~ (p=0.244 n=20)
Test/suite=ca/script=why-depends                                            1.030 ±  1%    1.020 ±  1%       ~ (p=0.303 n=20)
Test/suite=dyn-drv/script=text-hashed-output                               470.0m ±  2%   460.0m ±  4%       ~ (p=0.052 n=20)
Test/suite=dyn-drv/script=build-built-drv                                  370.0m ±  3%   360.0m ±  3%       ~ (p=0.124 n=20)
Test/suite=dyn-drv/script=eval-outputOf                                    350.0m ±  3%   345.0m ±  1%  -1.43% (p=0.012 n=20)
Test/suite=dyn-drv/script=dep-built-drv                                    340.0m ±  3%   335.0m ±  1%       ~ (p=0.264 n=20)
Test/suite=flakes/script=flakes                                             5.210 ±  1%    5.125 ±  3%  -1.63% (p=0.033 n=20)
Test/suite=flakes/script=develop                                            1.160 ±  2%    1.145 ±  1%       ~ (p=0.150 n=20)
Test/suite=flakes/script=edit                                              210.0m ±  5%   210.0m ±  0%       ~ (p=0.451 n=20)
Test/suite=flakes/script=run                                               950.0m ±  1%   940.0m ±  1%       ~ (p=0.183 n=20)
Test/suite=flakes/script=mercurial                                          5.115 ±  1%    5.135 ±  2%       ~ (p=0.424 n=20)
Test/suite=flakes/script=circular                                          385.0m ±  1%   380.0m ±  3%       ~ (p=0.674 n=20)
Test/suite=flakes/script=init                                              850.0m ±  1%   840.0m ±  2%       ~ (p=0.367 n=20)
Test/suite=flakes/script=inputs                                            400.0m ±  0%   400.0m ±  2%       ~ (p=0.994 n=20)
Test/suite=flakes/script=follow-paths                                      90.00m ±  0%   90.00m ±  0%       ~ (p=0.106 n=20)
Test/suite=flakes/script=bundle                                            450.0m ±  2%   450.0m ±  2%       ~ (p=0.717 n=20)
Test/suite=flakes/script=check                                             490.0m ±  2%   490.0m ±  2%       ~ (p=0.626 n=20)
Test/suite=flakes/script=unlocked-override                                 270.0m ±  4%   270.0m ±  4%       ~ (p=0.987 n=20)
Test/suite=flakes/script=absolute-paths                                    170.0m ±  6%   170.0m ±  0%       ~ (p=0.730 n=20)
Test/suite=flakes/script=absolute-attr-paths                               190.0m ±  5%   190.0m ±  0%       ~ (p=0.318 n=20)
Test/suite=flakes/script=build-paths                                       900.0m ±  4%   895.0m ±  2%       ~ (p=0.179 n=20)
Test/suite=flakes/script=flake-in-submodule                                550.0m ±  2%   545.0m ±  1%  -0.91% (p=0.028 n=20)
Test/suite=flakes/script=prefetch                                          140.0m ±  7%   140.0m ±  7%       ~ (p=0.966 n=20)
Test/suite=flakes/script=eval-cache                                        450.0m ±  4%   450.0m ±  2%       ~ (p=0.558 n=20)
Test/suite=flakes/script=search-root                                       940.0m ±  4%   935.0m ±  2%       ~ (p=0.360 n=20)
Test/suite=flakes/script=config                                             1.060 ±  2%    1.060 ±  1%       ~ (p=0.153 n=20)
Test/suite=flakes/script=show                                              440.0m ±  0%   430.0m ±  2%  -2.27% (p=0.013 n=20)
Test/suite=flakes/script=dubious-query                                     380.0m ±  3%   380.0m ±  3%       ~ (p=0.467 n=20)
Test/suite=flakes/script=shebang                                            1.545 ±  2%    1.560 ±  2%       ~ (p=0.529 n=20)
Test/suite=flakes/script=commit-lock-file-summary                          390.0m ±  3%   380.0m ±  3%       ~ (p=0.353 n=20)
Test/suite=flakes/script=non-flake-inputs                                   1.525 ±  2%    1.510 ±  4%       ~ (p=0.222 n=20)
Test/suite=git-hashing/script=simple                                       820.0m ±  2%   815.0m ±  2%       ~ (p=0.342 n=20)
geomean                                                                    961.3m         951.5m        -1.02%
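To make the last two columns easier to read, here is a minimal Python sketch of the arithmetic behind them. It assumes the delta is the relative change between the two medians and that a delta is only printed when the p-value is below 0.05 (the rows above are consistent with that threshold; rows shown as `~` all have p ≥ 0.05). The function names are mine, not benchstat's.

```python
import math

def geomean(values):
    """Geometric mean: exp of the arithmetic mean of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def percent_delta(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

def reported_delta(old, new, p, alpha=0.05):
    """Return the delta only when the difference is significant, else None (shown as '~')."""
    return percent_delta(old, new) if p < alpha else None

# Values copied from the geomean row and two script rows of the table.
print(f"{percent_delta(961.3, 951.5):.2f}%")   # geomean row
print(reported_delta(1.690, 1.680, p=0.016))   # post-hook: significant
print(reported_delta(1.980, 1.970, p=0.051))   # logging: not significant -> None
```

The `geomean` helper is included only to show what the summary row aggregates; reproducing the exact 961.3m and 951.5m figures would need every row of the table.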

In terms of establishing a benchmark suite: since benchstat allows as many units as desired, it'd be neat to flatten the data from NIX_SHOW_STATS to compare GC memory allocations. Likewise, it could include information about peak memory allocations, interrupts, etc. from GNU time.
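A rough sketch of what that flattening could look like, in Python. It assumes the statistics arrive as nested JSON and emits one line in the Go benchmark format that benchstat reads (name, iteration count, then value/unit pairs). The sample input, the dotted unit names, and the benchmark name are hypothetical and only there for illustration; real NIX_SHOW_STATS output has more fields.

```python
import json

def flatten(obj, prefix=""):
    """Flatten nested dicts into {'a.b': number}, keeping only numeric leaves."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = value
    return flat

def to_benchmark_line(name, stats):
    """Render one line: 'Benchmark<name> <iterations> <value> <unit> ...'."""
    pairs = " ".join(f"{value} {unit}" for unit, value in sorted(flatten(stats).items()))
    return f"Benchmark{name} 1 {pairs}"

# Hypothetical sample input, not real Nix output.
sample = json.loads('{"cpuTime": 0.25, "gc": {"heapSize": 402915328, "totalBytes": 1024}}')
print(to_benchmark_line("Eval/script=simple", sample))
```

Each flattened key becomes its own unit, so benchstat would produce one comparison table per statistic.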

I'm not familiar with Meson by any means, and I don't know what would be involved in making such data available, but I assume there's a better way than transforming the build log printed to stderr. I had thought about making a separate output for the functional tests which stored the test results log file produced by Meson, but I don't know how that would interact with content-addressed stores (since rebuilds would have different paths because the times in the file would be different).


Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@github-actions github-actions bot added new-cli Relating to the "nix" command with-tests Issues related to testing. PRs with tests have some priority fetching Networking with the outside (non-Nix) world, input locking c api Nix as a C library with a stable interface labels Dec 21, 2024
@Ericson2314
Member

Won't this slow down debug builds?

@ConnorBaker
Contributor Author

Won't this slow down debug builds?

In which sense? Speed of compilation, speed of execution, or something else?

I guess it could slow down compilation depending on how costly the level three optimizations are. I don’t know that LTO is a bad idea, generally.

I could refactor this to conditionally set optimization level to three when it isn’t a debug build, similar to what I removed in the refactor.

I’d also really appreciate any pointers you can give on reliably benchmarking Nix!

@MagicRB
Contributor

MagicRB commented Dec 21, 2024

This should be 100% disabled for debug builds. Optimization can make changes to the code which make debugging really hard: inlining, the infamous "where did my code go? Right, it was optimized away/unrolled".

@ConnorBaker
Contributor Author

Should debug builds be built without any optimization? Because that’s not what happens currently :/

@MagicRB
Contributor

MagicRB commented Dec 21, 2024

I'm not part of the Nix dev team, but in my experience having all optimizations off greatly helps when staring at code in gdb/lldb, since the machine code is closer to what's actually written in the source code.

@Ericson2314
Member

Yes for debug builds we like less optimization both because it means faster builds, and because it means the debug symbols are easier to understand / closer to the code.

@ConnorBaker
Contributor Author

@Ericson2314 so should debug builds happen without any optimization, or with optimization level two (which is what happens prior to this PR)?

@Ericson2314
Member

I think without any optimization, right @edolstra?

We could have the full nix builds instead specify more optimization with mesonFlags I think?

@Mic92
Member

Mic92 commented Dec 25, 2024

@Ericson2314 so should debug builds happen without any optimization, or with optimization level two (which is what happens prior to this PR)?

We should use -Og rather than -O0. It's optimization with debugger in mind.

@ConnorBaker ConnorBaker changed the title meson: use optimization level 3 and LTO by default packaging: use optimization level 3 and LTO by default Dec 29, 2024
@ConnorBaker ConnorBaker changed the title packaging: use optimization level 3 and LTO by default packaging: use release builds and LTO by default Dec 29, 2024
@ConnorBaker
Contributor Author

Updated and force-pushed.

Removed use of debug and optimization explicitly in project defaults -- these should be handled by Meson's buildtype, not us.

LTO is enabled when the build type is release or minsize and disabled otherwise.

Added links to relevant references describing Meson build types and the default build type as set by Nixpkgs' setup hook for Meson.
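For readers less familiar with Meson, the policy described above can be modeled in a few lines of Python. This is only an illustration of the rule, not the Meson code from the PR; the preset table reflects Meson's documented `buildtype` defaults as I understand them, and `lto_enabled` is a hypothetical helper name.

```python
# Meson's documented buildtype presets: (debug, optimization).
BUILDTYPE_PRESETS = {
    "plain": (False, "plain"),
    "debug": (True, "0"),
    "debugoptimized": (True, "2"),
    "release": (False, "3"),
    "minsize": (True, "s"),
}

def lto_enabled(buildtype):
    """Policy from this PR: LTO only for release and minsize builds."""
    return buildtype in ("release", "minsize")

for buildtype, (debug, optimization) in BUILDTYPE_PRESETS.items():
    print(f"{buildtype:15} debug={debug!s:5} optimization={optimization:5} lto={lto_enabled(buildtype)}")
```

Under this model a `debug` build gets no optimization and no LTO, which addresses the concern raised earlier in the thread, while `release` gets optimization level 3 plus LTO.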

@ConnorBaker
Contributor Author

Updated with information about how I'm approaching "benchmarking."

@ConnorBaker
Contributor Author

LTO on Darwin was broken until fairly recently, so that explains that.
I guess I'll conditionally disable it and make a note.

@Ericson2314
Member

@ConnorBaker thanks for your work, I like the looks of this a lot now. Yeah better do that macOS disable for now (unless you know how to fix) but then let's merge it!

@ConnorBaker
Contributor Author

@ConnorBaker thanks for your work, I like the looks of this a lot now. Yeah better do that macOS disable for now (unless you know how to fix) but then let's merge it!

Fixing it would require a bump to our pin of Nixpkgs — is it safe to say that would need to be a separate PR with more testing?

@Ericson2314
Member

@ConnorBaker bump to a newer 24.11 or something else?

@ConnorBaker
Contributor Author

I rebased and force-pushed; I think the locked revision of release-24.11 that master is using is new enough.
I’m re-running the benchmarks (including on my Mac, we’ll see how that goes) right now!

@ConnorBaker
Contributor Author

Updated the benchmarks; performance gain on my i9-13900k is basically unchanged (still 5-10%) and on macOS I'm seeing a 1-2% gain.

@Ericson2314 I'm happy with this now and CI is green; anything else we need to do?

@ConnorBaker
Contributor Author

@Radvendii I remember we had discussed a benchmark suite for Nix at some point as a way to establish a baseline and collect metrics; you may be interested in what I did in the PR description. It's super gross, but that sort of functionality would be really, really helpful (and would avoid people trying to roll their own suites for performance-improving PRs)!

Member

@Ericson2314 Ericson2314 left a comment

Looks great, thank you!

@Ericson2314 Ericson2314 merged commit 442a262 into NixOS:master Jan 2, 2025
13 checks passed