
GitHub PR: benchmark diff #123

Closed · srid opened this issue Jan 17, 2022 · 8 comments · Fixed by #144


srid (Member) commented Jan 17, 2022

Produce reports like this for PRs in this repo, so that we can assess the impact of each PR on the benchmarks (code size, cpu & mem). Take the bench.csv produced by Hercules Effects for both commit A (the commit where the PR branch branched off) and commit B (the PR's head commit), and diff them.


#87 is an example of such a PR; it improved the types:builtin:intlist:plistEquals benchmark.

Old:

```
types:builtin:intlist:plistEquals:==(n=3)            25843624(cpu)  59392(mem)  180(bytes)
types:builtin:intlist:plistEquals:/=(n=4)            12939651(cpu)  30928(mem)  181(bytes)
types:builtin:intlist:plistEquals:/=(empty;n=3)       7497147(cpu)  19064(mem)  177(bytes)
```

New:

```
types:builtin:intlist:plistEquals:==(n=3)             9398080(cpu)  20846(mem)  95(bytes)
types:builtin:intlist:plistEquals:/=(n=4)             9398080(cpu)  20846(mem)  96(bytes)
types:builtin:intlist:plistEquals:/=(empty;n=3)       1639885(cpu)   4664(mem)  92(bytes)
```

Diffing these two tables, we would add a comment to the PR showing the numeric difference for each row/column, much like the Plutus report. In the Plutus repo, benchmark reporting is triggered manually by adding a PR comment containing /benchmark.
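
As a rough illustration of the diffing step, here is a minimal, hypothetical sketch in Haskell (not the actual implementation). It assumes each bench.csv row has the shape name,cpu,mem,bytes with no CSV quoting, and the file names bench-old.csv / bench-new.csv are placeholders:

```haskell
-- Hypothetical sketch: diff two bench.csv files and print
-- (new - old) per column for every benchmark present in both.
import Data.List (intercalate)
import qualified Data.Map.Strict as Map

-- Split a CSV row on commas (no quoting/escaping, for brevity).
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (x, [])     -> [x]
  (x, _ : ys) -> x : splitOn c ys

-- Parse "name,cpu,mem,bytes" into (name, [cpu, mem, bytes]).
parseRow :: String -> (String, [Integer])
parseRow row =
  let (name : nums) = splitOn ',' row
   in (name, map read nums)

-- Join the two tables on benchmark name and subtract column-wise.
diffBench :: String -> String -> String
diffBench old new =
  let toMap = Map.fromList . map parseRow . lines
      oldM  = toMap old
   in unlines
        [ name ++ "," ++ intercalate "," (map show (zipWith (-) newVals oldVals))
        | (name, newVals) <- Map.toList (toMap new)
        , Just oldVals    <- [Map.lookup name oldM]
        ]

main :: IO ()
main = do
  old <- readFile "bench-old.csv"  -- bench.csv from commit A (base)
  new <- readFile "bench-new.csv"  -- bench.csv from commit B (PR head)
  putStr (diffBench old new)
```

The resulting table (new minus old, per column) is what the CI job would post back to the PR as a comment.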

emiflake (Collaborator) commented

I was looking into implementing this. With Hercules, how would we create artifacts that we can use as a basis for comparison? It doesn't seem like Hercules publishes the bench.csv anywhere, or am I missing something?

srid (Member, Author) commented Jan 18, 2022

@nixinator @MatthewCroughan Could one of you help Emily here?

The nix flake check writes a file called bench.csv - we want Hercules to upload it somewhere. The check is run here: https://hercules-ci.com/accounts/github/Plutonomicon/derivations/%2Fnix%2Fstore%2F1nzsljyb7yv6fhjgg1p34kz1acrnm568-benchmark.drv/log?via-job=b6899b8f-ead0-43f6-90e0-5fba98df65c8

I vaguely recall seeing Hercules effects supporting artifact upload. Perhaps we can use that.

MatthewCroughan (Collaborator) commented Jan 18, 2022

We do not need to write the CSV to accomplish this. Doing so would not make for a good benchmark either, as the benchmark might be affected by different processes running on the CI agent at a future time. Instead, we can simply run two benchmarks in a single job, using two different revisions of the source code, and generate a new CSV that is the diff of the two. None of this requires writing files to any location; it is all just more stdout.
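
To make that shape of job concrete, here is a hedged sketch reusing the diffBench helper from the earlier sketch. The git checkout approach and the `cabal run benchmark` invocation (assumed to print bench.csv rows to stdout) are illustrative assumptions, not something this thread settled on:

```haskell
-- Hypothetical sketch: benchmark two revisions in one CI job
-- and emit only the diff on stdout, writing no files.
import System.Environment (getArgs)
import System.Process (callProcess, readProcess)

-- Check out a revision and capture its benchmark CSV from stdout.
benchAt :: String -> IO String
benchAt rev = do
  callProcess "git" ["checkout", "--quiet", rev]
  readProcess "cabal" ["run", "-v0", "benchmark"] ""  -- assumed target name

main :: IO ()
main = do
  [baseRev, prRev] <- getArgs  -- the two revisions to compare
  old <- benchAt baseRev
  new <- benchAt prRev
  putStr (diffBench old new)   -- diffBench from the earlier sketch
```

Here baseRev and prRev correspond to commit A (the branch point) and commit B (the PR head) from the issue description.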

srid (Member, Author) commented Jan 18, 2022

> the benchmark might be affected by different processes running on the CI agent at a future time.

Usually, yes - but for this project we don't care about that, because the specific benchmark metrics (code size, cpu/mem "budget") are calculated independently of the running machine's performance.

srid (Member, Author) commented Jan 19, 2022

Part of work: #144

MatthewCroughan linked a pull request Jan 19, 2022 that will close this issue
srid (Member, Author) commented Jan 24, 2022

@emiflake @MatthewCroughan How do I get to the benchmark diff for a PR, say #173?

emiflake (Collaborator) commented

> How do I get to the benchmark diff for a PR

Hah, this is what I've been asking. But AFAICT we just need to drop the `runIf`. I'll quickly check whether it works on a non-PR as well; if so, I'll make a PR for it :)

TotallyNotChase (Collaborator) commented

@srid is this still relevant?

L-as closed this as completed Feb 23, 2022