
Conversation

@glwagner
Member

This PR refactors the /benchmark directory and incorporates the benchmark scripts into CI. It removes test/benchmark_tests.jl.

Our ultimate objective is to run a curated set of benchmarks regularly and to develop a graphic that tracks how the benchmarks evolve over time as performance optimization work is incorporated. Running the benchmarks in CI will ensure that the scripts stay up to date with the current API. The benchmark scripts should also be runnable standalone for manual benchmarking. Eventually, we would also like to upload profile artifacts for inspection.

For now, I have moved the existing scripts in benchmarks into an "archive" folder. I think we should delete these, since they will be superseded. However, one question is how/whether we should also incorporate the code in benchmarks/src. I think this code is useful and nice for graphical display, but I also think there is a benefit to having simple benchmarks that consist of single scripts. @ali-ramadhan perhaps you can weigh in here since you developed that code originally.
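To illustrate what I mean by a single-script benchmark, here is a minimal sketch (not part of this PR; the grid size, architecture, and time step are illustrative placeholders):

```julia
# Minimal sketch of a standalone benchmark script. The grid size,
# architecture, and time step are illustrative placeholders.
using Oceananigans
using BenchmarkTools

grid = RectilinearGrid(CPU(); size=(64, 64, 64), extent=(1, 1, 1))
model = NonhydrostaticModel(; grid)

# Take one time step outside the benchmark to trigger compilation.
time_step!(model, 1e-3)

trial = @benchmark time_step!($model, 1e-3)
display(trial)
```

A script like this can be included in CI to catch API breakage and also run by hand for manual benchmarking.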

cc @simone-silvestri @giordano

@glwagner glwagner added the benchmark performance runs preconfigured benchmarks and spits out timing label Oct 24, 2025
@ali-ramadhan
Member

Love this! Most of the code in https://github.com/CliMA/Oceananigans.jl/blob/main/benchmark/src/Benchmarks.jl is there to help keep the benchmark scripts short, so it will probably still be useful if we plan on having multiple types of benchmarks.
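To make "keeping the scripts short" concrete, the kind of helper I mean has roughly the shape below. This is a generic sketch, not the actual Benchmarks.jl API, and the function and keyword names are hypothetical:

```julia
# Generic shape of a helper that keeps individual benchmark scripts short.
# Illustrative only: this is not the actual benchmark/src/Benchmarks.jl API.
using BenchmarkTools

function benchmark_over_sizes(setup, work; sizes = [(32, 32, 32), (64, 64, 64)])
    results = Dict{NTuple{3, Int}, BenchmarkTools.Trial}()
    for sz in sizes
        state = setup(sz)                  # e.g. build a grid and model
        results[sz] = @benchmark $work($state)
    end
    return results
end
```

With a helper like this, each benchmark script only needs to define `setup` and `work` for its particular configuration.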

Some things to think about:

  1. There are a lot of benchmarks in https://github.com/CliMA/Oceananigans.jl/tree/main/benchmark. Which are still worth benchmarking, and which are worth benchmarking regularly?
  2. To track the evolution of the benchmarks, will results be stored somewhere to be compared with future benchmarks? (One possibility is sketched after this list.)
  3. How clean and reproducible do we want the benchmarks to be? Ideally they would run on the same machine with the same GPU, with no other processes using the CPU or consuming significant amounts of memory. Benchmarking on CI servers like Nautilus may not produce the cleanest results, but if we record CPU and memory utilization for context then the benchmarks can be interpreted accordingly.
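On points 2 and 3, one possibility (a sketch of an approach, not a decision; the file names and context format are made up) is to serialize each trial with BenchmarkTools and record basic machine context alongside it, so a later run can be compared with `judge`:

```julia
# Sketch: persist a benchmark trial plus machine context for later comparison.
# File names and the context format here are hypothetical.
using BenchmarkTools

trial = @benchmark sum(rand(10^6))  # stand-in for an actual Oceananigans benchmark

BenchmarkTools.save("trial_current.json", trial)

# Record machine context so noisy CI results can be interpreted accordingly.
open("context_current.txt", "w") do io
    println(io, "CPU: ", Sys.cpu_info()[1].model)
    println(io, "Threads: ", Threads.nthreads())
    println(io, "Free memory (GB): ", Sys.free_memory() / 2^30)
end

# Later, compare against a stored baseline from a previous run:
baseline = BenchmarkTools.load("trial_baseline.json")[1]
println(judge(minimum(trial), minimum(baseline)))
```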

@glwagner
Member Author

  1. There are a lot of benchmarks in https://github.com/CliMA/Oceananigans.jl/tree/main/benchmark. Which are still worth benchmarking, and which are worth benchmarking regularly?

I am hoping we can have a system that lets us benchmark all the important situations with just a few scripts that can be run in CI to ensure they do not go stale. Do you think the existing benchmarks (the ones we care about) can be incorporated into a new framework?
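For concreteness, the CI entry point I have in mind is not much more than a loop over a short, curated list of scripts (the file names below are hypothetical):

```julia
# Hypothetical CI driver: run each curated benchmark script so CI fails
# whenever one of them goes stale against the current API.
curated_benchmarks = [
    "benchmark_nonhydrostatic_time_stepping.jl",  # hypothetical file names
    "benchmark_hydrostatic_time_stepping.jl",
]

for script in curated_benchmarks
    @info "Running benchmark script" script
    include(joinpath(@__DIR__, script))
end
```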
