Continuous benchmarking of workload motifs #704
Replies: 4 comments
-
The Hail team has a nice benchmark suite that we may want to borrow from.
-
There are many Dask benchmark suites that may also provide inspiration.
-
And of course the Scalable Linear Algebra Benchmark suite from "A comparative evaluation of systems for scalable linear algebra-based analytics" (2018).
-
(Posted by @jeromekelleher) Great idea. For genetic variation data, I'd suggest using simulations from stdpopsim, so that we can capture expected patterns of variation across a few different species. We can also easily simulate very large datasets, which lets us run benchmarks at scale without having to ship around lots of data. Converting the tskit tree sequence output of stdpopsim to Python genotype arrays is straightforward using the `variants` method. I'm happy to help with setting this up.
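For illustration, here is a minimal sketch of that path. The species, model, contig, and sample choices are arbitrary, and stdpopsim call signatures vary between versions, so treat this as a shape of the workflow rather than exact code:

```python
# A minimal sketch of the stdpopsim -> genotype array path.
# Model/contig/sample choices are illustrative; check current stdpopsim docs.
import stdpopsim

species = stdpopsim.get_species("HomSap")
model = species.get_demographic_model("OutOfAfrica_3G09")
contig = species.get_contig("chr22")
samples = {"YRI": 50, "CEU": 50, "CHB": 50}  # samples per population
engine = stdpopsim.get_engine("msprime")
ts = engine.simulate(model, contig, samples)  # a tskit.TreeSequence

# tskit's variants() yields one Variant per site, carrying a numpy
# genotype array over the sample chromosomes.
for variant in ts.variants():
    genotypes = variant.genotypes  # shape: (num_sample_chromosomes,)
```

Because the data is simulated from a seeded model, a benchmark run only needs to ship the simulation parameters, not the genotypes themselves.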
-
Once we have a good understanding of the workload we'd like our toolkit to handle, it would be useful to extract a small set of synthetic tasks that capture the performance-critical aspects of that workload. We can then run these tasks regularly and track the results over time; a rough sketch of such a harness appears below.
Note that these kinds of tasks are often called "dwarfs", but they are also known as "motifs", and I'd prefer to use that term as it's less offensive.
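For illustration, a minimal sketch of such a harness. All names here are hypothetical and not part of any existing suite; in practice an established tool like airspeed velocity (asv) would likely do this job, but the sketch shows the shape of the loop: time each motif, append a timestamped result, and let a scheduled job (e.g. nightly CI) accumulate a performance history:

```python
# Hypothetical recurring "motif" benchmark harness (illustrative only).
import csv
import time
from datetime import datetime, timezone
from pathlib import Path

import numpy as np


def allele_count_motif() -> None:
    """Toy motif: per-site alternate allele counts on a random genotype matrix."""
    rng = np.random.default_rng(seed=0)  # fixed seed so runs are comparable
    genotypes = rng.integers(0, 2, size=(100_000, 200), dtype=np.int8)
    genotypes.sum(axis=1)


MOTIFS = {"allele_count": allele_count_motif}
RESULTS = Path("benchmark_results.csv")


def run_benchmarks() -> None:
    """Time each motif and append a timestamped row to the results file."""
    is_new = not RESULTS.exists()
    with RESULTS.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["timestamp", "motif", "seconds"])
        for name, task in MOTIFS.items():
            start = time.perf_counter()
            task()
            elapsed = time.perf_counter() - start
            writer.writerow(
                [datetime.now(timezone.utc).isoformat(), name, f"{elapsed:.4f}"]
            )


if __name__ == "__main__":
    run_benchmarks()
```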