Documentation #1
When benchmarking local changes, I also find …
I think we should also have guidelines for benchmarks: …
Another issue: which timer function should be used? asv's default timer may not be adequate. Should we measure CPU time or wall-clock time? IMHO we should measure wall-clock time: if dask or distributed schedules tasks inefficiently and doesn't make full use of the CPU, that's a problem that should appear in the benchmark results.
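asv lets a time benchmark override the timer via a `timer` attribute, so a wall-clock measurement could look roughly like the sketch below (the benchmark class and workload are made up for illustration):

```python
import timeit

import dask.dataframe as dd
import pandas as pd


class TimeGroupbySum:
    # Hypothetical benchmark: use a wall-clock timer so that scheduling
    # inefficiency (idle CPUs) still shows up in the measured time.
    timer = timeit.default_timer

    def setup(self):
        df = pd.DataFrame({"a": range(10000), "b": [0, 1] * 5000})
        self.ddf = dd.from_pandas(df, npartitions=10)

    def time_groupby_sum(self):
        self.ddf.groupby("b").a.sum().compute()
```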
@TomAugspurger I'm interested in helping with this, partly as a way to become more familiar with the dask API. Is there anything in particular you would prefer me to target, to start?
@danielballan great, thanks! I'm guessing that @mrocklin, @jcrist, and Antoine have the most knowledge on which parts of dask would be best to benchmark. My current thinking is that we'll have two kinds of benchmarks: higher-level benchmarks that hit things like top-level methods on the collections, and lower-level benchmarks of the internals. I think the first kind will be easier to write benchmarks for as you learn the library (that's true for me anyway; ATM I have no idea how to write a good benchmark for something in the internals).
I agree with @TomAugspurger's classification of high-level external benchmarks and internal ones. I also agree that high-level external benchmarks are probably both the more useful and the more approachable. Actually, I'm curious if, as with all things, we can steal from Pandas a bit here. Are there benchmarks in Pandas that are appropriate to take? There are some extreme things we can test as well, such as doing groupby-applies with small dask dataframes with 1000 partitions, or calling … These should be good to stress the administrative side.
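A minimal sketch of that kind of administrative stress test, assuming a tiny dataframe split into many partitions (the benchmark name, sizes, and the groupby-apply workload here are illustrative, not taken from this repo):

```python
import dask.dataframe as dd
import pandas as pd


class TimeManyPartitions:
    # Hypothetical stress benchmark: the data is tiny, so almost all of the
    # measured time is dask graph construction and scheduling overhead.
    def setup(self):
        df = pd.DataFrame({"key": range(1000), "value": 1.0})
        self.ddf = dd.from_pandas(df, npartitions=1000)

    def time_groupby_apply(self):
        self.ddf.groupby("key").value.apply(
            lambda s: s.sum(), meta=("value", "f8")
        ).compute()
```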
Other question: I see a couple of existing benchmarks parameterize on the backend …
@pitrou for a bit, I was thinking these benchmarks could be helpful for users to see the overall performance characteristics of the various backends across different workloads. In hindsight it's probably best to keep this strictly for devs. I'll send along a PR to remove those when I get a chance. Been swamped lately.
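For context, this is roughly what such backend parameterization looks like using asv's `params` / `param_names` attributes; the benchmark name and workload below are invented, and `scheduler=` is dask's keyword for choosing a local backend:

```python
import dask.dataframe as dd
import pandas as pd


class TimeSumPerScheduler:
    # Hypothetical example of the backend parameterization being discussed:
    # asv runs the benchmark once per value in `params`.
    params = ["threads", "processes", "synchronous"]
    param_names = ["scheduler"]

    def setup(self, scheduler):
        df = pd.DataFrame({"a": range(100000)})
        self.ddf = dd.from_pandas(df, npartitions=10)

    def time_sum(self, scheduler):
        self.ddf.a.sum().compute(scheduler=scheduler)
```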
This is a sketch for some sections of documentation that should go in the README.
What to test?
Ideally, benchmarks measure how long our projects (dask, distributed) spend doing something, not how long the underlying libraries they're built on spend. We want to limit the variance across runs to just the code we control.
For example, I suspect
(self.data.a > 0).compute()
is not a great benchmark. My guess (without having profiled) is that the `.compute` part takes the majority of the time, most of which would be spent in pandas / NumPy. (I need to profile all these. I'm reading through dask now to find places where dask is doing a lot of work.)
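One hedged way to focus on dask's own overhead is to time only graph construction and optimization, stopping before execution; the class and workload below are illustrative only:

```python
import dask
import dask.dataframe as dd
import pandas as pd


class TimeGraphConstruction:
    # Hypothetical benchmark that stops before execution, so the measured
    # time is dominated by dask's graph building / optimization, not pandas.
    def setup(self):
        df = pd.DataFrame({"a": range(100000)})
        self.data = dd.from_pandas(df, npartitions=100)

    def time_build_and_optimize(self):
        expr = self.data.a > 0
        dask.optimize(expr)
```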
Benchmarking new Code

If you're writing an optimization, say, you can benchmark it by:

1. navigating to the relevant `benchmarks/` directory
2. changing the `repo` field in `asv.conf.json` to the path of your dask / distributed repository on your local file system (see the sketch after this list)
3. running `asv continuous -f 1.1 upstream/master HEAD`, optionally with a regex `-b <regex>` to filter to just your benchmark
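For the `repo` step, the relevant piece of `asv.conf.json` might look roughly like this (the path is just a placeholder for your local checkout; asv's config accepts `//` comments):

```json
{
    "version": 1,
    "project": "dask",
    // Point "repo" at your local checkout instead of the upstream URL
    // when benchmarking uncommitted or branch-local changes.
    "repo": "/path/to/your/dask"
}
```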
Naming Conventions

Directory Structure
This repository contains benchmarks for several dask-related projects. Each project needs its own benchmark directory because `asv` is built around one configuration file (`asv.conf.json`) and benchmark suite per repository.
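As a rough illustration only (the directory and project names here are assumptions, not necessarily this repo's actual layout), that one-config-per-project rule leads to a structure like:

```
dask-benchmarks/
├── dask/
│   ├── asv.conf.json
│   └── benchmarks/        # benchmark suite for dask itself
└── distributed/
    ├── asv.conf.json
    └── benchmarks/        # benchmark suite for distributed
```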