
Document benchmark process #66

Closed
jlperla opened this issue Jul 16, 2018 · 9 comments

@jlperla
Contributor

jlperla commented Jul 16, 2018

I put a placeholder in https://github.com/econtoolkit/tutorials/blob/master/julia/benchmark_regressions.md which shows a basic approach, but I think it is just a starting point.

In particular, https://github.com/econtoolkit/tutorials/blob/master/julia/benchmark_regressions.md#using-and-serializing-benchmark-groups is a placeholder to put the "real" way to do it.

  • My main concern is keeping track of many benchmarks, as I don't think that saving a bunch of JSON files is the way to go. It is just too error-prone to load up and compare the wrong files.
  • My guess is that this can be done in a very simple way with BenchmarkGroups (see https://github.com/JuliaCI/BenchmarkTools.jl/blob/master/doc/manual.md#the-benchmarkgroup-type ) to keep track of these things, run them, and compare.
  • See https://github.com/JuliaCI/BenchmarkTools.jl/blob/master/doc/manual.md#trialratio-and-trialjudgement , but maybe there is a way to get the trial ratio/judgement for an entire benchmark group at once.
  • Then the idea is that you would define the BenchmarkGroup, run it, load the previous benchmark results from disk, and run the judgement (a sketch of this workflow is below).
  • When we store the benchmark results, in whatever form, we do not want to store the whole JSON for everything, which is too much data. The Trial or TrialEstimate of the median, or something like that, is all we really want.
  • We should figure out the workflow of this for GitHub (since we obviously don't want to overwrite the file every time we run it). Maybe have a binary toggle, say overwritebenchmarks, which takes the latest run of the benchmark group and saves it over the current file?
  • You can ask around to see if there is a canonical place to put these files. Maybe test/runbenchmarks.jl? You definitely do not want this to run inside of the CI, so runtests.jl should not call it.

My hope is that we can keep this as simple as possible so that we can teach everyone to do it in their own projects.
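A minimal sketch of what that workflow could look like in a test/runbenchmarks.jl (the function f, the group keys, and the reference file name are placeholders for illustration, not decisions):

```julia
# test/runbenchmarks.jl -- sketch only; runtests.jl should NOT include this file.
using BenchmarkTools

const overwritebenchmarks = false                               # flip to true to save a new reference
const reffile = joinpath(@__DIR__, "benchmark_reference.json")  # placeholder file name

f(n) = sum(sqrt(i) for i in 1:n)   # placeholder for the package's real code

# Define the benchmark group
suite = BenchmarkGroup()
suite["f"] = BenchmarkGroup()
suite["f"]["small"] = @benchmarkable f(10)
suite["f"]["large"] = @benchmarkable f(10_000)

tune!(suite)
results = median(run(suite; verbose = true))   # keep only the median TrialEstimates, not full Trials

if overwritebenchmarks || !isfile(reffile)
    BenchmarkTools.save(reffile, results)      # overwrite the stored reference
else
    reference = BenchmarkTools.load(reffile)[1]
    display(judge(results, reference))         # a TrialJudgement for each benchmark
end
```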

Where to do it:

@arnavs
Collaborator

arnavs commented Jul 19, 2018

@jlperla I took a stab at this. It could be more tedious than you'd like (i.e., boring/wordy for some students) but I wanted to err on the side of helping noobs. If it's too much, just let me know what to cut.

Points re the above:

  • Functions like median(), which are ordinarily only defined on Trials, are also defined on BenchmarkGroup objects. However, they just map each entry of the BenchmarkGroup to its value under the statistic. I can't see a way to condense the whole group into one statistic without writing a recursive function to trawl the dictionary and do the job (a rough version is sketched after this list). Let me know if you'd like me to take a shot at that.

  • Regarding GitHub: I think that's a good approach. Another way to do this is to have one file, runbenchmarks.jl, which runs the suite and writes the output to the terminal (STDOUT or whatever it's called), and another file, savebenchmarks.jl, which runs it and writes to a file (a skeleton is below). We could provide a standard template for people to fill in.
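For the first point, a rough sketch of that recursive function, collapsing a (possibly nested) BenchmarkGroup of Trials into a single number (here the total of the median times):

```julia
using BenchmarkTools

# Sum the median times (in nanoseconds) of every leaf benchmark in the group,
# recursing through nested BenchmarkGroups.
function total_median_time(group::BenchmarkGroup)
    total = 0.0
    for (_, v) in group
        if v isa BenchmarkGroup
            total += total_median_time(v)   # recurse into nested groups
        else
            total += time(median(v))        # v is a Trial; take its median time
        end
    end
    return total
end
```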

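And for the second point, a skeleton of the savebenchmarks.jl half (the shared benchmarks.jl file that defines suite is hypothetical):

```julia
# savebenchmarks.jl -- run the suite and write the median results to disk.
# runbenchmarks.jl would include the same (hypothetical) benchmarks.jl and just display the results.
using BenchmarkTools
include("benchmarks.jl")   # assumed to define `suite::BenchmarkGroup`

results = median(run(suite; verbose = true))
BenchmarkTools.save(joinpath(@__DIR__, "benchmark_reference.json"), results)
```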
@arnavs
Collaborator

arnavs commented Jul 19, 2018

Side note: I think one issue is making sure that a given JSON/results file corresponds to a given state of the package. Otherwise, you can’t answer “what am I comparing?”

I wonder how these guys solve it. This could be a good way to get people to start versioning their personal projects.

@arnavs
Collaborator

arnavs commented Jul 19, 2018

@jlperla Removed things from the tutorial as we discussed (the initial "quick and dirty" section, and subsetting), and added a small worked example in the ubcecon/julia_sandbox/benchmarking_example folder. So I'll close this, but if there are any issues, we can re-open it.

@arnavs arnavs closed this as completed Jul 19, 2018
@jlperla
Contributor Author

jlperla commented Jul 19, 2018

After getting the runbenchmarks.jl template file into the document as I sketched out, maybe add Expectations.jl as a test with just a few benchmarks (not a full suite) as an example.

@jlperla jlperla reopened this Jul 19, 2018
@arnavs
Collaborator

arnavs commented Jul 19, 2018

Done.

@jlperla
Contributor Author

jlperla commented Jul 19, 2018

I think this is completed? Although, for what it is worth, I think that https://github.com/econtoolkit/Expectations.jl/blob/master/test/runbenchmarks.jl#L10 is supposed to have the $E1 to splice it in; otherwise the expectation becomes a global?

I wonder if it is just easier to tell people to create mini functions most of the time for these sorts of benchmarks...
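For reference, a small sketch of the difference, assuming Expectations.jl's expectation(Normal()) gives the callable object bound to E1 as in the linked file:

```julia
using BenchmarkTools, Distributions, Expectations

E1 = expectation(Normal())   # a callable object bound to a non-constant global
f(x) = x^2

@btime E1(f)    # the lookup of the global E1 is timed along with the call
@btime $E1(f)   # splicing the object in benchmarks only the call itself
```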

@jlperla jlperla closed this as completed Jul 19, 2018
@arnavs
Collaborator

arnavs commented Jul 19, 2018

@jlperla You know, I was debating splicing in the expectations. I figured it was okay for the same reason that functions like squared are OK (I think there are examples of that stuff in the docs for BenchmarkTools), and that too much splicing would make it look weird. Can add it in though.

@jlperla
Contributor Author

jlperla commented Jul 19, 2018

I think it needs to be spliced because it isn't actually a function... it just looks like one.
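A toy illustration of that point (not Expectations.jl's actual definition): the call syntax dispatches to a method defined on the struct, but the object itself is just a value, so it does not get the constant treatment a top-level function definition would.

```julia
# A callable object: instances of ToyExpectation can be called like functions.
struct ToyExpectation{TN, TW}
    nodes::TN
    weights::TW
end
(E::ToyExpectation)(f) = sum(w * f(x) for (w, x) in zip(E.weights, E.nodes))

E = ToyExpectation([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25])
E(x -> x^2)   # looks like a function call, but E is an ordinary (non-constant) global binding
```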

@arnavs
Collaborator

arnavs commented Jul 19, 2018

You're right. Callable objects... I'll update the docs to add that in.
