Standardize benchmark code of arctic #545

Open
dimosped opened this issue Apr 20, 2018 · 0 comments

dimosped commented Apr 20, 2018

We currently have multiple unrelated benchmarks covering various scenarios:

  • generic Arctic top-level calls
  • a draft Arctic breakdown solution for tracking where time is spent ((de)compression, numpy, serialization, MongoDB IO)
  • draft Arrow serialization benchmarks

The goal is to create a standard API for benchmarks:

  • requirements

    • specify experiment scenarios in an easy way (e.g. a DSL, or just a dict for fixed steps; see the sketch after this list)
    • collection of results
    • plotting
    • break timings down into components (e.g. compression, numpy object creation, serialization, Mongo IO)
    • ensure there is no performance impact when benchmark mode is disabled
    • reproducible benchmarks
  • goals

    • understand our code's bottlenecks
    • have a standard way to perform and repeat benchmarks
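
One way to meet both the scenario-specification and the zero-overhead requirements is a small recorder whose timing context manager becomes a no-op when benchmark mode is disabled. This is only a minimal sketch: `BenchmarkRecorder`, `measure`, and the scenario dict keys are hypothetical names for illustration, not part of Arctic's API.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class BenchmarkRecorder(object):
    """Collects per-component timings; a near-zero-cost no-op when disabled."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self.timings = defaultdict(list)  # component name -> list of seconds

    @contextmanager
    def measure(self, component):
        # When benchmark mode is disabled, yield immediately so the
        # instrumented code path pays almost nothing for the hook.
        if not self.enabled:
            yield
            return
        start = time.time()
        try:
            yield
        finally:
            self.timings[component].append(time.time() - start)


# A fixed-steps scenario expressed as a plain dict -- the simplest form
# of the "DSL or just a dict" requirement. All keys are illustrative.
scenario = {
    'name': 'versionstore_write_read',
    'library': 'bench.vs',
    'repeats': 5,
    'components': ['serialize', 'compress', 'mongo_io'],
}

recorder = BenchmarkRecorder(enabled=True)
with recorder.measure('compress'):
    pass  # e.g. the LZ4 compression step of a write would go here
print(dict(recorder.timings))
```

Keeping the enabled/disabled check inside `measure` means the instrumentation can stay in the production code path permanently, which also helps make benchmarks reproducible across runs.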

A skeleton of benchmarks exists in the top-level directory, benchmarks.
There are some very basic examples and a readme (https://github.com/manahl/arctic/blob/master/benchmarks.md), but these should be expanded to cover all the storage engines and some more involved use cases and examples (e.g. chunkstore with numerics only vs. chunkstore with strings, version store with pickled objects, etc.). A sketch of one such benchmark follows.
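
As a concrete starting point, a chunkstore suite contrasting numeric-only and string data might look like the sketch below. It assumes an asv-style layout (setup/teardown plus time_* methods), a MongoDB instance reachable on localhost, and illustrative library names, symbols, and data sizes; the Arctic calls themselves (initialize_library, write, read, delete_library) are the public chunkstore API.

```python
import numpy as np
import pandas as pd
from arctic import Arctic, CHUNK_STORE


class ChunkstoreNumericVsStrings(object):
    """Times chunkstore round trips for numeric-only vs. string DataFrames."""

    def setup(self):
        # Assumes a MongoDB instance reachable on localhost.
        self.store = Arctic('localhost')
        self.store.initialize_library('bench.chunkstore', lib_type=CHUNK_STORE)
        self.lib = self.store['bench.chunkstore']
        # Chunkstore's default chunker keys on a 'date' index.
        idx = pd.date_range('2016-01-01', periods=100000, freq='S', name='date')
        self.numeric_df = pd.DataFrame({'price': np.random.rand(100000)}, index=idx)
        self.string_df = pd.DataFrame(
            {'ticker': ['SYM%d' % (i % 100) for i in range(100000)]}, index=idx)
        self.lib.write('numeric', self.numeric_df)
        self.lib.write('strings', self.string_df)

    def teardown(self):
        self.store.delete_library('bench.chunkstore')

    def time_write_numeric(self):
        self.lib.write('numeric_w', self.numeric_df)

    def time_write_strings(self):
        self.lib.write('strings_w', self.string_df)

    def time_read_numeric(self):
        self.lib.read('numeric')

    def time_read_strings(self):
        self.lib.read('strings')
```

Equivalent suites for version store (including pickled objects) and tickstore would follow the same pattern, which keeps results comparable across storage engines.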
