# ArcticDB Performance Profiling
This page describes different methods of profiling the performance of ArcticDB.

## ASV benchmarks

Using the existing benchmarks and adding new ones is a great way to track the performance of various parts of ArcticDB over time. For more information regarding ASV benchmarking, please refer to this page in the Wiki.
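For orientation, an ASV benchmark is just a Python class whose `time_*` methods get timed. Below is a minimal, hypothetical sketch; the LMDB URI, library name, symbols, and data shapes are illustrative rather than taken from the existing suites, which remain the authoritative reference for the conventions used in this repository.

```python
# Hypothetical ASV-style benchmark sketch (not one of the real suites).
import numpy as np
import pandas as pd
from arcticdb import Arctic


class TimeBasicReadWrite:
    def setup(self):
        # setup() runs before the timed methods and is excluded from the timings.
        self.ac = Arctic("lmdb://asv_example_db")  # illustrative URI
        self.lib = self.ac.get_library("bench", create_if_missing=True)
        self.df = pd.DataFrame(np.random.rand(100_000, 10))
        self.lib.write("existing_symbol", self.df)

    def time_write(self):
        self.lib.write("new_symbol", self.df)

    def time_read(self):
        self.lib.read("existing_symbol")
```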
## Py-spy

Py-spy is a great tool for profiling Python code and libraries that rely on C++ extensions. To install it, run:

```
pip install py-spy
```
To access the full suite of options, it is recommended to run it on Linux and to create a speedscope file as output. You should also compile the library in debug mode for the most representative profiling results.
To profile the code, you will need to profile by pid, like so:

```
# Spawn python in a new process; the shell will print the pid of that process.
python test_script.py &

# --format speedscope : set the output format to speedscope
# --output test.json  : set the name of the output file with the profiling info
# --native            : capture profiling data from the native (C++) function calls
# --idle              : capture data for idle threads
# --pid 1111          : the pid of the python process that you want to profile
py-spy record --format speedscope --output test.json --native --idle --pid 1111
```
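The script being profiled can be any workload that spends time in the native extension. A minimal sketch of such a `test_script.py` is shown below; the LMDB URI, library name, symbols, and data sizes are placeholders chosen for illustration.

```python
# Hypothetical test_script.py: a workload that spends most of its time in the
# C++ extension, so py-spy's --native samples have something to show.
import numpy as np
import pandas as pd
from arcticdb import Arctic

ac = Arctic("lmdb://py_spy_example_db")  # illustrative URI
lib = ac.get_library("profiling", create_if_missing=True)

df = pd.DataFrame(np.random.rand(1_000_000, 10))
for i in range(20):
    lib.write(f"symbol_{i}", df)
    lib.read(f"symbol_{i}")
```

Replace `--pid 1111` in the py-spy command with the pid printed when the script was started in the background.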
To examine the output, simply pass the generated JSON file to speedscope (BEWARE: don't upload profiles to the website if they contain sensitive data; you can run speedscope locally instead, see here). You can use this example json to play around with speedscope. You can view the output in different ways to get a different picture of the performance:
- Time Order - shows the function calls over time
- Left Heavy - groups similar call stacks together
- Sandwich - shows the most time-consuming functions
## C++ performance logging

We have functionality that samples various parts of the C++ portion of ArcticDB. Those samples produce logs to stdout, which detail how much time various functions took.

By default, this functionality is disabled and requires a recompilation to be enabled. To enable it, add `add_compile_definitions(ARCTICDB_LOG_PERFORMANCE)` to `cpp/CMakeLists.txt` and recompile. On the next run, logs from the samples will be printed to stdout.

To add new samples, refer to how `ARCTICDB_SAMPLE` and `ARCTICDB_SUBSAMPLE` are used throughout the code.
## perf and FlameGraph

This section details how to profile the code in the Linux dev container (IDE agnostic).

- Enable the `linux-release - build-release` CMake profile, and build the `arcticdb_ext` target with this profile.
- Ensure the symlink in the `python` directory points to the release `.so` file:

  ```
  ln -s ../cpp/out/linux-release-build/arcticdb/arcticdb_ext.cpython-36m-x86_64-linux-gnu.so
  ```

- Clone the FlameGraph project into `/opt`.
- From the `python` directory, run:

  ```
  perf record -g --call-graph="dwarf" python <path to Python script to profile> && \
  perf script | /opt/FlameGraph/stackcollapse-perf.pl > /tmp/out.perf-folded && \
  /opt/FlameGraph/flamegraph.pl /tmp/out.perf-folded > flamegraph-$(date -Iseconds).svg
  ```

- Open the resulting SVG file.

Notes:

- `--call-graph="fp"` gives less detail, but does result in much smaller (GB → MB) perf.data files.
- The x-axis in the resulting SVG is not time. When profiling time spent in either the CPU or IO thread pools, each thread in the pool will have its own section.