wrappi is a C++ library for core events profiling based on PAPI hardware counters.
It targets multicore and manycore compute nodes running Linux.
It provides a simple and clean interface to retrieve:
- core cycles,
- cache hit/miss ratios,
- branch prediction,
- translation lookaside buffer misses,
- load/store instructions,
on a given section of the code, or for a set of compute kernels.
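As a rough illustration of what a hit/miss ratio means, the sketch below derives both ratios from two raw counter values. The struct and function names are hypothetical and not part of wrappi; the PAPI preset names in the comments are only examples of counters that could feed such a computation, and their availability depends on the hardware.

```cpp
#include <cstdint>

// Raw counts for one profiled section.
// Illustrative type, not taken from wrappi.
struct CacheCounts {
  std::uint64_t accesses;  // e.g. the PAPI preset PAPI_L1_DCA, where available
  std::uint64_t misses;    // e.g. the PAPI preset PAPI_L1_DCM, where available
};

// Fraction of accesses that missed the cache.
// Returns 0 when no access was counted, to avoid dividing by zero.
double miss_ratio(const CacheCounts& c) {
  if (c.accesses == 0) return 0.0;
  return static_cast<double>(c.misses) / static_cast<double>(c.accesses);
}

// Fraction of accesses that hit the cache (complement of the miss ratio).
double hit_ratio(const CacheCounts& c) {
  return 1.0 - miss_ratio(c);
}
```

For instance, 25 misses out of 100 accesses give a miss ratio of 0.25 and a hit ratio of 0.75.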
wrappi is almost standalone.
It requires a C++14 compiler with OpenMP support.
It can be built on any Linux distribution with PAPI installed, using CMake:
mkdir build # out-of-source build recommended
cd build
cmake .. # optionally -DCMAKE_BUILD_TYPE=[Debug|Release]
make -j4 # use multiple jobs for compilation
make install # optional, can use a prefix
wrappi is exported as a package.
To enable it, please update your CMakeLists.txt with:
find_package(wrappi) # in build or install tree
target_link_libraries(target PRIVATE wrappi) # replace 'target'
Then include wrappi.h in your application.
In a multicore context, you have to explicitly set thread-core affinity before profiling.
Threads should be bound to physical cores to prevent the OS from migrating them.
In addition, simultaneous multithreading (Hyper-Threading on Intel) should be disabled in this case.
Affinity can be set through environment variables:
export OMP_PLACES=cores OMP_PROC_BIND=close # with GNU or LLVM compiler
export KMP_AFFINITY=granularity=core,compact # with Intel compiler
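To make "binding a thread to a core" concrete, here is a minimal sketch of what pinning looks like at the OS level on Linux, using the glibc calls sched_getaffinity and sched_setaffinity. It is not how wrappi or the OpenMP runtime sets affinity; the function name is hypothetical, and the environment variables above remain the recommended way for OpenMP codes.

```cpp
#include <sched.h>  // sched_getaffinity, sched_setaffinity, CPU_* macros (glibc, Linux only)

// Hypothetical helper: pins the calling thread to the first core it is
// currently allowed to run on. Returns that core's index, or -1 on failure.
int pin_to_first_allowed_core() {
  cpu_set_t allowed;
  CPU_ZERO(&allowed);
  // pid 0 means "the calling thread".
  if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;

  for (int core = 0; core < CPU_SETSIZE; ++core) {
    if (!CPU_ISSET(core, &allowed)) continue;
    cpu_set_t only;
    CPU_ZERO(&only);
    CPU_SET(core, &only);  // mask containing this single core
    if (sched_setaffinity(0, sizeof(only), &only) != 0) return -1;
    return core;
  }
  return -1;  // the allowed mask was empty
}
```

Once pinned, the scheduler can no longer migrate the thread, so counter values read before and after a section refer to the same core.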
wrappi was designed with simplicity and ease of use in mind.
It can retrieve stats on each individual core as well as for all cores.
Here is a basic usage example:
using namespace wrappi;
Manager profile(Mode::Cache, nb);
for (int i = 0; i < nb; ++i) {
  profile.start(i);
  kernel[i].run();
  profile.stop(i);
}
profile.dump();
You can profile cycles, caches, instructions, TLB, or any other event supported by PAPI.
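If a kernel can return early or throw, the matching stop call may be skipped. One possible way around this is a small scope guard, sketched below. ScopedSection and CountingProfiler are hypothetical names that are not part of wrappi; the guard only assumes a profiler type exposing start(int) and stop(int), as in the usage example above.

```cpp
// Hypothetical helper, not part of wrappi: calls start(index) on construction
// and stop(index) on destruction, so the pair always stays balanced.
template <typename Profiler>
class ScopedSection {
 public:
  ScopedSection(Profiler& profiler, int index)
      : profiler_(profiler), index_(index) {
    profiler_.start(index_);
  }
  ~ScopedSection() { profiler_.stop(index_); }

  // Non-copyable: a copy would stop the same section twice.
  ScopedSection(const ScopedSection&) = delete;
  ScopedSection& operator=(const ScopedSection&) = delete;

 private:
  Profiler& profiler_;
  int index_;
};

// Stand-in profiler used only to exercise the guard without PAPI.
struct CountingProfiler {
  int started = 0;
  int stopped = 0;
  void start(int) { ++started; }
  void stop(int) { ++stopped; }
};
```

Assuming Manager::start and Manager::stop accept an int index, the loop body above could then become a block containing ScopedSection<Manager> section(profile, i); followed by kernel[i].run();.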
wrappi extends the initial work of Sean Chester.
Improvements are welcome.
To get involved, you can:
- report bugs or request features by submitting an issue.
- submit code contributions using feature branches and pull requests.
Enjoy! 😊