Commit bbdf0e2

joanaxcruz authored and blapie committed
Integrate Google benchmarks into SLEEF
Added a new benchmark tool to the SLEEF project using the googlebench framework. In theory this tool can benchmark any unary or binary function in SLEEF. Benchmarking is enabled for all functions listed in benchsleef.cpp. This list uses the macros in benchmark_callers.hpp, so that benchmarks in multiple precisions can be enabled with a single line of code. It is also possible to list the exact function we want, as each of the macros in benchmark_callers.hpp can be called independently and combined like building blocks.

The tool is integrated with SLEEF via CMake, meaning it can be built automatically when SLEEF is built. To enable that, pass the CMake argument -DSLEEF_BUILD_BENCH=ON. This tool depends on the C++17 standard.

Tested on aarch64 for scalar, vector and SVE routines. Tested on x86 for different vector length extensions. Tested with llvm-17, gcc-11 and gcc-14.
1 parent 686f2ce commit bbdf0e2

11 files changed

+612
-2
lines changed

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ option(SLEEF_BUILD_QUAD "libsleefquad will be built." OFF)
 option(SLEEF_BUILD_GNUABI_LIBS "libsleefgnuabi will be built." ON)
 option(SLEEF_BUILD_SCALAR_LIB "libsleefscalar will be built." OFF)
 option(SLEEF_BUILD_TESTS "Tests will be built." ON)
+option(SLEEF_BUILD_BENCH "Bench will be built." OFF)
 option(SLEEF_BUILD_INLINE_HEADERS "Build header for inlining whole SLEEF functions" OFF)
 
 option(SLEEF_TEST_ALL_IUT "Perform tests on implementations with all vector extensions" OFF)

docs/1-user-guide/build-with-cmake/README.md

Lines changed: 4 additions & 0 deletions
@@ -162,6 +162,10 @@ optimized, or any other special set of flags.
 - `SLEEF_LLVM_AR_COMMAND` : Specify LLVM AR command when you build the library with thinLTO support with clang.
 - `SLEEF_ENABLE_LLVM_BITCODE` : Generate LLVM bitcode
 
+### Benchmarks
+
+- `SLEEF_BUILD_BENCH` : Build benchmark tool if set to TRUE
+
 ### Tests
 
 - `SLEEF_BUILD_TESTS` : Avoid building testing tools if set to FALSE

docs/4-tools/README.md

Lines changed: 54 additions & 2 deletions
@@ -92,7 +92,7 @@ In some cases, it is desirable to fix the last few coefficients to values like
 
 Finding a set of good parameters is not a straightforward process.
 
-<h2 id="benchmark">Benchmarking tool</h2>
+<h2 id="benchmark-legacy">Legacy Benchmarking tool</h2>
 
 SLEEF has a tool for measuring and plotting execution time of each function in
 the library. It consists of an executable for measurements, a makefile for
@@ -162,7 +162,7 @@ Install these with:
 ```sh
 sudo apt install openjdk-19-jdk-headless
 ```
-and
+and
 ```sh
 sudo apt install gnuplot
 ```
@@ -171,3 +171,55 @@ Four graphs are generated : trigdp.png, nontrigdp.png, trigsp.png
 and nontrigsp.png. Please see our [benchmark results](../5-performance/) for
 an example of generated graphs by this tool.
 
+<h2 id="benchmark">Benchmarking tool</h2>
+
+This tool uses the [googlebench](https://github.com/google/benchmark) framework to benchmark SLEEF
+functions.
+It is integrated with SLEEF via CMake.
+In order to build this tool automatically when SLEEF is
+built, pass the `-DSLEEF_BUILD_BENCH=ON` CMake option when
+setting up the build directory:
+```sh
+cmake -S . -B build -DSLEEF_BUILD_BENCH=ON
+```
+After building SLEEF:
+```sh
+cmake --build build -j
+```
+in the `build/bin` folder you will find an executable named
+`benchsleef128`.
+Run this executable with `./build/bin/benchsleef128` in
+order to obtain microbenchmarks for the functions in the project.
+A filter option can also be provided to the executable.
+This feature is inherited from googlebench: it takes
+a regular expression and executes only the benchmarks
+whose name matches the regular expression.
+The set of all available benchmarks can be obtained
+by running the benchmark tool with no filter set,
+and corresponds to all the benchmarks listed in
+`benchsleef.cpp`.
+```sh
+# Examples:
+# * This will benchmark Sleef_sinf_u10 on all intervals enabled in the tool.
+./build/bin/benchsleef128 --benchmark_filter=sinf_u10
+# * This will benchmark all single precision sin functions (scalar, vector and sve if available):
+./build/bin/benchsleef128 --benchmark_filter=sinf
+# * This will benchmark all single precision vector functions:
+./build/bin/benchsleef128 --benchmark_filter=vectorf
+```
+Note: in this context, "all" means all functions available in SLEEF and enabled in the benchmarks.
+<h3 id="benchmark">Benchmarking on aarch64</h3>
+If you're running SLEEF on a machine with SVE support, the generated executable will have SVE benchmarks
+available for the functions specified in `benchsleef.cpp`.
+<h3 id="benchmark">Benchmarking on x86</h3>
+If you're running SLEEF on an x86 machine, two extra
+executables may be built (according to feature detection):
+
+```sh
+./build/bin/benchsleef256
+./build/bin/benchsleef512
+```
+These will benchmark the 256-bit and 512-bit vector implementations
+of vector functions, respectively.
+Note that these executables can also be used to benchmark scalar
+functions.
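
Reviewer sketch (not part of the commit): the filter examples in this README rely on the regular expression matching anywhere inside a benchmark name. The snippet below mimics that with `std::regex_search` on a few sample names built from the `MB_<function>_<type>_<min>_<max>` pattern that `BENCH` generates. The helpers `filter_names` and `sample_names`, and the `0_6.28` intervals, are hypothetical and only illustrate the matching; they are not how googlebench is implemented.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: keep the names that contain a match for `filter`.
// This assumes partial (search) matching, as in the README examples.
inline std::vector<std::string> filter_names(
    const std::vector<std::string>& names, const std::string& filter) {
  const std::regex re(filter);
  std::vector<std::string> selected;
  for (const auto& name : names) {
    if (std::regex_search(name, re)) selected.push_back(name);
  }
  return selected;
}

// Illustrative names following the pattern produced by the BENCH macro.
// The interval part (0_6.28) is an assumption made for this sketch.
inline std::vector<std::string> sample_names() {
  return {"MB_Sleef_sinf_u10_scalarf_0_6.28",
          "MB_Sleef_sinf4_u10_vectorf128_0_6.28",
          "MB_Sleef_sin_u10_scalard_0_6.28",
          "MB_Sleef_sind2_u10_vectord128_0_6.28"};
}
```

With these sample names, `sinf_u10` selects only the scalar single-precision entry, `sinf` selects both single-precision entries, and `vectorf` selects only the single-precision vector entry.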

src/CMakeLists.txt

Lines changed: 4 additions & 0 deletions
@@ -7,6 +7,10 @@ if (SLEEF_BUILD_TESTS AND NOT MINGW)
 endif()
 add_subdirectory("common")
 
+if (SLEEF_BUILD_BENCH)
+  add_subdirectory("benchmarks")
+endif()
+
 if (SLEEF_BUILD_DFT)
   add_subdirectory("dft")
   if (SLEEF_BUILD_TESTS)

src/benchmarks/CMakeLists.txt

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+# CMakeList.txt : Microbenchmarks with google bench
+cmake_minimum_required (VERSION 3.11)
+
+project ("SLEEF Microbenchmarks")
+
+find_package(Threads)
+# Apply CMake options in Installation guide in https://github.com/google/benchmark
+include(ExternalProject)
+find_package(Git REQUIRED)
+ExternalProject_Add(googlebenchmark
+  GIT_REPOSITORY https://github.com/google/benchmark.git
+  GIT_TAG origin/main
+  CMAKE_ARGS -DBENCHMARK_DOWNLOAD_DEPENDENCIES=ON
+             -DCMAKE_BUILD_TYPE=Release
+             -DCMAKE_INSTALL_PREFIX=${CMAKE_BINARY_DIR}/googlebench
+             -DBENCHMARK_ENABLE_GTEST_TESTS=OFF
+)
+include_directories(${CMAKE_BINARY_DIR}/googlebench/include)
+link_directories(${CMAKE_BINARY_DIR}/googlebench/lib)
+
+# include headers
+include_directories(${sleef_BINARY_DIR}/include) # sleef.h
+# include libs
+link_directories(${sleef_BINARY_DIR}/lib) # libsleef
+
+
+set(Extra_CFLAGS -Wall -O3 -Wno-attributes)
+set(BENCH_SRC_FILE "benchsleef.cpp" "benchmark_callers.hpp" "benchmark_templates.hpp" "gen_input.hpp" "type_defs.hpp")
+set(BENCH_PROPERTIES C_STANDARD 99 CXX_STANDARD 17)
+set(BENCH_LIBS benchmark sleef Threads::Threads) # Link Google Benchmark and sleef to the project
+
+# Add source to this project's executable.
+add_executable (benchsleef128 ${BENCH_SRC_FILE})
+set_target_properties(benchsleef128 PROPERTIES ${BENCH_PROPERTIES})
+target_compile_options(benchsleef128 PRIVATE ${Extra_CFLAGS} -march=native)
+target_link_libraries(benchsleef128 ${BENCH_LIBS})
+add_dependencies(benchsleef128 googlebenchmark)
+
+if(CMAKE_SYSTEM_PROCESSOR MATCHES "(x86)|(X86)|(amd64)|(AMD64)")
+  add_executable (benchsleef256 ${BENCH_SRC_FILE})
+  set_target_properties(benchsleef256 PROPERTIES ${BENCH_PROPERTIES})
+  target_compile_options(benchsleef256 PRIVATE ${Extra_CFLAGS} "-march=native" "-DARCH_VECT_LEN=256")
+  target_link_libraries(benchsleef256 ${BENCH_LIBS})
+  add_dependencies(benchsleef256 googlebenchmark)
+
+  add_executable (benchsleef512 ${BENCH_SRC_FILE})
+  set_target_properties(benchsleef512 PROPERTIES ${BENCH_PROPERTIES})
+  target_compile_options(benchsleef512 PRIVATE ${Extra_CFLAGS} "-mavx512f" "-DARCH_VECT_LEN=512")
+  target_link_libraries(benchsleef512 ${BENCH_LIBS})
+  add_dependencies(benchsleef512 googlebenchmark)
+endif()

src/benchmarks/README.md

Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,107 @@
1+
<h2 id="benchmark">Benchmarking tool</h2>
2+
3+
This tool uses the [googlebench](https://github.com/google/benchmark) framework to benchmark SLEEF
4+
functions.
5+
It is integrated with SLEEF via CMake.
6+
In order to build this tool automatically when SLEEF is
7+
built, pass the `-DSLEEF_BUILD_BENCH=ON` CMake option when
8+
setting up the build directory:
9+
```sh
10+
cmake -S . -B build -DSLEEF_BUILD_BENCH=ON
11+
```
12+
After building SLEEF:
13+
```sh
14+
cmake --build build -j
15+
```
16+
in `build/bin` folder you will find an executable named
17+
benchsleef128.
18+
Run this executable with `./build/bin/benchsleef128` in
19+
order to obtain microbenchmarks for the functions in the project.
20+
A filter option can also be provided to the executable.
21+
This feature in inherited from googlebench, and takes
22+
a regular expression, and executes only the benchmarks
23+
whose name matches the regular expression.
24+
The set of all the benchmarks available can be obtained
25+
when running the benchmark tool when no filter is set
26+
and corresponds to all the benchmarks listed in
27+
`benchsleef.cpp`.
28+
```sh
29+
# Examples:
30+
# * This will benchmark Sleef_sinf_u10 on all intervals enabled in the tool.
31+
./build/bin/benchsleef128 --benchmark_filter=sinf_u10
32+
# * This will benchmark all single precision sin functions (scalar, vector and sve if available):
33+
./build/bin/benchsleef128 --benchmark_filter=sinf
34+
# * This will benchmark all single precision vector functions:
35+
./build/bin/benchsleef128 --benchmark_filter=vectorf
36+
```
37+
Note: all corresponds to all functions available in SLEEF and enabled in the benchmarks in this context.
38+
<h3 id="benchmark">Benchmarking on aarch64</h3>
39+
If you're running SLEEF on a machine with SVE support the executable generated will have SVE benchmarks
40+
available for functions specified in `benchsleef.cpp`.
41+
<h3 id="benchmark">Benchmarking on x86</h3>
42+
If you're running SLEEF on an x86 machine, two extra
43+
executables may be built (according to feature detection):
44+
45+
```sh
46+
./build/bin/benchsleef256
47+
./build/bin/benchsleef512
48+
```
49+
50+
These will benchmark 256bit and 512bit vector implementations
51+
for vector functions respectively.
52+
Note these executables can also be used to benchmark scalar
53+
functions.
54+
55+
<h3 id="benchmark">Maintenance</h3>
56+
Some functions are still not enabled in the benchmarks.
57+
In order to add a function which uses the types already
58+
declared in `type_defs.hpp`, add a benchmark entry using
59+
the macros declared in `benchmark_callers.hpp`.
60+
These macros have been designed to group benchmarking
61+
patterns observed in the previous benchmarking system,
62+
and minimize the number of lines of code while preserving
63+
readability as much as possible.
64+
65+
Examples:
66+
67+
(1) If a scalar float lower ulp precision version of
68+
log1p gets implemented at some point in SLEEF one could
69+
add benchmarks for it by adding a line to `sleefbench.cpp`:
70+
```cpp
71+
BENCH(Sleef_log10f_u35, scalarf, <min>, <max>)
72+
```
73+
This line can be repeated to provide benchmarks on
74+
multiple intervals.
75+
76+
(2) If the double precision of the function above gets
77+
implemented as well then, we can simply add:
78+
```cpp
79+
BENCH_SCALAR(log10, u35, <min>, <max>)
80+
```
81+
which would be equivalent to adding:
82+
```cpp
83+
BENCH(Sleef_log10f_u35, scalarf, <min>, <max>)
84+
BENCH(Sleef_log10_u35, scalard, <min>, <max>)
85+
```
86+
If the function you want to add does not use the types in
87+
`type_defs.hpp`, extend this file with the types required
88+
(and ensure type detection is implemented correctly).
89+
Most likely you will also have to make some changes to
90+
`gen_input.hpp`:
91+
* Add adequate declaration for `vector_len`:
92+
```cpp
93+
template <> const inline int vector_len<new_type> = *;
94+
```
95+
* and add adequate template specialization for `gen_input()`:
96+
```cpp
97+
template <> newtype gen_input (double lo, double hi)
98+
{ your implementation }
99+
```
100+
<h3 id="benchmark">Note</h3>
101+
This tool can also be built as a standalone project.
102+
From `sleef/src/benchmarks` directory, run:
103+
```sh
104+
cmake -S . -B build -Dsleef_BINARY_DIR=<build_dir>
105+
cmake --build build -j
106+
./build/benchsleef128
107+
```
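
Reviewer sketch (not part of the commit): the Maintenance section asks for a `vector_len` specialization and a `gen_input()` specialization when a new type is introduced. `gen_input.hpp` is not shown in this diff, so the primary templates below are an assumed shape, and `demo_vec4f` is a hypothetical 4-lane type standing in for `new_type`. The random generator and its fixed seed are also choices made for this sketch only.

```cpp
#include <cassert>
#include <random>

// Hypothetical 4-lane type standing in for a new SIMD vector type.
struct demo_vec4f {
  float lane[4];
};

// Assumed shape of the primary templates (the real ones live in gen_input.hpp).
template <typename T> const inline int vector_len = 1;
template <typename T> T gen_input(double lo, double hi);

// Step 1: declare how many lanes the new type holds.
template <> const inline int vector_len<demo_vec4f> = 4;

// Scalar building block: one uniformly distributed value in [lo, hi).
template <> inline double gen_input<double>(double lo, double hi) {
  static std::mt19937 rng(42);  // fixed seed, chosen for reproducibility here
  std::uniform_real_distribution<double> dist(lo, hi);
  return dist(rng);
}

// Step 2: specialize gen_input() for the new type, filling every lane.
template <> inline demo_vec4f gen_input<demo_vec4f>(double lo, double hi) {
  demo_vec4f v{};
  for (int i = 0; i < vector_len<demo_vec4f>; ++i) {
    v.lane[i] = static_cast<float>(gen_input<double>(lo, hi));
  }
  return v;
}
```

A real SIMD type would fill its lanes with the matching load intrinsic instead of a plain array, but the two specializations would follow the same pattern.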
Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+// Copyright Naoki Shibata and contributors 2024.
+// Distributed under the Boost Software License, Version 1.0.
+// (See accompanying file LICENSE.txt or copy at
+// http://www.boost.org/LICENSE_1_0.txt)
+
+#pragma once
+#include "benchmark_templates.hpp"
+
+// Define macros that can be used to generate benchmark calls (defined in
+// benchmark_templates.hpp).
+// Example to generate benchmarks for 1ULP sin(x) for x between 0 and 6.28:
+// BENCH(Sleef_sin_u10, double, 0, 6.28);
+// BENCHMARK_CAPTURE is a symbol from the google bench framework
+// Note: type is only passed for name filtering reasons
+#define BENCH(funname, typefilter, min, max) \
+  BENCHMARK_CAPTURE(BM_Sleef_templated_function, #funname, funname, min, max) \
+  ->Name("MB_" #funname "_" #typefilter "_" #min "_" #max);
+
+#define BENCH_SINGLE_SCALAR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##f_##ulp, scalarf, min, max);
+#define BENCH_DOUBLE_SCALAR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##_##ulp, scalard, min, max);
+// Generate benchmarks for scalar function implementations
+#define BENCH_SCALAR(fun, ulp, min, max) \
+  BENCH_SINGLE_SCALAR(fun, ulp, min, max); \
+  BENCH_DOUBLE_SCALAR(fun, ulp, min, max);
+
+// Generate benchmarks for vector function implementations
+#ifdef ENABLE_VECTOR_BENCHMARKS
+#if !defined(ARCH_VECT_LEN) || ARCH_VECT_LEN == 128
+#define BENCH_SINGLE_VECTOR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##f4_##ulp, vectorf128, min, max);
+#define BENCH_DOUBLE_VECTOR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##d2_##ulp, vectord128, min, max);
+#elif ARCH_VECT_LEN == 256
+#define BENCH_SINGLE_VECTOR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##f8_##ulp, vectorf256, min, max);
+#define BENCH_DOUBLE_VECTOR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##d4_##ulp, vectord256, min, max);
+#elif ARCH_VECT_LEN == 512
+#define BENCH_SINGLE_VECTOR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##f16_##ulp, vectorf512, min, max);
+#define BENCH_DOUBLE_VECTOR(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##d8_##ulp, vectord512, min, max);
+#endif
+#define BENCH_VECTOR(fun, ulp, min, max) \
+  BENCH_SINGLE_VECTOR(fun, ulp, min, max); \
+  BENCH_DOUBLE_VECTOR(fun, ulp, min, max);
+#else
+#define BENCH_SINGLE_VECTOR(fun, ulp, min, max)
+#define BENCH_DOUBLE_VECTOR(fun, ulp, min, max)
+#define BENCH_VECTOR(fun, ulp, min, max)
+#endif
+
+// Generate benchmarks for SVE function implementations
+#ifdef ENABLE_SVECTOR_BENCHMARKS
+#define BENCH_SINGLE_SVE(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##fx_##ulp##sve, scalarf, min, max);
+#define BENCH_DOUBLE_SVE(fun, ulp, min, max) \
+  BENCH(Sleef_##fun##dx_##ulp##sve, scalard, min, max);
+#define BENCH_SVE(fun, ulp, min, max) \
+  BENCH_SINGLE_SVE(fun, ulp, min, max); \
+  BENCH_DOUBLE_SVE(fun, ulp, min, max);
+#else
+#define BENCH_SINGLE_SVE(fun, ulp, min, max)
+#define BENCH_DOUBLE_SVE(fun, ulp, min, max)
+#define BENCH_SVE(fun, ulp, min, max)
+#endif
+
+// Given a function implemented meeting a specific ulp
+// error (present in the name of the function),
+// BENCH_ALL_W_FIX_ULP macro will
+// generate benchmarks for
+// * all vector extensions supported
+// * all precisions
+// * all vector lengths
+#define BENCH_ALL_W_FIX_ULP(fun, ulp, min, max) \
+  BENCH_SCALAR(fun, ulp, min, max); \
+  BENCH_VECTOR(fun, ulp, min, max); \
+  BENCH_SVE(fun, ulp, min, max);
+#define BENCH_SINGLEP_W_FIX_ULP(fun, ulp, min, max) \
+  BENCH_SINGLE_SCALAR(fun, ulp, min, max); \
+  BENCH_SINGLE_VECTOR(fun, ulp, min, max); \
+  BENCH_SINGLE_SVE(fun, ulp, min, max);
+#define BENCH_DOUBLEP_W_FIX_ULP(fun, ulp, min, max) \
+  BENCH_DOUBLE_SCALAR(fun, ulp, min, max); \
+  BENCH_DOUBLE_VECTOR(fun, ulp, min, max); \
+  BENCH_DOUBLE_SVE(fun, ulp, min, max);
+
+#define BENCH_ALL_SINGLEP(fun, min, max) \
+  BENCH_SINGLEP_W_FIX_ULP(fun, u10, min, max); \
+  BENCH_SINGLEP_W_FIX_ULP(fun, u35, min, max);
+#define BENCH_ALL_DOUBLEP(fun, min, max) \
+  BENCH_DOUBLEP_W_FIX_ULP(fun, u10, min, max); \
+  BENCH_DOUBLEP_W_FIX_ULP(fun, u35, min, max);
+
+// Given a function, BENCH_ALL macro will
+// generate benchmarks for
+// * all ulp implementations available (u10 and u35)
+// * all vector extensions supported
+// * all precisions
+// * all vector lengths
+#define BENCH_ALL(fun, min, max) \
+  BENCH_ALL_W_FIX_ULP(fun, u10, min, max); \
+  BENCH_ALL_W_FIX_ULP(fun, u35, min, max);