Computation of binned statistical moments with CUDA, CuPy and CFFI.
Create a Gaussian distribution of particles each with a bin index:
python create_distribution.py
- this saves a .npz file to /inputs
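
As a rough illustration of this step, the sketch below draws Gaussian particle coordinates and assigns each particle a bin index. It is not the contents of create_distribution.py: the function name `make_distribution`, the array names `x`, `z` and `bin_index`, the choice to bin along `z`, and the output file name are all assumptions made for the example.

```python
import numpy as np

def make_distribution(n_particles, n_bins, seed=0):
    """Hypothetical sketch: Gaussian particles with one bin index each."""
    rng = np.random.default_rng(seed)
    # Two standard-normal coordinates per particle (assumed layout).
    x = rng.normal(0.0, 1.0, n_particles)
    z = rng.normal(0.0, 1.0, n_particles)
    # Equal-width bins spanning the full range of z.
    edges = np.linspace(z.min(), z.max(), n_bins + 1)
    # np.digitize returns 1-based indices; shift to 0-based and clip so
    # the particle sitting on the last edge lands in the last bin.
    bin_index = np.clip(np.digitize(z, edges) - 1, 0, n_bins - 1)
    return x, z, bin_index.astype(np.int32)

if __name__ == "__main__":
    x, z, bin_index = make_distribution(1_000_000, 100)
    # File name is a placeholder, not the one used by the repository.
    np.savez("distribution_sketch.npz", x=x, z=z, bin_index=bin_index)
```

Storing the bin index as a 32-bit integer keeps the array easy to pass to C or CUDA code, which is one reason a sketch like this might precompute it on the host.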
Run CPU benchmark using CFFI:
python c_cpu.py
CPU: AMD Ryzen Threadripper 2970WX 24-Core Processor
cffi v1.15.0
Run GPU benchmark using CuPy:
python c_gpu.py
GPU: NVIDIA TITAN V
cupy-cuda101 v9.6.0
- these save a .npz file with the slice moments in /outputs
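
For orientation, a binned moment calculation can be written in a few lines of NumPy. The sketch below is a plain reference version, not the CFFI or CUDA kernels used by the benchmarks, and it assumes the moments of interest are the per-bin count, mean and standard deviation; the function name `binned_moments` is hypothetical.

```python
import numpy as np

def binned_moments(x, bin_index, n_bins):
    """Hypothetical reference: per-bin count, mean and standard deviation."""
    # Number of particles in each bin.
    count = np.bincount(bin_index, minlength=n_bins)
    # Per-bin sums of x and x**2, accumulated with weighted bincount.
    sum_x = np.bincount(bin_index, weights=x, minlength=n_bins)
    sum_x2 = np.bincount(bin_index, weights=x * x, minlength=n_bins)
    # Avoid division by zero for empty bins (their moments come out as 0).
    safe = np.maximum(count, 1)
    mean = sum_x / safe
    # Population variance; clamp tiny negative values from round-off.
    var = np.maximum(sum_x2 / safe - mean**2, 0.0)
    return count, mean, np.sqrt(var)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=10_000)
    bins = rng.integers(0, 100, size=10_000)
    count, mean, std = binned_moments(x, bins, 100)
    print(count.sum(), mean[:3], std[:3])
```

A version like this is handy as a correctness baseline: it accumulates the same sums a GPU kernel would typically build with atomic additions or per-block reductions.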
Compare output moments:
python compare_moments.py
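
A comparison of this kind might look like the sketch below, which checks arrays with matching names from two result sets against a tolerance. This is an assumption about what such a check involves, not the contents of compare_moments.py; the function name, the tolerances and the file names in the usage section are placeholders.

```python
import numpy as np

def compare_moments(cpu, gpu, rtol=1e-6, atol=1e-9):
    """Hypothetical check: compare arrays stored under the same key.

    Returns {key: (agrees, max_abs_difference)} for every shared key.
    """
    report = {}
    for key in sorted(set(cpu.keys()) & set(gpu.keys())):
        a = np.asarray(cpu[key], dtype=float)
        b = np.asarray(gpu[key], dtype=float)
        if a.shape != b.shape:
            # Shapes differ, so the results cannot agree element-wise.
            report[key] = (False, float("inf"))
            continue
        diff = float(np.max(np.abs(a - b))) if a.size else 0.0
        report[key] = (bool(np.allclose(a, b, rtol=rtol, atol=atol)), diff)
    return report

if __name__ == "__main__":
    # Placeholder file names; the real outputs may be named differently.
    cpu = dict(np.load("cpu_moments.npz"))
    gpu = dict(np.load("gpu_moments.npz"))
    for key, (ok, diff) in compare_moments(cpu, gpu).items():
        print(f"{key}: {'OK' if ok else 'MISMATCH'} (max abs diff {diff:.3e})")
```

A tolerance-based comparison is used rather than exact equality because CPU and GPU reductions generally sum in a different order, so small floating-point differences are expected.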
The plot below shows the speedup achieved on the GPU compared to the single CPU execution, as a function of the block size and the number of particles in the distribution, for 100 bins.