Set boost options in CMakeLists.txt to not promote doubles to long doubles. Promotion
incurs performance penalty on all platforms, and particularly aarch64.
dslarm added a commit to dslarm/salmon that referenced this issue on Oct 23, 2024
Due to a current default in the Boost library (boostorg/math#1211) in boost::math::digamma, there is a performance hit on aarch64.
This happens on v1.10.3 of Salmon, with GNU compiler 13 on Linux aarch64.
A 4-thread quantification of one of the Salmon tutorial's DRR0* series files spends ~15% of its time in this routine (called within CollapsedEMOptimizer). On a larger example, we see a 7% performance hit over a run that takes 1300 seconds on 4 cores. On x86 this time is small enough to be lost in the noise. The tutorial case was run with:
salmon quant -i athal_index -l A -1 DRR016125/DRR016125_1.fastq.gz -2 DRR016125/DRR016125_2.fastq.gz -p $threads --validateMappings -o quants/DRR016125_quant
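A plausible reason the cost shows up on aarch64 but not on x86 (an inference from the commit message about promoting doubles to long doubles, not something measured in this issue): on Linux aarch64 `long double` is a 128-bit IEEE quad type whose arithmetic is carried out in software, whereas on x86-64 Linux it is the 80-bit x87 extended type handled in hardware. A minimal sketch for checking the widths on a given build target follows; the names `double_bits`, `long_double_bits` and `print_widths` are illustrative only.

```cpp
#include <cassert>
#include <cstdio>
#include <limits>

// Significand (mantissa) width, in bits, of each type on the build target.
constexpr int double_bits = std::numeric_limits<double>::digits;
constexpr int long_double_bits = std::numeric_limits<long double>::digits;

// Prints both widths so the promotion target can be compared across platforms.
void print_widths() {
    // long double is never narrower than double.
    assert(long_double_bits >= double_bits);
    std::printf("double: %d bits, long double: %d bits\n",
                double_bits, long_double_bits);
}
```

Calling `print_widths()` from a small test program built with GCC would be expected to report 113 bits for `long double` on Linux aarch64 and 64 bits on x86-64 Linux.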
There is a simple fix: make sure the CMake/Makefile configuration compiles Salmon with:
-DBOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS
or add the following at the start of any file that brings in boost::math, before the Boost headers are included:
#define BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS
With that change, the larger test case drops from 1300 seconds to 1212, and the tutorial case from 48 seconds to 40, on a 4-core r8g.xlarge (Graviton4).
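As a quick check of these timings against the percentages quoted earlier, the relative saving can be worked out directly. This is only an arithmetic sketch; `percent_saved` and `report_savings` are hypothetical helper names, not part of Salmon or Boost.

```cpp
#include <cassert>
#include <cstdio>

// Relative wall-clock time saved, in percent, going from `before` to `after` seconds.
constexpr double percent_saved(double before, double after) {
    return 100.0 * (before - after) / before;
}

// Prints the savings for the two runs whose timings are quoted in the issue.
void report_savings() {
    assert(percent_saved(100.0, 100.0) == 0.0);  // no change means no saving
    std::printf("larger case:   %.1f%%\n", percent_saved(1300.0, 1212.0));  // 6.8%
    std::printf("tutorial case: %.1f%%\n", percent_saved(48.0, 40.0));      // 16.7%
}
```

The results, about 6.8% and 16.7%, are broadly in line with the 7% hit and the ~15% profile share reported above.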
Whilst Boost may fix the issue soon, older versions of the library are likely to remain installed for some time. It would therefore be helpful to add this define to the CMake settings or to the sources.