@def author = "Misc"
@def date = "August 7, 2021"
@def title = "Benchmarks"
@def rss_pubdate = Date(2021,8,7)
@def rss = "Actuarial-related benchmarks."
\toc
## Life Modeling Problem

Inspired by a discussion in the ActuarialOpenSource GitHub community, folks started submitting solutions to what someone referred to as the "Life Modeling Problem": one user posted a short snippet as a representative problem for consideration.
After the original user submitted a proposal, others chimed in and submitted versions in their favorite languages. I have collected those versions, and run them on a consistent set of hardware.
Some submissions were excluded from the benchmarks because they involved an entirely different approach, such as memoizing the function calls[^1].
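To give a flavor of the kind of problem being benchmarked, here is a hedged sketch in Julia: project a term-life policy's cashflows (premiums in, death benefits out) over a fixed horizon, decrementing for mortality and lapse, and discount the result. All rates, amounts, and names (`npv_term_life`, `q`, `w`, `P`, `S`, `r`) are illustrative, not the actual snippet from the discussion.

```julia
# Hedged sketch of a "Life Modeling Problem"-style projection.
# q: mortality rates, w: lapse rates, P: annual premium,
# S: death benefit, r: discount rate. All values are illustrative.
function npv_term_life(q, w, P, S, r)
    inforce = 1.0   # fraction of policies still in force
    result = 0.0
    v = 1 / (1 + r) # one-period discount factor
    for (t, (qt, wt)) in enumerate(zip(q, w))
        # net cashflow for year t: premiums received less expected claims,
        # discounted back to time zero
        result += inforce * (P - S * qt) * v^t
        # decrement the in-force block for deaths and lapses
        inforce *= (1 - qt) * (1 - wt)
    end
    result
end

npv_term_life([0.001, 0.002, 0.003], [0.05, 0.07, 0.08], 100.0, 25_000.0, 0.02)
```

The real submissions differ in language and style, but share this shape: a tight loop over policy years with a couple of multiplications per step, which is why compiler and runtime overhead dominates the timings below.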
```julia:./code/getdata
#hideall
using CSV, DataFrames
using PrettyTables
file = download("https://raw.githubusercontent.com/JuliaActuary/Learn/master/Benchmarks/LifeModelingProblem/benchmarks.csv")
benchmarks = CSV.read(file, DataFrame)
benchmarks.relative_mean = benchmarks.mean ./ minimum(benchmarks.mean)
header = (["Language", "Algorithm", "Function name", "Median", "Mean", "Relative Mean"],
          ["", "", "", "[nanoseconds]", "[nanoseconds]", ""]);
pretty_table(benchmarks; header, formatters = ft_printf("%'.1d"))
```
\output{./code/getdata}
To aid in visualizing results with such vastly different orders of magnitude, this graph includes a physical length comparison to serve as a reference: each computation time is represented by the distance that light travels while the computation completes (comparing a nanosecond to a foot of length goes back at least to Admiral Grace Hopper). The graph plots the mean time, or the median where a mean is unavailable[^2].
```julia:./code/plot
#hideall
using Plots
using DataFrames

p = plot(palette = :seaborn_colorblind, rotation = 25, yaxis = :log)

# label equivalents to distance to make the log scale more relatable
scatter!(
    fill("\n equivalents (ns → ft)", 7),
    [1, 1e1, 1e2, 1e3, 0.8e4, 0.72e5, 3.3e6],
    series_annotations = Plots.text.(
        ["1 foot", "basketball hoop", "blue whale", "Eiffel Tower",
         "avg ocean depth", "marathon distance", "Space Station altitude"],
        :left, 8, :grey),
    marker = 0,
    label = "",
    left_margin = 20Plots.mm,
    bottom_margin = 20Plots.mm
)

# plot the mean, or the median if a mean is not available
for g in groupby(benchmarks, :algorithm)
    scatter!(p, g.lang,
        ifelse.(ismissing.(g.mean), g.median, g.mean),
        label = "$(g.algorithm[1])",
        ylabel = "Nanoseconds (log scale)",
        marker = (:circle, 5, 0.7, stroke(0)))
end
savefig(joinpath(@OUTPUT, "lmp_benchmarks.svg"))
```
\fig{lmp_benchmarks.svg}
For a more in-depth discussion of these results, see this post.
All of the benchmarked code can be found in the JuliaActuary Learn repository. Please file an issue or submit a PR there for issues/suggestions.
## IRR

Task: determine the IRR for a series of cashflows 701 elements long.
Times are in nanoseconds:
| Language | Package          | Function            | Median  | Mean          | Relative Mean |
|----------|------------------|---------------------|---------|---------------|---------------|
| Python   | numpy_financial  | `irr`               | missing | 5,339,167,688 | 332,824x      |
| Python   | better           | `irr_binary_search` | missing | 6,167,798     | 384x          |
| Python   | better           | `irr_newton`        | missing | 945,813       | 59x           |
| Julia    | ActuaryUtilities | `irr`               | 16,000  | 16,042        | 1x            |
The ActuaryUtilities implementation is over 300,000 times faster than `numpy_financial`, and 59 to 384 times faster than the `better` Python package. The ActuaryUtilities.jl implementation is also more flexible, as it can be given an argument with timepoints, similar to Excel's `XIRR`.
Excel was used to attempt a benchmark, but the `IRR` formula returned a `#DIV/0!` error.
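As a hedged illustration of the task itself (not any of the benchmarked implementations), a simple bisection-style IRR over a 701-element cashflow vector might be sketched like this; `irr_bisection` and its parameters are illustrative names, and the cashflow values are made up:

```julia
# Hypothetical sketch of the benchmark task: find the IRR of a long
# cashflow vector via bisection. This is NOT the ActuaryUtilities.jl
# implementation, just an illustrative baseline.
function irr_bisection(cashflows; lo = -0.5, hi = 1.0)
    # NPV at rate r, with the first cashflow occurring at time zero
    npv(r) = sum(cf / (1 + r)^(t - 1) for (t, cf) in enumerate(cashflows))
    # NPV is decreasing in r here, so bisect on its sign
    for _ in 1:200
        mid = (lo + hi) / 2
        npv(mid) > 0 ? (lo = mid) : (hi = mid)
    end
    (lo + hi) / 2
end

# 701-element cashflow vector: an initial outlay followed by level inflows
cfs = [-100.0; fill(0.25, 700)]
rate = irr_bisection(cfs)
```

Even this naive loop evaluates the 701-term NPV a couple hundred times, which is why root-finding strategy and per-iteration cost dominate the differences in the table above.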
All of the benchmarked code can be found in the JuliaActuary Learn repository. Please file an issue or submit a PR there for issues/suggestions.
## Black-Scholes

Task: calculate the price of a vanilla European call option using the Black-Scholes-Merton formula.
\begin{align}
C(S_t, t) &= N(d_1)S_t - N(d_2)Ke^{-r(T - t)} \\
d_1 &= \frac{1}{\sigma\sqrt{T - t}}\left[\ln\left(\frac{S_t}{K}\right) + \left(r + \frac{\sigma^2}{2}\right)(T - t)\right] \\
d_2 &= d_1 - \sigma\sqrt{T - t}
\end{align}
\end{align}
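The formula above translates almost line-for-line into Julia. The following is a hedged sketch, not the benchmarked code: `bsm_call` is an illustrative name, `τ` stands in for the time to expiry `T - t`, and the standard normal CDF comes from the Distributions.jl package.

```julia
using Distributions  # provides the standard normal CDF via cdf(Normal(), x)

# Illustrative Black-Scholes-Merton call price, following the formula above.
# S: spot, K: strike, r: risk-free rate, σ: volatility, τ: time to expiry (T - t)
function bsm_call(S, K, r, σ, τ)
    d1 = (log(S / K) + (r + σ^2 / 2) * τ) / (σ * sqrt(τ))
    d2 = d1 - σ * sqrt(τ)
    N(x) = cdf(Normal(), x)
    N(d1) * S - N(d2) * K * exp(-r * τ)
end

bsm_call(100.0, 100.0, 0.05, 0.2, 1.0)  # at-the-money, one year ≈ 10.45
```

The computation is a handful of `log`, `exp`, `sqrt`, and CDF evaluations, so the timings below largely measure per-call language overhead rather than algorithmic differences.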
Times are in nanoseconds:
| Language | Median  | Mean      | Relative Mean |
|----------|---------|-----------|---------------|
| Python   | missing | 817,000.0 | 19,926.0      |
| R        | 3,649.0 | 3,855.2   | 92.7          |
| Julia    | 41.0    | 41.6      | 1.0           |
Julia is nearly 20,000 times faster than Python, and two orders of magnitude faster than R.
## Other Benchmarks

These benchmarks have been performed by others, but provide relevant information for actuarial-related work.
## Colophon

### Code

All of the benchmarked code can be found in the JuliaActuary Learn repository. Please file an issue or submit a PR there for issues/suggestions.

### Hardware

MacBook Air (M1, 2020)

### Software

All languages/libraries are Mac M1 native unless otherwise noted.

**Julia:**
```
Julia Version 1.7.0-DEV.938
Commit 2b4c088ee7* (2021-04-16 20:37 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin20.3.0)
  CPU: Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, cyclone)
```
**Rust:**

```
1.61.0-nightly (f103b2969 2022-03-12)
```
**Python:**

```
Python 3.9.7
numba 0.54.1 py39hae1ba45_0
numpy 1.20.3 py39h4b4dc7a_0
```
**R:**

```
R Under development (unstable) (2021-04-16 r80179) -- "Unsuffered Consequences"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20.0 (64-bit)
```
[^1]: If benchmarking memoization, it's essentially benchmarking how long it takes to perform hashing in a language. While interesting, especially in the context of incremental computing, it's not the core issue at hand. Incremental computing libraries exist for all of the modern languages discussed here.
[^2]: Note that not all languages have both a mean and median result in their benchmarking libraries. Mean is a better representation for a garbage-collected modern language, because sometimes the computation just takes longer than the median result. Where the mean is not available in the graph, the median is substituted.