benchmarks.qmd

---
title: "Benchmarks"
toc: true

---

```{julia}
#| echo: false
#| output: false
using Pkg
Pkg.activate("env/benchmarks")
Pkg.instantiate()
```

## The Life Modeling Problem

Inspired by the discussion in the [ActuarialOpenSource](https://github.com/actuarialopensource) GitHub community discussion, folks started submitted solutions to what someone referred to as the "Life Modeling Problem". This user [submitted a short snippet](https://github.com/orgs/actuarialopensource/teams/common-room/discussions/5) for consideration of a representative problem.

### Benchmarks

After the original user submitted a proposal, others chimed in and submitted versions in their favorite languages. I have collected those versions, and run them on a consistent set of hardware.

Some submissions were excluded because from the benchmarks they involved an entirely different approach, such as [memoizing](https://en.wikipedia.org/wiki/Memoization) the function calls[^1].


```{julia}
#| echo: false
#| label: tbl-life-modeling-benchmark
#| tbl-cap: Benchmarks for the Life Modeling Problem in nanoseconds (lower times are better).
#| 
using CSV, DataFrames
using PrettyTables
file = download("https://raw.githubusercontent.com/JuliaActuary/Learn/master/Benchmarks/LifeModelingProblem/benchmarks.csv")
benchmarks = CSV.read(file, DataFrame)

benchmarks.relative_mean = benchmarks.mean ./ minimum(benchmarks.mean)
sort(benchmarks, :relative_mean)
```

To aid in visualizing results with such vast different orders of magnitude, this graph includes a physical length comparison to serve as a reference. The computation time is represented by the distance that light travels in the time for the computation to complete (comparing a nanosecond to one foot length [goes at least back to Admiral Grace Hopper](https://www.youtube.com/watch?v=9eyFDBPk4Yw)).

```{julia}
#| echo: false
#| label: fig-life-modeling-benchmark
using Plots
using DataFrames
p = plot(palette=:seaborn_colorblind, rotation=25, yaxis=:log)
# label equivalents to distance to make log scale more relatable
scatter!(
    fill("\n equivalents (ns → ft)", 7),
    [1, 1e1, 1e2, 1e3, .8e4, 0.72e5, 3.3e6],
    series_annotations=Plots.text.(["1 foot", "basketball hoop", "blue whale", "Eiffel Tower", "avg ocean depth", "marathon distance", "Space Station altitude"], :left, 8, :grey),
    marker=0,
    label="",
    left_margin=20Plots.mm,
    bottom_margin=20Plots.mm
)

# plot mean, or median if not available
for g in groupby(benchmarks, :algorithm)
    scatter!(p, g.lang,
        ifelse.(ismissing.(g.mean), g.median, g.mean),
        label="$(g.algorithm[1])",
        ylabel="Nanoseconds (log scale)",
        marker=(:circle, 5, 0.7, stroke(0)))
end
p
```

### Discussion

For more a more in-depth discussion of these results, see [this post](/posts/life-modeling-problem/).

All of the benchmarked code can be found in the [JuliaActuary Learn repository](https://github.com/JuliaActuary/Learn/tree/master/Benchmarks/LifeModelingProblem). Please file an issue or submit a PR request there for issues/suggestions.

## IRRs

**Task:** determine the IRR for a series of cashflows 701 elements long (e.g. monthly cashflows for 60 years).

### Benchmarks

:::{.column-body-outset}
```plaintext
Times are in nanoseconds:
┌──────────┬──────────────────┬───────────────────┬─────────┬─────────────┬───────────────┐
│ Language │          Package │          Function │  Median │        Mean │ Relative Mean │
├──────────┼──────────────────┼───────────────────┼─────────┼─────────────┼───────────────┤
│   Python │  numpy_financial │               irr │ missing │   519306422 │       123146x │
│   Python │           better │ irr_binary_search │ missing │     3045229 │          722x │
│   Python │           better │        irr_newton │ missing │      382166 │           91x │
│    Julia │ ActuaryUtilities │               irr │    4185 │        4217 │            1x │
└──────────┴──────────────────┴───────────────────┴─────────┴─────────────┴───────────────┘
```
:::

### Discussion

The ActuaryUtilities implementation is over 100,000 times faster than `numpy_financial`, and 91 to 722 times faster than the `better` Python package. The [ActuaryUtilities.jl](/#actuaryutilitiesjl) implementation is also more flexible, as it can be given an argument with timepoints, similar to Excel's `XIRR`.

Excel was used to attempt a benchmark, but the `IRR` formula returned a `#DIV/0!` error.

All of the benchmarked code can be found in the [JuliaActuary Learn repository](https://github.com/JuliaActuary/Learn/tree/master/Benchmarks/irr). Please file an issue or submit a PR request there for issues/suggestions.

## Black-Scholes-Merton European Option Pricing

**Task:** calculate the price of a vanilla european call option using the Black-Scholes-Merton formula.


\begin{align}
C(S_t, t) &= N(d_1)S_t - N(d_2)Ke^{-r(T - t)} \\

d_1 &= \frac{1}{\sigma\sqrt{T - t}}\left[\ln\left(\frac{S_t}{K}\right) + \left(r + \frac{\sigma^2}{2}\right)(T - t)\right] \\

d_2 &= d_1 - \sigma\sqrt{T - t}

\end{align}

### Benchmarks

```plaintext
Times are in nanoseconds:
┌──────────┬─────────┬─────────────┬───────────────┐
│ Language │  Median │        Mean │ Relative Mean │
├──────────┼─────────┼─────────────┼───────────────┤
│   Python │ missing │    817000.0 │       19926.0 │
│        R │  3649.0 │      3855.2 │          92.7 │
│    Julia │    41.0 │        41.6 │           1.0 │
└──────────┴─────────┴─────────────┴───────────────┘
```

### Discussion

Julia is nearly 20,000 times faster than Python, and two orders of magnitude faster than R.

## Other benchmarks

These benchmarks have been performed by others, but provide relevant information for actuarial-related work:

- [H2Oai DataFrames/Database-like Operations](https://h2oai.github.io/db-benchmark/)
- [Reading CSVs](https://juliacomputing.com/blog/2020/06/fast-csv/)


## Colophone

### Code

All of the benchmarked code can be found in the [JuliaActuary Learn repository](https://github.com/JuliaActuary/Learn/tree/master/Benchmarks/). Please file an issue or submit a PR request there for issues/suggestions.

## Footnotes

[^1] If benchmarking memoization, it's essentially benchmarking how long it takes to perform hashing in a language. While interesting, especially in the context of [incremental computing](https://scattered-thoughts.net/writing/an-opinionated-map-of-incremental-and-streaming-systems), it's not the core issue at hand. Incremental computing libraries exist for all of the modern languages discussed here.

[^2] Note that not all languages have both a mean and median result in their benchmarking libraries. Mean is a better representation for a garbage-collected modern language, because sometimes the computation just takes longer than the median result. Where the mean is not available in the graph below, median is substituted.