UCL · matt-graham · Feb 17, 2025 · Feb 17, 2025 · Feb 17, 2025 · Feb 17, 2025
diff --git a/.github/workflows/build-paper.yaml b/.github/workflows/build-paper.yaml
@@ -0,0 +1,24 @@
+name: Build draft paper PDF
+on:
+  push:
+    paths:
+      - joss/**
+      - .github/workflows/build-paper.yaml
+
+jobs:
+  paper:
+    runs-on: ubuntu-latest
+    name: Paper draft
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Build draft paper PDF
+        uses: openjournals/openjournals-draft-action@master
+        with:
+          journal: joss
+          paper-path: joss/paper.md
+      - name: Upload
+        uses: actions/upload-artifact@v4
+        with:
+          name: paper
+          path: joss/paper.pdf
diff --git a/joss/paper.md b/joss/paper.md
@@ -0,0 +1,146 @@
+---
+title: "rmcmc: Robust Markov chain Monte Carlo methods in R"
+tags:
+  - R
+  - Bayesian inference
+  - computational statistics
+  - Monte Carlo
+  - Markov chain Monte Carlo
+authors:
+  - given-names: Matthew M. 
+    surname: Graham
+    orcid: 0000-0001-9104-7960
+    affiliation: 1
+    corresponding: true
+  - given-names: Samuel
+    surname: Livingstone
+    orcid: 0000-0002-7277-086X
+    affiliation: 1
+affiliations:
+  - name: University College London
+    index: 1
+    ror: 02jx3x895
+date: 17 February 2025
+bibliography: references.bib
+---
+
+# Summary
+
+Generating samples from a probability distribution
+is a common requirement in many disciplines.
+In Bayesian inference, for example, the distribution of interest is the posterior over parameters of a model,
+given data assumed to be generated from the model.
+Expectations with respect to the posterior can be easily estimated using samples,
+allowing computation of posterior means, variances and other quantities of interest.
+_Markov chain Monte Carlo_ (MCMC) methods
+are a general class of algorithms for approximately sampling from probability distributions.
+
+`rmcmc` is an R package providing implementations of MCMC methods for sampling from distributions on $\mathbb{R}^d$ for $d \geq 1$.
+It provides a general purpose implementation of the Barker proposal [@livingstone2022barker],
+a gradient-based MCMC algorithm inspired by the Barker accept-reject rule [@barker1965monte].
+It also provides implementations of other MCMC algorithms based on random walk, Langevin and Hamiltonian dynamics,
+and a flexible interface for performing adaptive MCMC
+so that algorithmic tuning parameters can be learned in a bespoke manner [@haario2001adaptive;@andrieu2008tutorial].
+The key function provided by the package is `sample_chain`,
+which allows sampling a Markov chain with a user-specified stationary distribution.
+The chain is sampled by generating proposals
+and accepting or rejecting them using the Metropolis--Hastings [@metropolis1953equation;@hastings1970monte] algorithm.
+During an initial warm-up stage, the parameters of the proposal distribution can be adapted,
+with schemes available to both:
+tune the scale of the proposals by coercing the average acceptance rate to a target value;
+tune the shape of the proposals to match covariance estimates under the target distribution.
+
+# Statement of need
+
+For target distributions with smooth density functions,
+gradient-based MCMC methods can provide significantly improved sampling efficiency
+over simpler schemes such as _random-walk Metropolis_ (RWM),
+particularly in high-dimensional settings [@roberts1996exponential;@roberts1998optimal;@beskos2013optimal].
+For schemes such as _Metropolis adjusted Langevin algorithm_ (MALA) [@rossky1978brownian;@besag1994comments]
+and _Hamiltonian Monte Carlo_ (HMC)  [@duane1987hybrid;@neal2011mcmc],
+this improved efficiency can, however, come at the cost of decreased robustness to tuning of the algorithm's parameters,
+compared to non-gradient based methods such as RWM [@livingstone2022barker].
+The Barker proposal provides a middle road,
+offering similar efficiency in high-dimensional settings as other gradient-based methods such as MALA [@vogrinc2023optimal],
+while allowing easier adaptive tuning of the algorithm's parameters,
+and improved robustness to certain forms of irregularity, such as skewness,
+in the target distribution [@hird2020fresh].
+
+`rmcmc` fills a niche in the R statistical computing ecosystem,
+by providing general purpose implementations of the Barker proposal, as well as RWM, MALA and HMC,
+along with several schemes for adapting the proposal parameters.
+Significant flexibility is available in specifying the log density of the target distribution,
+and its gradient.
+Users can either directly define functions to compute the log density and its gradient,
+specify a formula for the log density which will then be symbolically differentiated using the
+`deriv` function in the base R `stats` package [@rcoreteam2024r],
+or define a model using Stan's modelling syntax
+via the R interface of BridgeStan [@carpenter2017stan;@roualdes2023bridgestan].
+The package has a modular design,
+which allow users to easily try out different algorithmic components and options,
+and to extend the package with new algorithms.
+This is exemplified in the Barker proposal implementation,
+which allows customizing the distribution over the auxiliary variables used in generating the proposal.
+As discussed in @vogrinc2023optimal and illustrated in a package vignette,
+this can significantly improvement sampling efficiency in some cases.
+`rmcmc` has a pure R codebase with minimal required dependencies,
+making it a lightweight addition to other projects.
+The package also interfaces with several others in addition to BridgeStan,
+for instance a wrapper around the _robust adaptive Metropolis_ (RAM) [@vihola2012robust]
+adaptation scheme in the `ramcmc` package is provided [@helske2021ramcmc],
+and sampled chain traces can be directly passed to functions
+for computing summary statistics and convergence diagnostics in the `posterior` or `coda` packages [@burkner2024posterior;@plummer2015coda].
+
+# Related software
+
+Several other R packages provide implementations of MCMC methods.
+`mcmc` [@geyer2023mcmc] provides a general purpose implementation of RWM,
+along with a 'morphometric' variant  [@johnson2012variable],
+which reparametrizes the target distribution to improve efficiency.
+`MCMCpack` [@martin2011mcmcmpack] focuses on providing customized MCMC methods
+for specific classes of statistical model that exploit the structure of the model,
+though it does also provide a general purpose RWM implementation.
+`fmcmc` [@yon2019fmcmc] is similar in design to `rmcmc`,
+providing a modular framework with a variety of pre-defined proposals such as Gaussian and uniform RWM,
+and adaptive methods such as RAM and adaptive Metropolis [@haario2001adaptive] methods.
+None of `mcmc`, `MCMCpack` or `fmcmc` provide gradient-based MCMC methods,
+which can significantly improve sampling efficiency, as discussed above.
+
+The GitHub repository `gzanella/barker` [@zanella2019barker]
+contains code to recreate the numerical experiments in @livingstone2022barker,
+and provides a basic implementation of the Barker proposal.
+The R scripts in the repository are not, however, structured into a package,
+complicating reuse in other projects,
+and the implementation only provides support for sampling from the proposal
+and evaluating its log density ratio.
+
+Stan and NIMBLE [@perry2017programming],
+with associated R interfaces `rstan` [@stan2024rstan] and `nimble` [@perry2024nimble],
+are _probabilistic programming languages_ (PPLs),
+domain specific languages for the specification of probabilistic models.
+Both Stan and NIMBLE also provide implementations of a variety algorithms
+to perform inference in models defined via their PPLs,
+and in particular both offer gradient-based MCMC methods.
+
+Stan's default MCMC implementation is a HMC method,
+which dynamically sets the trajectory lengths when simulating Hamiltonian dynamics to generate proposals [@hoffman2014no;@betancourt2017conceptual],
+and also includes schemes for adapting an algorithm's scale (step-size) and shape (metric) parameters, but with limited user-flexibility.
+Stan also does not currently provide an implementation of the Barker proposal.
+`rmcmc` does also offer a basic HMC implementation, but currently only supports HMC with static or randomized trajectory lengths.
+One of the proposal scale adaptation schemes implemented in `rmcmc`,
+is a dual-averaging algorithm [@nesterov2009primal;@hoffman2014no] matching the corresponding scheme in Stan,
+but in contrast to Stan alternative schemes are available in `rmcmc`.
+
+NIMBLE allows definitions of both models and statistical algorithms in its PPL,
+and provides implementations of a variety of MCMC methods including,
+RWM, HMC [@turek2024nimblehmc], and the Barker proposal.
+Compared to `rmcmc`, NIMBLE's Barker proposal implementation is tightly integrated into the broader NIMBLE framework,
+and so can only easily be applied to models specified in NIMBLE.
+NIMBLE's Barker proposal implementation provides support for adapting the proposal scale and shape parameters,
+however unlike `rmcmc` the adaptation scheme provided is fixed,
+without an ability to swap in alternative adaptation schemes.
+
+# Acknowledgements
+
+Development of `rmcmc` was supported by a UK Engineering and Physical Sciences Research Council
+New Investigator Award EP/V055380/1.