Skip to content

Latest commit

 

History

History
74 lines (45 loc) · 6.1 KB

readme.md

File metadata and controls

74 lines (45 loc) · 6.1 KB

NZ Wastewater Modelling

Jointly estimating epidemiological dynamics of Covid-19 from case and wastewater data in Aotearoa New Zealand

Scripts and functions required to reproduce the results in the paper and/or apply the methods to new datasets. The majority of the computation is performed in Julia, although we use R to perform some pre-processing and for generating figures in the corresponding paper.

Model outputs

We provide some example scripts with minimal working examples to help you get started:

  • runFilterExample.jl: Runs the core particle filter at single parameter vector theta. This is a good place to start.
  • runPMMHExample.jl: A shortened example of how to set-up and run the particle marginal Metropolis-Hastings algorithm.

We also provide scripts that reproduce the results in the paper:

  • resultsPMMH.jl: Runs the particle marginal Metropolis-Hastings algorithm used to estimate static parameters.
  • resultsFullPosterior.jl: Reads in the parameter samples generated by resultsPMMH.jl and produces estimates of our epidemiological quantities after marginalising out the static parameters.
  • resultsCurvewiseEsts.jl: Produces curvewise estimates of quantities of interest. See Supplementary Material section 2.9 for details.
  • resultsOtherText.jl: Prints out some of the key results we include in the paper. It also prints additional results that did not make it into the final text.
  • resultsTestingShifts.jl: Reproduces the code in resultsFullPosterior.jl, except with applied shifts to various distributions of interest as sensitivity analysis. Results are presented in Supplementary Material section 3.3.

Sub-folders are organised as follows:

  • /src/: Contains Julia scripts that implement core methods
  • /R/: Contains R scripts for pre-processing data, examining MCMC outputs, and producing figures
  • /usefulscripts/: Contains a collection of scripts that may or may not be useful. Use these at your own risk!
  • /outputs/: Model outputs and results. See the readme in the folder for organisation.
  • /data/: Self-explanatory. The only script that writes to this folder is /R/processNZData.R.

Quick start guide

The runFilterExample.jl script should be ready to go as-is - just click play. You may need to run /usefulscripts/installPackages.jl to install any missing packages. See below for links to install Julia if neeeded.

If you want to test this model on your own data, we suggest ignoring all the MCMC stuff and diving straight into the much simpler and faster particle filter using the same script. Few changes should be necessary:

  • Write your own version of loadNZData.jl that loads your data and returns a DataFrame with columns: date, t, nCases, nGenomeCopies, nPopInCatchment, wwDataIsValid, and casesDataIsValid.
  • Specify a vector of fixed parameters (see below for some plausible initial values)
  • Run the runFilter.jl script on your data (edit lines 17 and 41-44)
  • Plot the results!

The parameter vector in the default model has length 4 with elements representing: [sigmaCAR, sigmaR, kc, kw].

Except in periods of rapid change (where you might need to increase some of the values for added flexibility), we expect reasonable starting values to be sigmaCAR = 0.01 and sigmaR = 0.05. The values of kc and kw are more data-dependent, but starting with kc = 50 and kw = 4x10^-6 feels appropriate.

If these values lead to degeneracy try loosening the observation distributions by decreasing kc and kw, or increase the flexibility of the hidden states by increasing sigmaCAR and sigmaR. If your estimates are jumping around a lot and the confidence intervals are too wide then decreasing the parameters can help smooth things out.

runPPMHExample.jl can then be used to fit the static parameters.

Some tips

If you are estimating very low or very high CARt try editing the opts["alpha"] parameter. This could vary substantially depending on your data, so a back-of-the-napkin calculation could inform your choice for this parameter.

Julia can be downloaded here: https://julialang.org/downloads/. VSCode is a popular IDE with good Julia support and can be downloaded here: https://code.visualstudio.com/Download. Finally the Julia extension for VSCode is useful and can be installed from the VSCode extensions manager.

Full workflow used in the paper

Step 0: Data pre-processing

Before running the core methods we use the /R/processNZData.R script to pre-process the reported cases and wastewater sampling data.

This script is fairly self-explanatory, and unless something has gone wrong, the outputs will already be saved in /data/.

Step 1: Parameter estimation

Before estimating our hidden states we need to generate samples from our posterior distribution on our parameters P(theta|data) using a particle marginal Metropolis-Hastings (PMMH) algorithm. This is the most computationally expensive step so we recommend running on multithreaded HPC where possible.

resultsPMMH.jl automatically separates our data into the pre-defined periods, sets the PMMH hyperparameters appropriately for each period, and saves the results to the /outputs/pmmh.nosync/ folder. This script uses the ArgParse package to allow for command line arguments. Lines 153-155 can be commented out and replaced with lines 148-150 in order to run the code from an IDE.

Step 2: Posterior estimation

Once we have our samples of the static parameters we can estimate the marginal distributions of our hidden states after effectively integrating out the static parameters. This is done by sampling from our posterior distribution on our static parameters and running a separate particle filter conditional on each parameter sample. This is handled by resultsFullPosterior.jl which saves the results to the /outputs/fullposterior/ folder.

Step 3: Other analysis

Plotting and other analysis is then handled by the remaining resultsX.jl scripts. The R code in /R/ is used to generate the figures and tables presented in the paper.