add episode on contact matrices #63
Open
amanda-minter wants to merge 31 commits into epiverse-trace:main from amanda-minter:contact-matrices
31 commits
53b99ef
add template for new episode
amanda-minter 3e986d1
add episode to appear first
amanda-minter 5babcbf
move some content between episodes
amanda-minter 464c9c7
add content on SIR versus age structure SIR
amanda-minter 10e3bba
add callout on how to normalise a matrix
amanda-minter 872de06
add intro section and change headings
amanda-minter 575089e
add section on socialmixr
amanda-minter fc398de
edit and move normalisation callout
amanda-minter 300fcac
add link to simulating transmission
amanda-minter 50f1bbe
edits to text
amanda-minter 2948012
add list of example analyses
amanda-minter 3b2580a
delete trailing whitespace
amanda-minter bc50ef5
Apply suggestions from code review
amanda-minter 0f8bf27
add callout on synthetic matrices
amanda-minter 79b6c06
add edits by @adamkucharski
amanda-minter e4cc034
add additional detail on normalisation
amanda-minter 0971559
lint file
amanda-minter fe1c29e
add text on contact matrix conversions
amanda-minter 46a351c
fix contact matrix notation
amanda-minter df6c1f6
minor edit to text
amanda-minter 6b220ff
update teaching times
amanda-minter ada119a
Apply suggestions from code review
amanda-minter 380dda9
update contact matrix notation
amanda-minter 9f2b819
update equations for `model_default` in relevant episodes
amanda-minter 1402f25
Update episodes/contact-matrices.Rmd
amanda-minter 2592254
Update contact-matrices.Rmd
amanda-minter 70a6051
add callout on splitting contact matrices using `{socialmixr}`
amanda-minter bf4d6bf
Update contact-matrices.Rmd
amanda-minter 3744e27
add callout to simulating transmission on normalisation
amanda-minter 62266cb
Update episodes/contact-matrices.Rmd
amanda-minter a252d00
add callout on notation
amanda-minter
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -59,11 +59,13 @@ contact: '[email protected]' | |
|
||
# Order of episodes in your lesson | ||
episodes: | ||
- contact-matrices.Rmd | ||
- simulating-transmission.Rmd | ||
- model-choices.Rmd | ||
- modelling-interventions.Rmd | ||
- compare-interventions.Rmd | ||
|
||
|
||
# Information for Learners | ||
learners: | ||
|
||
|
@@ -78,6 +80,6 @@ profiles: | |
# This space below is where custom yaml items (e.g. pinning | ||
# sandpaper and varnish versions) should live | ||
|
||
|
||
varnish: epiverse-trace/varnish@epiversetheme | ||
# this is carpentries/sandpaper#533 in our fork so we can keep it up to date with main | ||
sandpaper: epiverse-trace/sandpaper@patch-renv-github-bug |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,301 @@ | ||
--- | ||
title: 'Contact matrices' | ||
teaching: 40 | ||
exercises: 10 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::::::::::: questions | ||
|
||
- What is a contact matrix? | ||
- How are contact matrices estimated? | ||
- How are contact matrices used in epidemiological analysis? | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: objectives | ||
|
||
- Use the R package `socialmixr` to estimate a contact matrix | ||
- Understand the different types of analysis contact matrices can be used for | ||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
<!-- ::::::::::::::::::::::::::::::::::::: prereq --> | ||
|
||
<!-- ## Prerequisites --> | ||
|
||
|
||
|
||
<!-- ::::::::::::::::::::::::::::::::: --> | ||
|
||
|
||
## Introduction | ||
|
||
Some groups of individuals have more contacts than others; the average schoolchild has many more daily contacts than the average elderly person, for example. This heterogeneity of contact patterns between different groups can affect disease transmission, because certain groups are more likely to transmit to others within that group, as well as to other groups. The rate at which individuals within and between groups make contact with others can be summarised in a contact matrix. In this tutorial we are going to learn how contact matrices can be used in different analyses and how the `{socialmixr}` package can be used to estimate contact matrices. | ||
|
||
|
||
```{r,message=FALSE,warning=FALSE} | ||
library(socialmixr) | ||
``` | ||
|
||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor | ||
|
||
|
||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
## The contact matrix | ||
|
||
The basic contact matrix represents the amount of contact or mixing within and between different subgroups of a population. The subgroups are often age categories but can also be geographic areas or high/low risk groups. For example, a hypothetical contact matrix representing the average number of contacts per day between children and adults could be: | ||
|
||
$$ | ||
\begin{bmatrix} | ||
2 & 2\\ | ||
1 & 3 | ||
\end{bmatrix} | ||
$$ | ||
In this example, we would use this to represent that children meet, on average, 2 other children and 2 adults per day (first row), and adults meet, on average, 1 child and 3 other adults per day (second row). We can use this kind of information to account for the role heterogeneity in contact plays in infectious disease transmission. | ||
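
As a minimal sketch, the hypothetical matrix above could be written in base R as follows. The object name `example_matrix` and the group labels are illustrative assumptions and are not part of `{socialmixr}`.

```r
# hypothetical 2 x 2 contact matrix from the example above
# rows: group making the contacts; columns: group being contacted
example_matrix <- matrix(
  c(2, 2,
    1, 3),
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    contactor = c("child", "adult"),
    contactee = c("child", "adult")
  )
)

# average daily contacts a child makes with adults (first row, second column)
example_matrix["child", "adult"]

# total average daily contacts made by each group (row sums)
rowSums(example_matrix)
```

Reading by row keeps the interpretation in line with the text: each row describes the contacts made by one group.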
|
||
::::::::::::::::::::::::::::::::::::: callout | ||
|
||
### A Note on Notation | ||
For a contact matrix with rows $i$ and columns $j$, we call $C[i,j]$ the average number of contacts of group $i$ with group $j$, calculated as the number of contacts between the two groups averaged across all individuals in group $i$. | ||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
## Using `socialmixr` | ||
|
||
Contact matrices are commonly estimated from studies that use diaries to record interactions. For example, the POLYMOD survey measured contact patterns in 8 European countries using data on the location and duration of contacts reported by the study participants [(Mossong et al. 2008)](https://doi.org/10.1371/journal.pmed.0050074). | ||
|
||
The R package `{socialmixr}` contains functions which can estimate contact matrices from POLYMOD and other surveys. We can load the POLYMOD survey data: | ||
|
||
|
||
```{r polymod_, echo = TRUE} | ||
polymod <- socialmixr::polymod | ||
``` | ||
|
||
Then we can obtain the contact matrix for the age categories we want by specifying `age.limits`. | ||
|
||
```{r polymod_uk, echo = TRUE} | ||
contact_data <- contact_matrix( | ||
survey = polymod, | ||
countries = "United Kingdom", | ||
age.limits = c(0, 20, 40), | ||
symmetric = TRUE | ||
) | ||
contact_data | ||
``` | ||
|
||
|
||
|
||
**Note: although the contact matrix `contact_data$matrix` is not itself mathematically symmetric, it satisfies the condition that the total number of contacts of one group with another is the same as the reverse. In other words: | ||
`contact_data$matrix[j,i]*contact_data$demography$proportion[j] = contact_data$matrix[i,j]*contact_data$demography$proportion[i]`. | ||
For the mathematical explanation see [the corresponding section in the socialmixr documentation](https://epiforecasts.io/socialmixr/articles/socialmixr.html#symmetric-contact-matrices).** | ||
|
||
|
||
::::::::::::::::::::::::::::::::::::: callout | ||
### Why would a contact matrix be non-symmetric? | ||
|
||
One of the arguments we gave the function `contact_matrix()` is `symmetric=TRUE`. This ensures that the total number of contacts of age group 1 with age group 2 is the same as the total number of contacts of age group 2 with age group 1 (see the `socialmixr` [vignette](https://cran.r-project.org/web/packages/socialmixr/vignettes/socialmixr.html) for more detail). However, when contact matrices are estimated from surveys or other sources, the *reported* number of contacts may differ by age group, resulting in a non-symmetric contact matrix because of biases in recall or reporting across different groups and/or uncertainty from using a limited sample of participants [(Prem et al. 2021)](https://doi.org/10.1371/journal.pcbi.1009098). If `symmetric` is set to TRUE, the `contact_matrix()` function will internally use an average of reported contacts to ensure the resulting total numbers of contacts are symmetric. | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
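
The averaging idea can be sketched in base R. This is an illustration under stated assumptions, not the internal code of `contact_matrix()`: the matrix `reported` and the vector `population` are made-up values, and the averaging rule (mean of the two reported totals for each pair of groups) is assumed to follow the form described in the `socialmixr` vignette.

```r
# hypothetical reported contacts: rows are participants' groups,
# columns are contacts' groups (illustrative values, not survey data)
reported <- matrix(c(4, 3,
                     1, 5), nrow = 2, byrow = TRUE)
# hypothetical population sizes of the two groups
population <- c(1000, 2000)

# total contacts implied by each group's reports: C[i, j] * N[i]
# (the vector is recycled down the columns, so row i is multiplied by N[i])
total_reported <- reported * population

# average the two reports for each pair of groups,
# then convert back to contacts per individual in group i
total_symmetric <- (total_reported + t(total_reported)) / 2
symmetrised <- total_symmetric / population

# reciprocity check: C[i, j] * N[i] equals C[j, i] * N[j]
isTRUE(all.equal(symmetrised * population, t(symmetrised * population)))
```

The per-capita matrix `symmetrised` is still not mathematically symmetric; only the implied totals are.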
|
||
The example above uses the POLYMOD survey. There are a number of surveys available in `socialmixr`; to list the available surveys, use `list_surveys()`. To download a survey, we can use `get_survey()`: | ||
|
||
```{r, message = FALSE, warning = FALSE} | ||
zambia_sa_survey <- get_survey("https://doi.org/10.5281/zenodo.3874675") | ||
``` | ||
|
||
|
||
|
||
::::::::::::::::::::::::::::::::::::: challenge | ||
|
||
## Zambia contact matrix | ||
|
||
After downloading the survey, generate a symmetric contact matrix for Zambia using the following age bins: | ||
|
||
+ [0,20) | ||
+ 20+ | ||
|
||
:::::::::::::::::::::::: solution | ||
|
||
```{r zambia_matrix} | ||
contact_data_zambia <- contact_matrix( | ||
survey = zambia_sa_survey, | ||
age.limits = c(0, 20), | ||
symmetric = TRUE | ||
) | ||
contact_data_zambia | ||
``` | ||
::::::::::::::::::::::::::::::::: | ||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
|
||
::::::::::::::::::::::::::::::::::::: callout | ||
## Synthetic contact matrices | ||
|
||
Contact matrices can be estimated from data obtained from diaries (such as POLYMOD), surveys or other contact data; alternatively, synthetic contact matrices can be used. [Prem et al. 2021](https://doi.org/10.1371/journal.pcbi.1009098) used the POLYMOD data within a Bayesian hierarchical model to project contact matrices for 177 other countries. | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
|
||
|
||
|
||
## Analyses with contact matrices | ||
|
||
Contact matrices can be used in a wide range of epidemiological analyses. They can be used: | ||
|
||
+ to calculate the basic reproduction number while accounting for different rates of contacts between age groups [(Funk et al. 2019)](https://doi.org/10.1186/s12916-019-1413-7), | ||
+ to calculate final size of an epidemic, as in the R package `{finalsize}`, | ||
+ to assess the impact of interventions, by finding the relative change between pre- and post-intervention contact matrices to calculate the relative difference in $R_0$ [(Jarvis et al. 2020)](https://doi.org/10.1186/s12916-020-01597-8), | ||
+ and in mathematical models of transmission within a population, to account for group specific contact patterns. | ||
|
||
|
||
However, all of these applications require us to perform some additional calculations using the contact matrix. Specifically, there are two main calculations we often need to do: | ||
|
||
1. **Convert contact matrix into expected number of secondary cases** | ||
|
||
If contacts vary between groups, then the average number of secondary cases won't be equal simply to the average number of contacts multiplied by the probability of transmission-per-contact. This is because the average amount of transmission in each generation of infection isn't just a matter of whom a group came into contact with; it's about whom *their contacts* subsequently come into contact with. The function `r_eff` in the package `{finalsize}` can perform this conversion, taking a contact matrix, demography and proportion susceptible and converting it into an estimate of the average number of secondary cases generated by a typical infectious individual (i.e. the effective reproduction number). | ||
|
||
2. **Convert contact matrix into contact rates** | ||
|
||
Whereas a contact matrix gives the average number of contacts that one group makes with another, epidemic dynamics in different groups depend on the rate at which one group infects another. We therefore need to scale the rate of interaction between different groups (i.e. the number of contacts per unit time) to get the rate of transmission. However, we need to be careful that we are defining transmission to and from each group correctly in any model. Specifically, the entry $(i,j)$ in a mathematical model contact matrix represents contacts of group $i$ with group $j$. But if we want to know the rate at which group $i$ is getting infected, then we want to multiply the number of contacts of susceptibles in group $i$ ($S_i$) with group $j$ ($C[i,j]$) by the proportion of those contacts that are infectious ($I_j/N_j$), and by the transmission risk per contact ($\beta$). | ||
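
The multiplication described in the second calculation can be sketched in base R. All object names and values below are hypothetical and chosen only to illustrate the arithmetic; `contacts[i, j]` is assumed to hold the contacts of group $i$ with group $j$.

```r
# hypothetical inputs for two groups (all values are illustrative)
contacts <- matrix(c(2, 2,
                     1, 3), nrow = 2, byrow = TRUE) # C[i, j]: contacts of i with j
population <- c(100, 200)   # N_j
infectious <- c(10, 5)      # I_j
susceptible <- c(80, 190)   # S_i
beta <- 0.05                # transmission risk per contact

# proportion of each group that is infectious, I_j / N_j
prevalence <- infectious / population

# force of infection on group i: beta * sum_j C[i, j] * I_j / N_j
force_of_infection <- beta * as.vector(contacts %*% prevalence)

# rate of new infections in group i: S_i * force of infection on group i
new_infection_rate <- susceptible * force_of_infection
new_infection_rate
```

The matrix product sums over the columns $j$, so each element of `force_of_infection` belongs to a susceptible group $i$.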
|
||
### In mathematical models | ||
|
||
Consider the SIR model where individuals are categorized as either susceptible $S$, infectious $I$ or recovered $R$. The schematic below shows the processes which describe the flow of individuals between the disease states $S$, $I$ and $R$ and the key parameters for each process. | ||
|
||
```{r diagram, echo = FALSE, message = FALSE} | ||
DiagrammeR::grViz("digraph { | ||
|
||
# graph statement | ||
################# | ||
graph [layout = dot, | ||
rankdir = LR, | ||
overlap = true, | ||
fontsize = 10] | ||
|
||
# nodes | ||
####### | ||
node [shape = square, | ||
fixedsize = true | ||
width = 1.3] | ||
|
||
S | ||
I | ||
R | ||
|
||
# edges | ||
####### | ||
S -> I [label = ' infection \n(transmission rate β)'] | ||
I -> R [label = ' recovery \n(recovery rate γ)'] | ||
|
||
}") | ||
``` | ||
|
||
The [differential equations](../learners/reference.md#ordinary) below describe how individuals move from one state to another [(Bjørnstad et al. 2020)](https://doi.org/10.1038/s41592-020-0822-z). | ||
|
||
|
||
$$ | ||
\begin{aligned} | ||
\frac{dS}{dt} & = - \beta S I /N \\ | ||
\frac{dI}{dt} &= \beta S I /N - \gamma I \\ | ||
\frac{dR}{dt} &=\gamma I \\ | ||
\end{aligned} | ||
$$ | ||
To add age structure to our model, we need to add additional equations for the infection states $S$, $I$ and $R$ for each age group $i$. If we want to assume that there is heterogeneity in contacts between age groups then we must adapt the transmission term $\beta S I /N$ to include the contact matrix $C$ as follows: | ||
|
||
$$ \beta S_i \sum_j C_{i,j} I_j/N_j. $$ | ||
|
||
Susceptible individuals in age group $i$ become infected dependent on their rate of contact with individuals in each age group. For each disease state ($S$, $I$ and $R$) and age group ($i$), we have a differential equation describing the rate of change with respect to time. | ||
|
||
$$ | ||
\begin{aligned} | ||
\frac{dS_i}{dt} & = - \beta S_i \sum_j C_{i,j} I_j/N_j \\ | ||
\frac{dI_i}{dt} &= \beta S_i\sum_j C_{i,j} I_j/N_j - \gamma I_i \\ | ||
\frac{dR_i}{dt} &=\gamma I_i \\ | ||
\end{aligned} | ||
$$ | ||
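
To see how these equations behave, here is a minimal base R sketch that steps them forward with a simple Euler scheme. It is an illustration only: the contact matrix, population sizes, parameter values and step size are all hypothetical, and a dedicated ODE solver or the `{epidemics}` package would normally be used instead.

```r
# minimal Euler-step sketch of the age-structured SIR equations above
# (all parameter values are hypothetical and chosen for illustration)
contacts <- matrix(c(2, 2,
                     1, 3), nrow = 2, byrow = TRUE) # C[i, j]
population <- c(1000, 2000)  # N_i
beta <- 0.05                 # transmission risk per contact
gamma <- 1 / 7               # recovery rate per day
dt <- 0.1                    # time step in days
n_steps <- 1000              # 100 days in total

# initial conditions: one infectious individual in the first group
susceptible <- population - c(1, 0)
infectious <- c(1, 0)
recovered <- c(0, 0)

for (step in seq_len(n_steps)) {
  # force of infection on each group: beta * sum_j C[i, j] * I_j / N_j
  foi <- beta * as.vector(contacts %*% (infectious / population))
  new_infections <- susceptible * foi * dt
  new_recoveries <- gamma * infectious * dt
  susceptible <- susceptible - new_infections
  infectious <- infectious + new_infections - new_recoveries
  recovered <- recovered + new_recoveries
}

# final state per group (columns are the two groups)
rbind(S = susceptible, I = infectious, R = recovered)
```

Because every individual leaving one compartment enters another, the three compartments in each group always sum to that group's population size.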
|
||
|
||
### Normalising the contact matrix to ensure the correct value of $R_0$ | ||
|
||
When simulating an epidemic, we often want to ensure that the average number of secondary cases generated by a typical infectious individual (i.e. $R_0$) is consistent with known values for the pathogen we're analysing. In the above model, we scale the contact matrix by $\beta$ to convert the raw interaction data into a transmission rate. But how do we define the value of $\beta$ to ensure a certain value of $R_0$? | ||
|
||
Rather than just using the raw number of contacts, we can instead normalise the contact matrix to make it easier to work in terms of $R_0$. In particular, we normalise the matrix by scaling it so that if we were to calculate the average number of secondary cases based on this normalised matrix, the result would be 1 (in mathematical terms, we are scaling the matrix so the largest eigenvalue is 1). This transformation scales the entries but preserves their relative values. | ||
|
||
In the case of the above model, we want to define $\beta C_{i,j}$ so that the model has a specified value of $R_0$. The entry of the contact matrix $C[i,j]$ represents the contacts between populations $i$ and $j$, which is equivalent to `contact_data$matrix[j,i]`, so the first step is to transpose the contact data matrix (`contact_data$matrix`) so the row/column entries are now in the form $C[i,j]$. Then we normalise the matrix $C$ so the maximum eigenvalue is one and call this matrix $C_{normalised}$. Because the rate of recovery is $\gamma$, individuals will be infectious on average for $1/\gamma$ days. So $\beta$ is calculated from the scaling factor and the value of $\gamma$ (i.e. mathematically, $\beta$ is chosen so that $(\beta / \gamma) \times C = R_0 \times C_{normalised}$, a matrix whose dominant eigenvalue is $R_0$). | ||
|
||
```{r} | ||
contact_matrix <- t(contact_data$matrix) | ||
scaling_factor <- 1 / max(eigen(contact_matrix)$values) | ||
normalised_matrix <- contact_matrix * scaling_factor | ||
``` | ||
|
||
As a result, if we multiply the scaled matrix by $R_0$, then converting to the number of expected secondary cases would give us $R_0$, as required. | ||
|
||
|
||
```{r} | ||
infectious_period <- 7.0 | ||
basic_reproduction <- 1.46 | ||
transmission_rate <- basic_reproduction * scaling_factor / infectious_period | ||
# check the dominant eigenvalue of R0 x C_normalised is R0 | ||
max(eigen(basic_reproduction * normalised_matrix)$values) | ||
``` | ||
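
The same normalisation can be checked without survey data. The base R sketch below uses a hypothetical 2 x 2 matrix; the names `example_contacts`, `target_r0` and `dominant` are illustrative assumptions, and `Re()` is applied only as a precaution in case `eigen()` returns complex values for a non-symmetric matrix.

```r
# hypothetical contact matrix; values are illustrative, not survey estimates
example_contacts <- matrix(c(2, 2,
                             1, 3), nrow = 2, byrow = TRUE)
gamma <- 1 / 7          # recovery rate (infectious period of 7 days)
target_r0 <- 1.46       # desired basic reproduction number

# dominant eigenvalue of the raw matrix, and the matrix rescaled to eigenvalue 1
dominant <- max(Re(eigen(example_contacts)$values))
example_normalised <- example_contacts / dominant

# beta chosen so that (beta / gamma) * C has dominant eigenvalue target_r0
beta <- target_r0 * gamma / dominant

# both expressions should return target_r0
max(Re(eigen(target_r0 * example_normalised)$values))
max(Re(eigen((beta / gamma) * example_contacts)$values))

# a matrix and its transpose share eigenvalues, so this step
# gives the same scaling factor in either orientation
max(Re(eigen(t(example_contacts))$values))
```

Only the eigenvalue step is indifferent to orientation; the orientation still matters when the matrix enters the force of infection.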
|
||
|
||
::::::::::::::::::::::::::::::::::::: callout | ||
### Normalisation using `socialmixr` | ||
|
||
Normalisation can be performed by the function `contact_matrix()` in `{socialmixr}`. To obtain the normalised matrix we must specify that we want to split out the different components of the contact matrix using the argument `split = TRUE`. Then we can obtain the normalised matrix as follows: | ||
|
||
```{r, message = FALSE} | ||
contact_data_split <- contact_matrix( | ||
survey = polymod, | ||
countries = "United Kingdom", | ||
age.limits = c(0, 20, 40), | ||
symmetric = TRUE, | ||
split = TRUE | ||
) | ||
|
||
# extract components of the contact matrix | ||
contacts_d <- contact_data_split$contacts | ||
matrix_a <- contact_data_split$matrix | ||
demography_n <- contact_data_split$demography$proportion | ||
|
||
# calculate normalised matrix | ||
normalised_matrix_split <- contacts_d * matrix_a * demography_n | ||
``` | ||
|
||
|
||
For details of the different components of the contact matrix see [the package vignette on splitting contact matrices.](https://cran.r-project.org/web/packages/socialmixr/vignettes/socialmixr.html#splitting-contact-matrices) | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
|
||
::::::::::::::::::::::::::::::::::::: callout | ||
### Check the dimension of $\beta$ | ||
|
||
In the SIR model without age structure the rate of contact is part of the transmission rate $\beta$, whereas in the age-structured model we have separated out the rate of contact, hence the transmission rate $\beta$ in the age-structured model will have a different value. | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
We can use contact matrices from `socialmixr` with mathematical models in the R package `{epidemics}`. See the tutorial [Simulating transmission](../episodes/simulating-transmission.md) for examples and an introduction to `epidemics`. | ||
|
||
|
||
### Contact groups | ||
|
||
In the example above the dimension of the contact matrix will be the same as the number of age groups i.e. if there are 3 age groups then the contact matrix will have 3 rows and 3 columns. Contact matrices can be used for other groups as long as the dimension of the matrix matches the number of groups. | ||
|
||
For example, we might have a metapopulation model with two geographic areas. Then our contact matrix would be a 2 x 2 matrix with entries representing the contact between and within the geographic areas. | ||
|
||
|
||
|
||
## Summary | ||
|
||
In this tutorial, we have learnt the definition of the contact matrix, how contact matrices are estimated and how to access social contact data from `socialmixr`. In the next tutorial, we will learn how to use the R package `{epidemics}` to generate disease trajectories from mathematical models with contact matrices from `socialmixr`. | ||
|
||
::::::::::::::::::::::::::::::::::::: keypoints | ||
|
||
- `socialmixr` can be used to estimate contact matrices from survey data | ||
- Contact matrices can be used in different types of analyses | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: |
PS: I think this is the wrong way around - C[i, j] in socialmixr is as in the equations above I think (but please correct me if I'm wrong).
If it's the same then why do we need to transpose the matrix?
Always find $i$ and $j$ a potential headache (which is why this will be so useful to have written down!)
Taking a step back, we just need the FOI to be defined sensibly, i.e. $\sum_j C_{i,j} I_j/N_j$. So $C_{i,j}$ should be contacts that group $j$ (the infectious ones) make with group $i$ (the susceptible ones) - this is from equation A3 in Wallinga et al (2006)
The `contact_matrix()` function gives the following structure:
So $C[i,j]$ is contacts made by group $i$ with group $j$? Which I think would mean it needs transposing?
For completeness (and just to remind myself), {epidemics} has this internal processing in `.prepare_population()`, which normalises by dominant eigenvalue and scales based on $w_{tot}/w_i$ (to use the Wallinga et al notation):
Shouldn't this be the other way round, i.e. $C_{ij}$ here is the average number of contacts with group $j$ that a susceptible in group $i$ has (then multiplied with the probability that the contact is infectious, i.e. $I_j/N_j$)?
Here's an example:
There are two groups, $i$ and $j$. There is 1 person in group $i$ ($N_i=1$) and 100 people in group $j$ ($N_j = 100$). Person $i$ meets all 100 people in group $j$ every day: according to socialmixr notation that means $C_{ij}=100$ and $C_{ji}=1$ (on the scale of days). The total number of contacts between the two groups per day is $C_{ij} N_i = C_{ji} N_j = 100$.
Sticking with this notation:
The FOI on person $i$ is proportional to 100 * (proportion of $j$ that is ill), or $C_{ij} I_j / N_j$
The FOI on people in group $j$ is proportional to 1 * (1 if $i$ is ill, 0 otherwise), or $C_{ji} I_i / N_i$
Thanks, that makes sense. So it basically comes down to whether the $N_j$ (or symmetry-derived equivalent) is wrapped into the $\beta_{ij}$ term. If defined separately, i.e. $\sum_j C_{i,j} I_j/N_j$ as above, then as you say, contact rate should be defined from-S-to-I.
Not sure if this is useful for the purpose of teaching but I find it easier sometimes to work from the symmetric encounter matrix, i.e. the number of encounters between group $i$ and group $j$ per unit time. If we call this $E_{i,j}$ then it is symmetric, $E_{i,j}= E_{j,i}$, and so is the term in the force of infection which is proportional to $\frac{E_{ij}I_j}{N_iN_j}$. This highlights that the row vs. column notation is purely about how the matrix is normalised, i.e. whether we write it as $\frac{E_{ij}}{N_i}$ or $\frac{E_{ij}}{N_j}$ (which thus determines which of the $N$ terms remains in the force of infection) and not about contacts from/to etc.
I've gone through the {epidemics} and {finalsize} examples a bit more, thinking about how these steps are introduced in vignettes. The challenge is that these packages allow users to specify in terms of $R_0$ (which is useful), but that means normalising the matrix, then converting back into the correct form for the contact rate you describe above @sbfnk. This is where I think the transpose comes in, so that it's switching between $C_{ij}=R_{ij}$ for the eigenvalue normalisation and $C_{ij}/N_j$ (i.e. contact rate per capita) for the model.
But because the result is the symmetric per capita matrix that goes into the model, the end result is equivalent:
So in terms of implications for this episode, the main thing is just to make sure that the notation for the matrix in the model is in terms of contacts susceptibles make with infectives?
Bringing it back to those two key distinctions (which we could tweak to mention, given the above discussion!):
Convert contact matrix into expected number of secondary cases: the R calculation involves an infective meeting susceptibles
Convert contact matrix into contact rates: the epidemic model involves a susceptible meeting infectives
Also tagging @rozeggo and @BlackEdder for info.
Do we actually need to distinguish between the two? Given that the eigenvalues will be invariant under transpose (1) could be done in either orientation (as could (2) if the index of the normalising population size is swapped).
There's probably a simpler framing we could use, but I think we need to explain the two steps (eigenvalue calc, then normalisation by demography over correct matrix dimension), otherwise it could lead people to assume the following is correct?
Perhaps the below is clearest, because doesn't involve any explicit transposes or normalisation? Just need to explain intuitively what the two matrices represent?