Dcm ngm primer #53
Nowhere close to done; just making it a draft right now so I can use PR tools.
Also, there will be many commits on this as I figure out how I want to write about this in a way that's hopefully clear and useful to others. I plan to squash as many commits as seems reasonable closer to when a draft is complete.
4abfdaf to 940b8dd
changing primer name and topic shift adding relationship to branching processes changing language, will squash later
language change language change v not m matrix to and from direction reorg material more careful language
commenting out some lines, new text
cleaning up use as model section other DFE cleaning up rescaling explanation add context about spectral radius commenting out different population sizes part for now unfinished sections named
940b8dd to ef458ff
… the actual NGM in the model
… definition removing unneeded commented lines removing separate in this repo section
ad16daa to 3bc5762
This feels like a fairly comprehensive review of NGMs. Fwiw, the formulation in the approach doc,
docs/primer.md
Outdated
$
\mathbf{x} =
\left(\begin{array}{cc}
I_H\\
I_L
\end{array}\right)
$,
I don't see this rendering properly, you can (a) use the `math` block or (b) do the following:
Suggested change:
$\mathbf{x} = \left(I_H \quad I_L\right)^{\text{t}}$,
The same happens with the equations after this one.
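For reference, option (a) would look roughly like the fenced `math` block below, which GitHub renders as display math. This is only a sketch of the syntax; it uses `pmatrix` in place of the original `array` environment for brevity, which is a choice made here rather than something taken from the primer.

```math
\mathbf{x} =
\begin{pmatrix}
I_H \\
I_L
\end{pmatrix}
```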
That's odd, how are you viewing this? I've been opening with preview in VSCode and I'm not running into any rendering issues. Let me know how so I can reproduce the issue, try your suggestion, and verify that it works. Otherwise, there might be too much back and forth on how to fix this. Thanks!
Never mind, I see it in GitHub's preview now.
I will point out that we have things set up to work with `mkdocs` so we don't have to be beholden to GitHub-flavored MD if we don't want to be.
I know we can do the math block option but I'm trying to see if there's a way to do the math inline, so bear with me for a little while I figure this out. Thanks! Not super familiar with `mkdocs` - if we went that route, would the math render properly when viewed in GitHub? I think that's the important bit here for online readability.
GitHub can only render GitHub-flavored MD, but `mkdocs` can do more. So it wouldn't render here, but it could render on a documentation website.
We do this in cladecombiner. With a few lines of configuration in mkdocs.yml and a few files in `.github/actions` and `.github/workflows`, everything in `/docs` gets built into a website alongside the API reference automatically (and checked on PRs).
I have gotten tired enough of GHFMD that I am starting to find this worth the (small) effort to get rolling and (small) inconvenience that not all the math renders directly on GH.
NGM models describe infectious disease dynamics as a demographic process in the sense that each consecutive generation produces new offspring infections. This can be a good approximation for dynamics early on when the population can be roughly described as fully susceptible. However, unlike ODE models, an NGM model does not account for the fixed size of a population and cannot model the depletion of susceptibles over time.

### Other conditions
Might want to mention population size -- that we expect these relationships to hold for large-ish populations that can be approximated with averages?
The NGM must be non-negative to guarantee that $R_0$ will be a single unique, positive real-valued eigenvalue of $\mathbf{R}$.
Suggested change:
Entries of the NGM must be non-negative to guarantee that $R_0$ will be a single unique, positive real-valued eigenvalue of $\mathbf{R}$. |
Pretty standard AFAIK that a non-negative matrix is one whose entries are non-negative.
But also, is that all that's required? My linear algebra is bad, but I thought that it was more complicated. (I don't know that this needs to be explained in-text, but I do think a reference/link would be nice for dummies like me.)
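On whether non-negativity is all that is required: for a non-negative square matrix the spectral radius is itself an eigenvalue (the Perron-Frobenius theorem), but it is only guaranteed to be a simple, non-repeated eigenvalue when the matrix is also irreducible, meaning every group can eventually infect every other group. A minimal sketch with made-up 2x2 matrices; the values and the helper name `dominant_eigenvalues` are illustrative assumptions, not taken from the primer:

```python
import numpy as np


def dominant_eigenvalues(M, tol=1e-9):
    """Return every eigenvalue of M whose modulus equals the spectral radius."""
    eigvals = np.linalg.eigvals(np.asarray(M, dtype=float))
    rho = np.max(np.abs(eigvals))
    return eigvals[np.abs(np.abs(eigvals) - rho) < tol]


# Irreducible (hypothetical values): each group can infect the other.
mixing = np.array([[2.0, 0.5],
                   [0.5, 1.0]])

# Reducible (hypothetical values): the two groups never infect each other.
isolated = np.array([[1.5, 0.0],
                     [0.0, 1.5]])

print(dominant_eigenvalues(mixing))    # a single dominant eigenvalue
print(dominant_eigenvalues(isolated))  # the dominant eigenvalue appears twice
```

Both matrices are non-negative, but only the first has a unique dominant eigenvalue, which suggests the text may want to mention irreducibility or link to a statement of the theorem.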
\end{array}\right)
$ for the state $I_L$ in the transmission matrix $\mathbf{T}$.

Then the NGM can be defined as $\mathbf{R}$ with elements $R_{ij} = \frac{\beta_{ij}N_i}{\gamma N}$. This is the formulation used for the NGM model in this repository.
Did we decide to generalize this NGM primer away from the widget? If not, it feels like there is a small disconnect between what they actually enter in the widget (which assumes population sizes have already been factored into the entries of the NGM) and this text. If we want to link this primer and the widget a bit more, we may want to show how vaccination gets factored into this (how proportion susceptible enters the calculation).
Suggested change:
Then the NGM can be defined as $\mathbf{R}$ with elements $R_{ij} = \frac{\beta_{ij}N_i}{\gamma N}$. This is the formulation used for the input NGM in the widget, noting the implicit assumption that the user has provided entries to the input NGM that factor in population sizes. Vaccination alters the proportion of susceptible individuals that may become infected in each group, thus the rows of the input NGM are multiplied by the remaining proportion susceptible after vaccination. |
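The row scaling described in this suggestion can be sketched as follows. The matrix entries and susceptible proportions are hypothetical numbers chosen for illustration, and `spectral_radius` is a helper defined here, not a function from the repository:

```python
import numpy as np

# Hypothetical input NGM: entry [i, j] is the expected number of infections in
# group i caused by one infected individual in group j, with population sizes
# assumed to be already factored in.
ngm = np.array([[3.0, 0.5],
                [0.2, 1.0]])

# Hypothetical proportion of each group still susceptible after vaccination.
prop_susceptible = np.array([0.4, 1.0])

# Scale row i (the infectee group) by the susceptible proportion in group i.
ngm_vax = prop_susceptible[:, np.newaxis] * ngm


def spectral_radius(M):
    """Largest eigenvalue modulus of M."""
    return np.max(np.abs(np.linalg.eigvals(M)))


r0 = spectral_radius(ngm)         # before vaccination
r_eff = spectral_radius(ngm_vax)  # after vaccination
print(ngm_vax, r0, r_eff)
```

Scaling rows rather than columns reflects that vaccination is modeled here as reducing who can be infected, not who can transmit.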
Great suggestion, can add a section that speaks to the vaccination application too. I thought that the approach doc was sort of doing this already - what's something more fleshed out that you would want to see in the primer doc beyond what's in the approach doc?
You're right about the overlap here and I'm now recalling that this NGM primer was supposed to be more of a standalone resource than the user guide and approach document and we had even discussed moving the NGM primer to the CFA recipes repo at one point? So maybe reject this suggestion in favor of keeping this primer very general?
This is helpful! There are a number of minor math formatting issues in GitHub as already noted.
I like the list of assumptions - can we make it more of a compact list rather than subsections?
Do you think it could be helpful to write some sort of glossary explaining the main terms used here (spectral radius, eigenvalues, eigenvectors, transition matrix, transmission matrix, DFE) and how they relate to disease models?
I think if the audience is Andy and me before this project, I could have used a more concrete example showing an example NGM itself, the way you get R0 and infection distribution (which means we need a bit about the eigenvector of the NGM) and the way you think about adding in vaccination by factoring rows by the proportion susceptible remaining. The example now is a bit confusing, since it references K&R but only in that it arrives at the NGM of beta/gamma * N_i / N. I guess the helpful thing from K&R for me was validating that we understood what exactly was the NGM, and corresponding R0 and infection distribution. Without numbers, I think the example now still could have been a bit mystifying for me a month ago.
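A concrete example of the kind requested here might look like the sketch below. The NGM entries are made-up numbers, not values from Keeling & Rohani or from the widget, and "infection distribution" is taken to mean the dominant right eigenvector normalized to sum to one:

```python
import numpy as np

# Hypothetical 2-group NGM (rows = infectee group, columns = infector group).
R = np.array([[1.5, 1.0],
              [0.25, 1.5]])

eigvals, eigvecs = np.linalg.eig(R)
k = np.argmax(np.abs(eigvals))  # index of the dominant eigenvalue
R0 = eigvals[k].real

# Dominant right eigenvector, rescaled so its entries sum to 1. The absolute
# value removes the arbitrary overall sign returned by the solver.
v = np.abs(eigvecs[:, k].real)
infection_distribution = v / v.sum()

print(R0)                      # 2.0 for these numbers
print(infection_distribution)  # two thirds in group 0, one third in group 1
```

For this matrix the eigenvalues are 2 and 1, so $R_0 = 2$, and the eigenvector for 2 is proportional to $(2, 1)$, giving the 2/3 and 1/3 split.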
I put most of my thoughts in #70
I like this. The more explicit "here's how to go from actual compartment sizes" version of the KR example is especially helpful to me.
Ironically this has made things clear enough for me to realize there are still some decent-sized gaps in my understanding that I think may confuse others as well (though it could just be me). I've tried to highlight those, most of which I think only need a few extra words' worth of explanation for me to really grok.
Other than that, I've highlighted a few places I think there may be small notational issues
@@ -0,0 +1,106 @@
# A Primer on Next Generation Matrix Models

A Next Generation Matrix model is a way to model the expected number of infections generated by a typical infected individual in different groups or categories of the population in consecutive generations. The Next Generation Matrix (here after referred to as the NGM) encodes this information. NGM models are an effective way to model average dynamics in a heterogeneous population during the early growth phase and in the limit of the disease-free equilibrium.
Suggested change:
A Next Generation Matrix model is a way to model the expected number of infections generated by a typical infected individual in different groups or categories of the population in consecutive generations. The Next Generation Matrix (hereafter referred to as the NGM) encodes this information. NGM models are an effective way to model average dynamics in a heterogeneous population during the early growth phase and in the limit of the disease-free equilibrium. |
An NGM model is related to the branching process concept of an offspring distribution generated by an individual. In this context, the NGM represents the expected value of the offspring distribution, or in this case, the distribution of infections caused in a group from a typical infectious individual in another group.
I follow the expected value bit (each column is the expected count of infections from one individual in one category into all groups), but I don't get "in this case the distribution of infections" bit. To me, the "distribution" would be normalized (sum to one).
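The distinction raised here, between expected counts and a normalized distribution, can be made concrete with a small sketch. The matrix is hypothetical and the variable names are chosen for illustration only:

```python
import numpy as np

# Hypothetical 2-group NGM (rows = infectee group, columns = infector group).
R = np.array([[1.5, 1.0],
              [0.25, 1.5]])

# Column 0: expected number of infections in each group caused by one infected
# individual in group 0. These are counts and need not sum to one.
expected_offspring = R[:, 0]

# Total expected infections caused by that individual across all groups.
total_offspring = expected_offspring.sum()

# Dividing by the total gives the share of those infections landing in each
# group, which does sum to one.
share_by_group = expected_offspring / total_offspring

print(expected_offspring, total_offspring, share_by_group)
```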
As a result, most modelers familiar with NGMs have experience with using them as an analytical tool rather than as a simulation tool. However, NGMs can also be used to approximately model the ODEs for the subsystem of infected states.

## Interpretation of matrix elements
Imagine we have an NGM, $\mathbf{R} = [R_{ij}]$. The elements $R_{ij}$ of this matrix can be interpreted as the average number of infections in group $i$ caused by an infected individual in group $j$ between consecutive generations in a fully susceptible population. As a rule of thumb, the matrix $\mathbf{R}$ is not symmetric; some groups may be more susceptible to infection or more transmissive resulting in an asymmetric \mathbf{R}.
I'm more used to seeing curly braces here (R = {R_ij})? But I can't seem to get GHFMD to render those, so maybe that's why they're square? Note that GHFMD can't render `mathbf` either.
I am realizing now that I am actually kind of confused about the relationship between this "the NGM is just R" approach and the "get an NGM from a compartmental model" approach.
The distinction between the two matrices below of small and large domain makes me think that we can only do this for models where there is only a single state at infection for each group. While that's true of the models that come to mind readily, is it guaranteed to be true of all compartmental models with multiple groups? A quick survey produces the models in figure 6 here and figure 2f here which make me think not?
Also: seems to assume only one compartment in downstream groups can cause new infections in other groups? Like, I'm having some trouble seeing how I would have to slice and dice the model to think this way about a multi-group model which distinguishes between symptomatic and asymptomatic cases.
@dinacmistry I think this is an interesting scope question. I think it's useful to have a document that explains what an NGM is, and shows that it's something that can be either asserted & analyzed OR derived from ODEs. @afmagee42's questions are valid, but I find it hard to imagine that a CFA project is going to include someone computing NGMs from a complex, multicompartment model.
Another way to say it is: my hypothesis is that the Diekmann et al. way of thinking is potentially useful for education/demonstration, but it's not the way CFA is actually going to produce or analyze any of its models. (This is where Dylan Morris would write "`<ducks>`")
I feel like there's a benefit of being able to understand it both ways. If one knows how an NGM comes out of a compartmental model, then it should be easier to see the limitations and assumptions that go into it from a perspective closer to that of a lot of epidemiologically-trained folks.
As it is right now I get it as a pure exponential growth branching process model, but if you asked me whether the behavior would be different between assuming new infections are immediately infectious or not, I couldn't tell you.
This is a good point: the "forward" way of thinking says "if I have a fixed NGM, then I'll necessarily get exponential growth" while the inference/analytical way of thinking says "I'll take the NGM that emerges from these equations, at a particular instant in time, and linearize around the DFE, to get an approximation of growth rate, knowing that the NGM will necessarily change in the next instant"
Only say that less confusingly
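The "forward" reading can be sketched numerically: applying a fixed NGM generation after generation gives geometric growth, and the per-generation growth factor settles at the spectral radius. The matrix and starting vector are hypothetical values chosen for illustration:

```python
import numpy as np

# Hypothetical fixed 2-group NGM (rows = infectee group, columns = infector group).
R = np.array([[1.5, 1.0],
              [0.25, 1.5]])

x = np.array([1.0, 0.0])  # generation 0: one infection in group 0

totals = []
for generation in range(15):
    totals.append(x.sum())  # total infections in this generation
    x = R @ x               # expected infections in the next generation

# Ratio of total infections between consecutive generations.
growth_ratios = np.array(totals[1:]) / np.array(totals[:-1])

R0 = np.max(np.abs(np.linalg.eigvals(R)))
print(growth_ratios[0], growth_ratios[-1], R0)
```

The first ratio is just the column sum for the seeding group, while later ratios approach $R_0$ as the mix of infections across groups converges to the dominant eigenvector. Because the matrix never changes, susceptible depletion is absent by construction, which is the limitation the analytical reading keeps in view.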
Imagine we have an NGM, $\mathbf{R} = [R_{ij}]$. The elements $R_{ij}$ of this matrix can be interpreted as the average number of infections in group $i$ caused by an infected individual in group $j$ between consecutive generations in a fully susceptible population. As a rule of thumb, the matrix $\mathbf{R}$ is not symmetric; some groups may be more susceptible to infection or more transmissive resulting in an asymmetric \mathbf{R}.
Suggested change:
Imagine we have an NGM, $\mathbf{R} = [R_{ij}]$. The elements $R_{ij}$ of this matrix can be interpreted as the average number of infections in group $i$ caused by an infected individual in group $j$ between consecutive generations in a fully susceptible population. As a rule of thumb, the matrix $\mathbf{R}$ is not symmetric; some groups may be more susceptible to infection or more transmissive resulting in an asymmetric $\mathbf{R}$. |
The NGM $\mathbf{K}$ is the restriction of $\mathbf{R_L}$ to the subset of states-at-infection. An auxiliary matrix $\mathbf{E}$ can be defined whose columns are unit vectors for each non-zero row of the matrix $\mathbf{T}$. The NGM can then be computed as $\mathbf{R} = -\mathbf{E}'\mathbf{T}\mathbf{\Sigma}^{-1}\mathbf{E}$, where $\mathbf{E}'$ is the transpose of $\mathbf{E}$. It can be shown that the spectral radius of $\mathbf{R_L}$ is equal to that of $\mathbf{R}$ and that this spectral radius is $R_0$.

In most cases, more intuitive approaches can be used to define the NGM, however the formal definition of $mathbf{R}$ has its advantages in being more rigorous and and helping modelers identify relevant information for estimating growth dynamics.
Suggested change:
In most cases, more intuitive approaches can be used to define the NGM; however, the formal definition of $\mathbf{R}$ has its advantages in being more rigorous and helping modelers identify relevant information for estimating growth dynamics.
Unlike the example in Keeling & Rohani, here we model the counts of the population in each state rather than the proportion. We are also modeling the effective rate of transmission between groups as split into two factors: a rate of transmission from group $j$ to group $i$, $\beta_{ij}$ and a rate of interaction based on the number of people in the population available for contact with infectious individuals, i.e. $\frac{S_i}{N}$. This follows from the frequency dependent assumption where effective contact structure that generates transmission is independent of population size (the interested reader can refer to Keeling & Rohani, 2008 pp 17-18 for more details).

At any given time, there is some fraction of the population that is susceptible in group $i$ and can be infected through interaction with an infected individual in group $j$. Then the average number of infections generated by this individual in group $i$ is $\frac{\beta_{ij}S_i}{N}$ per unit time. Assuming no collision of transmission events, $I_j$ infected individuals produce $\frac{\beta_{ij}S_i I_j}{N}$ infections per unit time.
"susceptible in group
Are we sure this is right? This sounds like it's talking about a susceptible generating infections in its own group
I_H\\
I_L
\end{array}\right)$,
$\mathbf{T} = [T_{ij}]$ with $T_{ij} = \frac{\beta_{ij}N_i}{N}$, and $\mathbf{\Sigma} = \gamma \mathbb{1}$
Shouldn't this be Sigma = -gamma? Or is this why the equation above has R = -T Sigma?
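Working the formula through under the negative sign suggests an answer. A sketch, assuming $\mathbf{E}$ is the identity (as in the two-group example) and writing $\mathbf{I}$ for the identity matrix, a symbol introduced here for clarity:

```math
\mathbf{\Sigma} = -\gamma \mathbf{I}
\;\Rightarrow\;
\mathbf{\Sigma}^{-1} = -\frac{1}{\gamma}\mathbf{I}
\;\Rightarrow\;
\mathbf{R} = -\mathbf{E}'\mathbf{T}\mathbf{\Sigma}^{-1}\mathbf{E} = \frac{1}{\gamma}\mathbf{T},
\qquad
R_{ij} = \frac{\beta_{ij} N_i}{\gamma N}
```

With $\mathbf{\Sigma} = +\gamma \mathbf{I}$ the same formula would give $\mathbf{R} = -\mathbf{T}/\gamma$, whose entries are negative, so the negative sign on $\mathbf{\Sigma}$ appears to be what makes the result match the stated $R_{ij}$.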
(Also I think that (sadly unrendered on GH) `\mathbb{1}` is intended to be a (similarly unrendered on GH) `\mathbf{1}`?)
I feel like we should perhaps call attention to the substitution here of
The reader has been warned it's coming (DFE and all that) but it's an important step.
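Assuming the substitution in question is the disease-free-equilibrium approximation $S_i \approx N_i$ (an assumption here, since the expression is not shown in the comment above), the step could be spelled out along these lines:

```math
\frac{\beta_{ij} S_i I_j}{N}
\;\approx\;
\frac{\beta_{ij} N_i}{N}\, I_j
= T_{ij}\, I_j
\quad \text{when } S_i \approx N_i
```

This is where the per-unit-time infection rate becomes linear in $I_j$, which is what allows it to be written as a matrix $\mathbf{T}$ acting on the vector of infected counts.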
For this system, the auxiliary matrix is
$\mathbf{E} =
\left(\begin{array}{cc}
1 & 0\\
0 & 1
\end{array}\right)$
I believe you, but I don't know how to work it out for myself
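One way to work it out is to build $\mathbf{E}$ mechanically from the rule quoted in the primer: one unit-vector column for each row of $\mathbf{T}$ that is not all zeros. The sketch below does that for a two-group case, where every row is non-zero and $\mathbf{E}$ is the identity, and for a hypothetical one-group model with an exposed state, where it is not. All rates and the function name `auxiliary_matrix` are illustrative assumptions, and $\mathbf{\Sigma}$ is assumed to follow the convention of negative diagonal entries for outflows:

```python
import numpy as np


def auxiliary_matrix(T):
    """Columns are unit vectors e_i, one per row i of T that is not all zeros."""
    T = np.asarray(T, dtype=float)
    nonzero_rows = np.flatnonzero(np.any(T != 0, axis=1))
    return np.eye(T.shape[0])[:, nonzero_rows]


# Two-group case (hypothetical values): every row of T is non-zero, so E = identity.
T_two_group = np.array([[0.3, 0.1],
                        [0.2, 0.4]])
E_two_group = auxiliary_matrix(T_two_group)

# Hypothetical one-group model with infected states ordered (exposed, infectious).
beta, sigma, gamma = 0.6, 0.5, 0.2
T = np.array([[0.0, beta],   # new infections enter the exposed state
              [0.0, 0.0]])   # nobody is infected directly into the infectious state
Sigma = np.array([[-sigma, 0.0],     # exposed individuals leave at rate sigma
                  [sigma, -gamma]])  # and become infectious, then leave at rate gamma

E = auxiliary_matrix(T)          # shape (2, 1): selects the exposed state only
R_L = -T @ np.linalg.inv(Sigma)  # large-domain matrix over all infected states
R = E.T @ R_L @ E                # restriction to the states-at-infection

rho_large = np.max(np.abs(np.linalg.eigvals(R_L)))
rho_small = np.max(np.abs(np.linalg.eigvals(R)))
print(E.ravel(), R, rho_large, rho_small)
```

In the second case the restricted matrix is the single number $\beta/\gamma$, and both matrices share that spectral radius, which is the equality the primer states without proof.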
## Formal definition
I feel that I am much closer to understanding what is going on with NGMs (as derived from compartmental models) than I was before, but I could use a little help getting the rest of the way.
- I think I handwavily see why the large domain formulation (modulo the sign) is what it is: we're tracking flow into and out of the compartments, and it's got that (rate into I)/(rate out of I) sort of form
- But we don't need to track everything, which is where K comes in
- I don't quite get why this is just the states at infection we care about, I'd think the states at infectiousness would also enter into the picture?
- There's a way to get this smaller matrix from the larger one by ignoring the unimportant bits, which is where E comes in, but I don't quite see in general how to formulate E. It's (# states at infection) x (# states), so we're doing some sort of mapping from the full $R_L$ down to just the states at infection as promised. But what mapping?
- For the linear-algebraically challenged, is there handwavy intuition for why the spectral radius is $R_0$?
@afmagee42 re: last bullet, this is what I tried to do in #70, to give some motivation there. (But there's a lot of material in #70; we might want some or none of it.)