-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #127 #152
Fixes #127 #152
Conversation
bahadzie
commented
Jan 15, 2024
•
edited
Loading
edited
- Fixes issue Calculate expected vaccinations and check model correctness #125
- Fixes issue Add function to calculate timing and size of epidemic peak #127
Hi @bahadzie, rather than removing the files in More info in the dedicated testthat vignette: https://testthat.r-lib.org/articles/snapshotting.html |
No, this was just a drive by comment based on the check results. I have not reviewed the code. I'll leave it up to your expertise. |
Okay thanks, we have a P3 meeting scheduled later so will discuss next steps for this PR then. Edit add: of course we'll restore the snapshots. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bahadzie, thanks for making this PR and taking a stab at this issue.
This function works mostly alright for the case of a single demographic group in a model where time increments by 1.0, but would not find the correct epidemic peak time for time increments other than 1.0 (I think), nor for multiple demographic groups. This is because of how timing <- which(wide %in% peak)
is calculated - you're getting the index, not the time value. I've attached an example from the vignettes below - the figure shows all three peaks at about 370, but the function calculates peaks of 367, 972, and 1576. I've made some suggestions for edits to the function which should make it more readable and perform as expected.
- This function can be collapsed, for its simplest case in a single-peak epidemic, by using convenient {data.table} syntax from the suggestion.
- You should also add tests for this function. There should be tests for
- the output type, dimensions, and predictable column contents (e.g. demography group names) and types (timing and value should be numerics);
- time and peak value limits (i.e., 0 < peak time < max time;
n_initially_infected
$\leq$ peak value < max population size; etc.); - predictable cases (e.g., a case where
infectiousness = 0.0
, so the 'peak' is at$t = 0$ ; a case where an intervention is applied to one demographic group whose peak is then later than other groups; etc.). - You should revert commit 9a349b2, as these snapshots are important and must be retained.
library(epidemics)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
# load contact and population data from socialmixr::polymod
polymod <- socialmixr::polymod
contact_data <- socialmixr::contact_matrix(
polymod,
countries = "United Kingdom",
age.limits = c(0, 20, 40),
symmetric = TRUE
)
#> Using POLYMOD social contact data. To cite this in a publication, use the 'cite' function
#> Removing participants that have contacts without age information. To change this behaviour, set the 'missing.contact.age' option
# prepare contact matrix
contact_matrix <- t(contact_data$matrix)
# prepare the demography vector
demography_vector <- contact_data$demography$population
names(demography_vector) <- rownames(contact_matrix)
# view contact matrix and demography
contact_matrix
#>
#> contact.age.group [,1] [,2] [,3]
#> [0,20) 7.883663 2.794154 1.565665
#> [20,40) 3.120220 4.854839 2.624868
#> 40+ 3.063895 4.599893 5.005571
demography_vector
#> [0,20) [20,40) 40+
#> 14799290 16526302 28961159
# initial conditions
initial_i <- 1e-6
initial_conditions <- c(
S = 1 - initial_i, E = 0, I = initial_i, R = 0, V = 0
)
# build for all age groups
initial_conditions <- rbind(
initial_conditions,
initial_conditions,
initial_conditions
)
# assign rownames for clarity
rownames(initial_conditions) <- rownames(contact_matrix)
# view initial conditions
initial_conditions
#> S E I R V
#> [0,20) 0.999999 0 1e-06 0 0
#> [20,40) 0.999999 0 1e-06 0 0
#> 40+ 0.999999 0 1e-06 0 0
uk_population <- population(
name = "UK",
contact_matrix = contact_matrix,
demography_vector = demography_vector,
initial_conditions = initial_conditions
)
uk_population
#> <population> object
#>
#> Population name:
#> "UK"
#>
#> Demography
#> [0,20): 14,799,290 (20%)
#> [20,40): 16,526,302 (30%)
#> 40+: 28,961,159 (50%)
#>
#> Contact matrix
#>
#> contact.age.group [0,20) [20,40) 40+
#> [0,20) 7.883663 2.794154 1.565665
#> [20,40) 3.120220 4.854839 2.624868
#> 40+ 3.063895 4.599893 5.005571
# run epidemic simulation with no vaccination or intervention
data <- model_default_cpp(
population = uk_population,
time_end = 600, increment = 1.0
)
# plot figure of epidemic curve
filter(data, compartment %in% c("exposed", "infectious")) %>%
ggplot(
aes(
x = time,
y = value,
col = demography_group,
linetype = compartment
)
) +
geom_line() +
labs(
x = "Simulation time (days)",
linetype = "Compartment",
y = "Individuals"
)
# get the timing and peak of the infectious compartment
epidemic_peak(data)
#> timing peak
#> [0,20) 367 384139.9
#> [20,40) 972 357215.9
#> 40+ 1576 474553.3
Created on 2024-01-22 with reprex v2.0.2
R/helpers.R
Outdated
|
||
#' Get the timing and size of a compartment's peak | ||
#' | ||
#' Get the timing and size of a compartment's peak for all demographic groups. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Please update this to clarify that the peak found is the highest peak. For example, in an epidemic with two peaks such as this example of seasonality in infectiousness, the function would find the second peak.
- Please add a Details section laying out the use case, and the important caveats; e.g., best used for a single peak epidemic etc.
R/helpers.R
Outdated
#' | ||
#' @param data A `<data.frame>` of model output, typically | ||
#' the output of [model_default_cpp()] or similar functions. | ||
#' @param compartment The compartment of interest ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#' @param compartment The compartment of interest ; | |
#' @param compartment A character vector for the compartments of interest. |
Making this a character vector allows you to find peaks in multiple compartments at once, e.g. hospitalised and deaths.
R/helpers.R
Outdated
#' @param data A `<data.frame>` of model output, typically | ||
#' the output of [model_default_cpp()] or similar functions. | ||
#' @param compartment The compartment of interest ; | ||
#' @return A `<data.frame>` of the timing and peak of the selected compartment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please mention the column names in this data.frame.
R/helpers.R
Outdated
#' | ||
#' # run epidemic simulation with no vaccination or intervention | ||
#' data <- model_default_cpp( | ||
#' population = uk_population |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#' population = uk_population | |
#' population = uk_population, time_end = 600 |
Increasing the time_end allows the true peak to be found, as the default value time_end = 100
is too low for the epidemic to peak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
R/helpers.R
Outdated
colnames(data), | ||
must.include = c("time", "demography_group", "compartment", "value") | ||
) | ||
checkmate::assert_character(compartment, len = 1L) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checkmate::assert_character(compartment, len = 1L) | |
checkmate::assert_character( | |
compartments, min.len = 1, | |
any.missing = FALSE, unique = TRUE, | |
null.ok = FALSE | |
) |
Allow more than one compartment to be added, allow only unique compartments, and prevent missing values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
R/helpers.R
Outdated
age_groups <- unique(data[["demography_group"]]) | ||
n_groups <- length(age_groups) | ||
columns <- c("demography_group", "time", "value") | ||
# compartment of interest | ||
long_data <- data[data[["compartment"]] == compartment, columns] | ||
nrows <- length(long_data[, "value"]) | ||
# long to wide conversion | ||
wide <- matrix(long_data[, "value"], nrows / n_groups, n_groups, byrow = TRUE) | ||
colnames(wide) <- age_groups | ||
|
||
col_max <- function(data) sapply(as.data.frame(data), max, na.rm = TRUE) | ||
peak <- col_max(as.data.frame(wide)) | ||
timing <- which(wide %in% peak) | ||
|
||
cbind(timing, peak) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
age_groups <- unique(data[["demography_group"]]) | |
n_groups <- length(age_groups) | |
columns <- c("demography_group", "time", "value") | |
# compartment of interest | |
long_data <- data[data[["compartment"]] == compartment, columns] | |
nrows <- length(long_data[, "value"]) | |
# long to wide conversion | |
wide <- matrix(long_data[, "value"], nrows / n_groups, n_groups, byrow = TRUE) | |
colnames(wide) <- age_groups | |
col_max <- function(data) sapply(as.data.frame(data), max, na.rm = TRUE) | |
peak <- col_max(as.data.frame(wide)) | |
timing <- which(wide %in% peak) | |
cbind(timing, peak) | |
data.table::setDT(data) | |
data = data[compartment %in% compartments, | |
list(time = data.table::first(time[value == max(value)]), | |
value = data.table::first(max(value))), | |
by = c("demography_group", "compartment") | |
] | |
data.table::setDF(data)[] |
This code using {data.table} should give you the highest peak time and value for each demographic group. Suggest using first()
or similar to prevent multiple maximum values from being found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Although I get a note about undefined global functions or variables compartment
, time
and value
.
Not sure how to adjust the {data.table}
syntax to pass {lintr}
rules
R/helpers.R
Outdated
#' | ||
#' # get the timing and peak of the exposed compartment | ||
#' epidemic_peak(data, compartment = "exposed") | ||
epidemic_peak <- function(data, compartment = "infectious") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
epidemic_peak <- function(data, compartment = "infectious") { | |
epidemic_peak <- function(data, compartments = "infectious") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
- Use data.table syntax from suggestion - Add tests for function - Revert snapshots from previous commit Fixes #125: Calculate expected vaccinations and check model correctness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bahadzie, thanks for adding these changes. Could you:
- Update the PR description to say what the changes are and ideally tag the issues being fixed;
- Rebase this branch on
main
; - Update your version of {testthat} (v3.2.1 appears latest) as this seems to be causing changes to the test snapshots; you would also need to revert changes to the test snapshots, and then check if any snapshots really need to be updated. In general, I would be very conservative in accepting changes to a test snapshot, and I would always mention why in the PR or in the commit message;
- Update the code in
epidemic_peak()
to access columns using one of the two methods mentioned in this SO post by Matt Dowle. In today's meeting I said I preferred setting variables toNULL
at the top of the function (Dowle's option 2), but I see that innew_infections()
I have gone for option 1 instead which usesget()
.
Once 2 -- 4 are done, we shall get a better idea of whether checks pass - I expect they will.
|
Done |
Thanks @bahadzie - I think you might have rebased on an older commit; could you rebase on the current head of |
Co-authored-by: Adam Kucharski <[email protected]>
Use seq_len() for sequence
Hi @pratikunterwegs I've rebased on main. It's currently running the Github Actions. I'll keep you posted. |
Thanks, will take a look tomorrow, checks are passing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these inputs @bahadzie. Just looking at @BlackEdder's comments on #125, I think the expectation is for a function like expected_vaccinations(population, vaccination)
, which calculates the expected number of vaccinated individuals in each age group, before/without running an epidemic model. This would only work for a single dose model, so not Vacamole - although perhaps I'm wrong here, and this could be done as
The calculation of an effective vaccination rate should be an internal step, as it determines the transition rates within the model, so it's not expected to an exported function as far as I understand.
@BlackEdder, could you come in and say whether it's necessary to calculate an effective vaccination rate internally? Currently, users pass cohort-specific, per-interval (1 day) vaccination rates (
Co-authored-by: Banky Ahadzie <[email protected]>
Co-authored-by: Banky Ahadzie <[email protected]>
Key commits from this branch copied into #240 and merged into |