-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional inference howto posts #57
Conversation
Thanks for the contribution! I'll rebase this PR and force-push some commits to see how the checks react |
b9abb8f
to
047ba67
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution!
The reproduction_number_cluster_size.qmd
and reproduction_number_serological_data.qmd
work fine! However, the estimate_infections.qmd
is not running, which limited further review. This seems to be due to reading the URL for the UK COVID dashboard API, which may be blocking the building of the whole website.
Could we verify if this is still working for you? If this needs preliminary configurations to run, should we try an alternative dataset or reading approach?
# Define URL for the UK COVID dashboard API | ||
uk_dashboard_url <- "https://coronavirus.data.gov.uk/api/v1/data?filters=areaType=overview;areaName=United%2520Kingdom&structure=%7B%22areaType%22:%22areaType%22,%22areaName%22:%22areaName%22,%22areaCode%22:%22areaCode%22,%22date%22:%22date%22,%22newDailyNsoDeathsByDeathDate%22:%22newDailyNsoDeathsByDeathDate%22,%22cumDailyNsoDeathsByDeathDate%22:%22cumDailyNsoDeathsByDeathDate%22%7D&format=csv" | ||
|
||
# Load data from the UK COVID dashboard API and format as data.frame | ||
load_uk_data_text <- GET(uk_dashboard_url) | ||
uk_data <- read.csv(text = content(load_uk_data_text, "text"), stringsAsFactors = FALSE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I currently get this output from uk_data
object. Could you verify if you still get a data frame output?
> uk_data
[1] X..DOCTYPE.html..html.lang.en.class.govuk.template.__..(...)
[2] queries...state..data..success.true
[3] data...status.Green
[4] geography_name.North.East
[5] geography_code.E12000001
[6] refresh_date.null.
[7] X.status.Green
[8] geography_name.North.West
[9] geography_code.E12000002
[10] refresh_date.null..1
[11] X.status.Green.1
[12] geography_name.Yorkshire.and.The.Humber
[13] geography_code.E12000003
[14] refresh_date.null..2
[15] X.status.Green.2
[16] geography_name.East.Midlands
[17] geography_code.E12000004
[18] refresh_date.null..3
[19] X.status.Green.3
[20] geography_name.West.Midlands
[21] geography_code.E12000005
[22] refresh_date.null..4
[23] X.status.Green.4
[24] geography_name.East.of.England
[25] geography_code.E12000006
[26] refresh_date.null..5
[27] X.status.Green.5
[28] geography_name.London
[29] geography_code.E12000007
[30] refresh_date.null..6
[31] X.status.Green.6
[32] geography_name.South.East
[33] geography_code.E12000008
[34] refresh_date.null..7
[35] X.status.Green.7
[36] geography_name.South.West
[37] geography_code.E12000009
[38] refresh_date.null...
[39] dataUpdateCount.1
[40] dataUpdatedAt.1717579083479
[41] error.null
[42] errorUpdateCount.0
[43] errorUpdatedAt.0
[44] fetchFailureCount.0
[45] fetchFailureReason.null
[46] fetchMeta.null
[47] isInvalidated.false
[48] status.success
[49] fetchStatus.idle.
[50] queryKey..weather.alert.list
[51] heat.
[52] queryHash...weather.alert.list...heat..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to prioritize reusing existing data, similar data may be available in {incidence2} and {cfr}
incidence2::covidregionaldataUK
contains one-year data for the UK, and cfr::covid_data
contains three-year data for 19 countries, including the UK.
library(tidyverse)
incidence2::covidregionaldataUK %>%
# preprocess missing values
tidyr::replace_na(list(deaths_new = 0)) %>%
# compute the daily incidence
incidence2::incidence(
date_index = "date",
counts = "deaths_new",
count_values_to = "confirm",
date_names_to = "date"
) %>%
# complete range of dates
incidence2::complete_dates()
#> # incidence: 490 x 3
#> # count vars: deaths_new
#> date count_variable confirm
#> <date> <fct> <dbl>
#> 1 2020-01-30 deaths_new 0
#> 2 2020-01-31 deaths_new 0
#> 3 2020-02-01 deaths_new 0
#> 4 2020-02-02 deaths_new 0
#> 5 2020-02-03 deaths_new 0
#> 6 2020-02-04 deaths_new 0
#> 7 2020-02-05 deaths_new 0
#> 8 2020-02-06 deaths_new 0
#> 9 2020-02-07 deaths_new 0
#> 10 2020-02-08 deaths_new 0
#> # ℹ 480 more rows
cfr::covid_data %>%
# compute the daily incidence
incidence2::incidence(
date_index = "date",
counts = "deaths",
groups = "country"
) %>%
# complete range of dates
incidence2::complete_dates()
#> # incidence: 20,786 x 4
#> # count vars: deaths
#> # groups: country
#> date_index country count_variable count
#> <date> <chr> <fct> <dbl>
#> 1 2020-01-03 Argentina deaths 0
#> 2 2020-01-03 Brazil deaths 0
#> 3 2020-01-03 Colombia deaths 0
#> 4 2020-01-03 France deaths 0
#> 5 2020-01-03 Germany deaths 0
#> 6 2020-01-03 India deaths 0
#> 7 2020-01-03 Indonesia deaths 0
#> 8 2020-01-03 Iran deaths 0
#> 9 2020-01-03 Italy deaths 0
#> 10 2020-01-03 Mexico deaths 0
#> # ℹ 20,776 more rows
Created on 2024-06-06 with reprex v2.1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@avallecam - Cool. FWIW - In version 2.3.0 I've also added an argument to complete_dates
within the function call itself. If useful you can also use named vectors of the form c(new_name = "old_name")
to rename on the fly. E.g. for the first example you could do
incidence2::covidregionaldataUK %>%
# preprocess missing values
tidyr::replace_na(list(deaths_new = 0)) %>%
# compute the daily incidence renaming deaths_new and completing dates
incidence2::incidence(
date_index = "date",
counts = c(deaths = "deaths_new"),
count_values_to = "confirm",
date_names_to = "date",
complete_dates = TRUE
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elegant! thanks for sharing the new features
|
||
## Steps in detail | ||
- *Example 1* aims to reconstruct SARS-CoV-2 infection dynamics in the UK from daily data on deaths | ||
- First, we load daily data on reported COVID deaths from the [UK COVID dashboard](https://coronavirus.data.gov.uk/details/deaths?areaType=overview&areaName=United%20Kingdom) by copying the 'download data as csv' link, then using the `httr` package to import into R and format as a `data.frame`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I try to access this links, I get immediately redirected to https://ukhsa-dashboard.data.gov.uk/
Should we try an alternative way? or provide more details on how to access it using the API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this first review, I propose edits to estimate_infections.qmd
to make this able to run. Some additional edits not added in this review are:
- Edit also the index.qmd to make this one listed on the main page.
- Use {lintr} to adhere to the tidyverse style guides, even though we do not have the workflow in this repo. This will add to the content readability.
- Add the chunk corresponding to get one copy and paste chunk in each entry #51 for single chunks ready to copy and paste by users
If you prefer, for best use of our time, I can take charge of this PR, make the edits, and push them directly as commits. (I needed to make them locally to proof that they work)
One suggestion change I want to emphasize is the plotting. One alternative is adding layers from the EpiNow2 outputs. If not, we can use ggplot2 for a custom plot from the output. Any strong opinions about this?
Thanks for those comments! I've committed the edits. I haven't yet added the following, as wasn't sure of most efficient way:
Feel free to add commits on these, then can merge! |
Nice! about your questions:
We can do this by installing the {lintr} package, and if using Rstudio, we can run an Rstudio addin called "Lint current file" displaying the Rstudio Markers pane with edit suggestions.
Yes, sorry the link was not specific. I meant this 4-lines chunk after the yaml
Noted. We'll do this after next week |
Inferring infection dynamics + inferring R and k from clusters
Following Kucharski et al, PLOS Path, 2014 approach
Also remove WIP social contact file from this branch
Inferring infection dynamics + inferring R and k from clusters
Following Kucharski et al, PLOS Path, 2014 approach
Also remove WIP social contact file from this branch
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Co-authored-by: Andree Valle Campos <[email protected]>
Add reconstruct transmission
0609e43
to
d9f86f5
Compare
this is ready to merge. pending change is to update renv.lock to latest {epiparameter} version. but keeping this to do after merging this PR. now in issue #75 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requested changes added. format now match required format output. The entry reproduction_number_serological_data.qmd
may require some object renaming in data_summary
and summary_data
to clearly differentiate between the observed and estimated values. But we can do this in a separate PR too.
NEWS.md
– not available in repo?This is a PR with 3 new howto guides:
reproduction_number_cluster_size.qmd
estimate_infections.qmd
(can also create PR to EpiNow2 to add this as a case study link on their page once merged here)reproduction_number_serological_data.qmd