📊 covid: deaths by vax status (#3297)
* 📊 covid: deaths by vax status

* snapshot wip

* meadow

* garden

* change entity->country colname

* add title_public

* grapher wip

* infections: add title_public

* add country name to indicator title

* remove import
lucasrodes authored Sep 16, 2024
1 parent d8c5a1d commit e601a70
Showing 11 changed files with 459 additions and 0 deletions.
11 changes: 11 additions & 0 deletions dag/covid.yml
@@ -268,3 +268,14 @@ steps:
    - data://meadow/covid/latest/infections_model
  data://grapher/covid/latest/infections_model:
    - data://garden/covid/latest/infections_model

  # Deaths by vaccination status
  data://meadow/covid/latest/deaths_vax_status:
    - snapshot://covid/latest/deaths_vax_status_england.csv
    - snapshot://covid/latest/deaths_vax_status_us.csv
    - snapshot://covid/latest/deaths_vax_status_chile.csv
    - snapshot://covid/latest/deaths_vax_status_switzerland.csv
  data://garden/covid/latest/deaths_vax_status:
    - data://meadow/covid/latest/deaths_vax_status
  data://grapher/covid/latest/deaths_vax_status:
    - data://garden/covid/latest/deaths_vax_status
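
The dependency chain declared above (snapshots, then meadow, then garden, then grapher) can be sketched as a topological sort. This is only an illustration built on Python's standard `graphlib`; it is not the ETL's own scheduler, and the `dag` dictionary is a hand-copied mirror of the entries in this diff.

```python
from graphlib import TopologicalSorter

# Each key is a step; its value is the set of steps it depends on.
# The entries mirror the deaths_vax_status block added to dag/covid.yml.
dag = {
    "data://meadow/covid/latest/deaths_vax_status": {
        "snapshot://covid/latest/deaths_vax_status_england.csv",
        "snapshot://covid/latest/deaths_vax_status_us.csv",
        "snapshot://covid/latest/deaths_vax_status_chile.csv",
        "snapshot://covid/latest/deaths_vax_status_switzerland.csv",
    },
    "data://garden/covid/latest/deaths_vax_status": {
        "data://meadow/covid/latest/deaths_vax_status",
    },
    "data://grapher/covid/latest/deaths_vax_status": {
        "data://garden/covid/latest/deaths_vax_status",
    },
}

# static_order() yields every step only after all of its dependencies.
order = list(TopologicalSorter(dag).static_order())
data_steps = [step for step in order if step.startswith("data://")]
print(data_steps)
```

The four snapshots have no dependencies of their own, so they always come before the meadow step; the three `data://` steps then follow in meadow, garden, grapher order.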
120 changes: 120 additions & 0 deletions etl/steps/data/garden/covid/latest/deaths_vax_status.meta.yml
@@ -0,0 +1,120 @@
# NOTE: To learn more about the fields, hover over their names.
definitions:
  common:
    description_short: |-
      Death rates are calculated as the number of deaths in each group, divided by the total number of people in this group. This is given per 100,000 people.
    unit: deaths per 100,000 people
    presentation:
      topic_tags:
        - COVID-19


# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/
dataset:
  update_period_days: 0
  title: COVID-19, deaths by vaccination status


tables:
  us:
    common:
      description_key:
        - The mortality rate for the 'All ages' group is age-standardized to account for the different vaccination rates of older and younger people.
    variables:
      us_unvaccinated:
        title: Death rate (weekly) of unvaccinated people - United States, by age
        presentation:
          title_public: Death rate (weekly) of unvaccinated people - United States, by age
        display:
          name: Unvaccinated
      us_vaccinated_no_biv_booster:
        title: Death rate (weekly) of fully vaccinated people (without bivalent booster) - United States, by age
        presentation:
          title_public: Death rate (weekly) of fully vaccinated people (without bivalent booster) - United States, by age
        display:
          name: Vaccinated without bivalent booster
      us_vaccinated_with_biv_booster:
        title: Death rate (weekly) of fully vaccinated people (with bivalent booster) - United States, by age
        presentation:
          title_public: Death rate (weekly) of fully vaccinated people (with bivalent booster) - United States, by age
        display:
          name: Vaccinated with bivalent booster

  chile:
    common:
      description_key:
        - The mortality rate for the 'All ages' group is age-standardized to account for the different vaccination rates of older and younger people.
    variables:
      chile_0_1_dose:
        title: Death rate (weekly) of people with 0 or 1 dose - Chile, by age
        presentation:
          title_public: Death rate (weekly) of people with 0 or 1 dose - Chile, by age
        display:
          name: 0 or 1 dose
      chile_2_doses:
        title: Death rate (weekly) of people with 2 doses - Chile, by age
        presentation:
          title_public: Death rate (weekly) of people with 2 doses - Chile, by age
        display:
          name: 2 doses
      chile_3_doses:
        title: Death rate (weekly) of people with 3 doses - Chile, by age
        presentation:
          title_public: Death rate (weekly) of people with 3 doses - Chile, by age
        display:
          name: 3 doses
      chile_4_doses:
        title: Death rate (weekly) of people with 4 doses - Chile, by age
        presentation:
          title_public: Death rate (weekly) of people with 4 doses - Chile, by age
        display:
          name: 4 doses

  england:
    common:
      description_key:
        - Unvaccinated people have not received any dose.
        - Partially-vaccinated people are excluded.
        - Fully-vaccinated people have received all doses prescribed by the initial vaccination protocol.
        - The mortality rate is age-standardized to account for the different vaccination rates of older and younger people.
    variables:
      england_unvaccinated:
        title: Death rate (monthly) of unvaccinated people - England, by age
        presentation:
          title_public: Death rate (monthly) of unvaccinated people - England, by age
        display:
          name: Unvaccinated
      england_fully_vaccinated:
        title: Death rate (monthly) of fully vaccinated people - England, by age
        presentation:
          title_public: Death rate (monthly) of fully vaccinated people - England, by age
        display:
          name: Fully vaccinated

  switzerland:
    common:
      description_key:
        - Data coverage includes both Switzerland and Liechtenstein.
        - Unvaccinated people have not received any dose.
        - Partially-vaccinated people are excluded.
        - Fully-vaccinated people have received all doses prescribed by the initial vaccination protocol.
        - The mortality rate for the 'All ages' group is age-standardized to account for the different vaccination rates of older and younger people.
    variables:
      swi_unvaccinated:
        title: Death rate (weekly) of unvaccinated people - Switzerland, by age
        presentation:
          title_public: Death rate (weekly) of unvaccinated people - Switzerland, by age
        display:
          name: Unvaccinated
      swi_vaccinated_no_booster:
        title: Death rate (weekly) of fully vaccinated people (without booster) - Switzerland, by age
        presentation:
          title_public: Death rate (weekly) of fully vaccinated people (without booster) - Switzerland, by age
        display:
          name: Fully vaccinated, no booster
      swi_vaccinated_with_booster:
        title: Death rate (weekly) of fully vaccinated people (with booster) - Switzerland, by age
        presentation:
          title_public: Death rate (weekly) of fully vaccinated people (with booster) - Switzerland, by age
        display:
          name: Fully vaccinated, with booster
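
The metadata above describes two calculations: a death rate per 100,000 people within each group, and an age-standardized rate for the combined group. The sketch below illustrates both. It assumes direct standardization (the diff only says "age-standardized" without naming the method), and every number, age band, and function name in it is hypothetical.

```python
def rate_per_100k(deaths: float, population: float) -> float:
    """Crude rate: deaths in a group divided by people in that group, per 100,000."""
    return deaths / population * 100_000


def age_standardized_rate(rates_by_age: dict, standard_pop: dict) -> float:
    """Direct standardization: weight each age-specific rate by the
    share of that age band in a chosen standard population."""
    total = sum(standard_pop.values())
    return sum(rates_by_age[age] * standard_pop[age] / total for age in rates_by_age)


# Hypothetical counts for one vaccination-status group, for illustration only.
groups = {
    "18-49": {"deaths": 2, "population": 400_000},
    "50+": {"deaths": 30, "population": 100_000},
}
rates = {age: rate_per_100k(g["deaths"], g["population"]) for age, g in groups.items()}

# Hypothetical standard population used as weights.
standard = {"18-49": 600_000, "50+": 400_000}
standardized = age_standardized_rate(rates, standard)
print(rates, standardized)
```

Weighting every vaccination-status group by the same standard population is what makes their combined rates comparable even when the groups have very different age structures.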

28 changes: 28 additions & 0 deletions etl/steps/data/garden/covid/latest/deaths_vax_status.py
@@ -0,0 +1,28 @@
"""Load a meadow dataset and create a garden dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load meadow dataset.
    ds_meadow = paths.load_dataset("deaths_vax_status")

    # Read table from meadow dataset.
    tables = list(ds_meadow)

    #
    # Save outputs.
    #
    # Create a new garden dataset with the same metadata as the meadow dataset.
    ds_garden = create_dataset(
        dest_dir, tables=tables, check_variables_metadata=True, default_metadata=ds_meadow.metadata
    )

    # Save changes in the new garden dataset.
    ds_garden.save()
8 changes: 8 additions & 0 deletions etl/steps/data/garden/covid/latest/infections_model.meta.yml
@@ -41,23 +41,31 @@ tables:
    variables:
      icl_infections:
        title: Daily new estimated COVID-19 infections (ICL, <<estimate.title()>> estimate)
        presentation:
          title_public: Daily new estimated COVID-19 infections (ICL, <<estimate.title()>> estimate)
        description: |-
          <% set model_name = "ICL" %>
          {definitions.others.description}
      ihme_infections:
        title: Daily new estimated COVID-19 infections (IHME, <<estimate.title()>> estimate)
        presentation:
          title_public: Daily new estimated COVID-19 infections (IHME, <<estimate.title()>> estimate)
        description: |-
          <% set model_name = "IHME" %>
          {definitions.others.description}
      lshtm_infections:
        title: Daily new estimated COVID-19 infections (LSHTM, <<estimate.title()>> estimate)
        presentation:
          title_public: Daily new estimated COVID-19 infections (LSHTM, <<estimate.title()>> estimate)
        description: |-
          <% set model_name = "LSHTM" %>
          {definitions.others.description}
      yyg_infections:
        title: Daily new estimated COVID-19 infections (Youyang Gu, <<estimate.title()>> estimate)
        presentation:
          title_public: Daily new estimated COVID-19 infections (Youyang Gu, <<estimate.title()>> estimate)
        description: |-
          <% set model_name = "Youyang Gu" %>
          {definitions.others.description}
32 changes: 32 additions & 0 deletions etl/steps/data/grapher/covid/latest/deaths_vax_status.py
@@ -0,0 +1,32 @@
"""Load a garden dataset and create a grapher dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load garden dataset.
    ds_garden = paths.load_dataset("deaths_vax_status")

    # Read table from garden dataset.
    tables = list(ds_garden)

    #
    # Process data.
    #

    #
    # Save outputs.
    #
    # Create a new grapher dataset with the same metadata as the garden dataset.
    ds_grapher = create_dataset(
        dest_dir, tables=tables, check_variables_metadata=True, default_metadata=ds_garden.metadata
    )

    # Save changes in the new grapher dataset.
    ds_grapher.save()
85 changes: 85 additions & 0 deletions etl/steps/data/meadow/covid/latest/deaths_vax_status.py
@@ -0,0 +1,85 @@
"""Load a snapshot and create a meadow dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Retrieve tables from snapshots
    tb_en = paths.read_snap_table("deaths_vax_status_england.csv")
    tb_us = paths.read_snap_table("deaths_vax_status_us.csv")
    tb_swi = paths.read_snap_table("deaths_vax_status_switzerland.csv")
    tb_cl = paths.read_snap_table("deaths_vax_status_chile.csv")

    #
    # Process data.
    #
    # US
    rename_cols = {
        "Entity": "country",
        "Day": "date",
        "unvaccinated": "us_unvaccinated",
        "vaccinated_without": "us_vaccinated_no_biv_booster",
        "vaccinated_with": "us_vaccinated_with_biv_booster",
    }
    tb_us = tb_us.rename(columns=rename_cols)[rename_cols.values()]
    tb_us = tb_us.format(["country", "date"], short_name="us")

    # England
    rename_cols = {
        "Entity": "country",
        "Day": "date",
        "Unvaccinated": "england_unvaccinated",
        "Fully vaccinated": "england_fully_vaccinated",
    }
    tb_en = tb_en.rename(columns=rename_cols)[rename_cols.values()]
    tb_en = tb_en.format(["country", "date"], short_name="england")

    # Switzerland
    rename_cols = {
        "Entity": "country",
        "Day": "date",
        "Unvaccinated": "swi_unvaccinated",
        "Fully vaccinated, no booster": "swi_vaccinated_no_booster",
        "Fully vaccinated + booster": "swi_vaccinated_with_booster",
    }
    tb_swi = tb_swi.rename(columns=rename_cols)[rename_cols.values()]
    tb_swi = tb_swi.format(["country", "date"], short_name="switzerland")

    # Chile
    rename_cols = {
        "Entity": "country",
        "Day": "date",
        "0 or 1 dose": "chile_0_1_dose",
        "2 doses": "chile_2_doses",
        "3 doses": "chile_3_doses",
        "4 doses": "chile_4_doses",
    }
    tb_cl = tb_cl.rename(columns=rename_cols)[rename_cols.values()]
    tb_cl = tb_cl.format(["country", "date"], short_name="chile")

    # Table list
    tables = [
        tb_us,
        tb_en,
        tb_cl,
        tb_swi,
    ]

    #
    # Save outputs.
    #
    # Create a new meadow dataset with the same metadata as the snapshot.
    ds_meadow = create_dataset(
        dest_dir,
        tables=tables,
        check_variables_metadata=True,
    )

    # Save changes in the new meadow dataset.
    ds_meadow.save()
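
Each country block in the meadow step follows the same pattern: rename the source columns, keep only the renamed ones, and key the table by country and date. The sketch below reproduces that pattern without any dependencies, using plain dictionaries instead of the ETL's table type. The helper `rename_and_select`, the `Code` column, and all values are hypothetical; only the `rename_cols` mapping is taken from the England block.

```python
def rename_and_select(rows, rename_cols):
    """Keep only the columns listed in rename_cols, under their new names."""
    return [{new: row[old] for old, new in rename_cols.items()} for row in rows]


# Hypothetical input rows; the "Code" column and the values are invented.
rows = [
    {"Entity": "England", "Code": "", "Day": "2021-01-31", "Unvaccinated": 10.0, "Fully vaccinated": 1.0},
    {"Entity": "England", "Code": "", "Day": "2021-02-28", "Unvaccinated": 8.0, "Fully vaccinated": 0.5},
]

# Same mapping as the England block in the meadow step.
rename_cols = {
    "Entity": "country",
    "Day": "date",
    "Unvaccinated": "england_unvaccinated",
    "Fully vaccinated": "england_fully_vaccinated",
}

table = rename_and_select(rows, rename_cols)

# Guard against duplicated (country, date) keys before treating them as an index.
keys = [(row["country"], row["date"]) for row in table]
assert len(keys) == len(set(keys)), "duplicate (country, date) rows"
print(table[0])
```

Driving both the rename and the column selection from one mapping means any source column not named in it, such as the invented `Code` column here, is dropped.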
41 changes: 41 additions & 0 deletions snapshots/covid/latest/deaths_vax_status.py
@@ -0,0 +1,41 @@
"""Script to create a snapshot of dataset.
This data was downloaded from Grapher. It had been imported to Grapher before the covid-19-data repository was created.
"""

from pathlib import Path

import click

from etl.snapshot import Snapshot

# Version for current snapshot dataset.
SNAPSHOT_VERSION = Path(__file__).parent.name


@click.command()
@click.option("--upload/--skip-upload", default=True, type=bool, help="Upload dataset to Snapshot")
@click.option("--england", default=None, type=str, help="Path to England local data file.")
@click.option("--us", default=None, type=str, help="Path to US local data file.")
@click.option("--switzerland", default=None, type=str, help="Path to Switzerland local data file.")
@click.option("--chile", default=None, type=str, help="Path to Chile local data file.")
def main(england: str, us: str, switzerland: str, chile: str, upload: bool) -> None:
    estimates = [
        ("england", england),
        ("us", us),
        ("switzerland", switzerland),
        ("chile", chile),
    ]
    # Create one snapshot per country for which a local file was provided.
    for estimate in estimates:
        name = estimate[0]
        filename = estimate[1]

        if filename is not None:
            snap = Snapshot(f"covid/{SNAPSHOT_VERSION}/deaths_vax_status_{name}.csv")
            # Copy local data file to snapshots data folder, add file to DVC and upload to S3.
            snap.create_snapshot(filename=filename, upload=upload)


if __name__ == "__main__":
    main()
33 changes: 33 additions & 0 deletions snapshots/covid/latest/deaths_vax_status_chile.csv.dvc
@@ -0,0 +1,33 @@
# Learn more at:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/
meta:
  origin:
    # Data product / Snapshot
    title: COVID-19, Incidencia de casos según estado de vacunación, grupo de edad, y semana epidemiológica (Chile)
    description: |-
      Incidence of deaths according to vaccination status, age group, and epidemiological week.

      Vaccination status is classified as "Fully vaccinated" for those people who have received two doses and more than 14 days have passed since their second dose, or have received a vaccine from a vaccination protocol that includes only a single dose and more than 28 days have elapsed since inoculation. This variable takes the value "Unvaccinated or not fully vaccinated" if people do not have a complete vaccination schedule.

      The mortality rate corresponds to the incidence rate of deaths per 100,000 inhabitants for the age group, corresponding vaccination status and corresponding epidemiological week.

      The mortality rate for the "All ages" group is age-standardized by Our World in Data, using single-year age estimates from the 2022 revision of the United Nations World Population Prospects for Chile. Rates for specific age groups are calculated as crude incidence rates.
    date_published: "2023"

    # Citation
    producer: Departamento de Epidemiología, Ministerio de Salud de Chile.
    citation_full: |-
      Departamento de Epidemiología, Ministerio de Salud de Chile. Accessed via GitHub (https://github.com/MinCiencia/Datos-COVID19). 2023.

    # Files
    url_main: https://web.archive.org/web/20230408120752/https://github.com/MinCiencia/Datos-COVID19/tree/master/output/producto89
    date_accessed: 2024-09-16

    # License
    license:
      name: CC BY 4.0

outs:
  - md5: 42f4ead672fe4284bc5d8a59ecfac666
    size: 51346
    path: deaths_vax_status_chile.csv
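
The `outs` entry records an MD5 digest and a byte size for the snapshot file. A local copy could be checked against such a record roughly as follows. This is a sketch that assumes the recorded `md5` is a plain MD5 over the file's raw bytes; some DVC versions normalize line endings before hashing text files, in which case this check would not match. The helper name `matches_record` is hypothetical, and the demo runs on a throwaway file rather than the real CSV.

```python
import hashlib
import os
import tempfile


def matches_record(path: str, md5: str, size: int) -> bool:
    """Return True if the file's byte size and MD5 digest match the recorded values."""
    if os.path.getsize(path) != size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == md5


# Demo on a temporary file with invented content.
content = b"Entity,Day,2 doses\nChile,2022-01-01,1.5\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)

ok = matches_record(tmp.name, hashlib.md5(content).hexdigest(), len(content))
bad = matches_record(tmp.name, "0" * 32, len(content))
os.remove(tmp.name)
print(ok, bad)
```

For the Chile snapshot, the call would pass the `md5` and `size` values from the record above together with the path to the downloaded CSV.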