DELVE COVID-19 Dataset

⚠️ This dataset is no longer being maintained. Please see our data sources instead.

This repository provides a dataset for COVID-19 research, consolidated from multiple sources. The dataset is available as a CSV file, which can be loaded into most analysis environments. We also provide Python code for accessing the underlying datasets, which in some cases offer more detail or finer resolution.

Reading the dataset

Download the CSV from the dataset directory and load it in your favourite analysis tool.

In Python you can load the CSV directly using Pandas:

import pandas as pd
data_df = pd.read_csv('https://raw.githubusercontent.com/rs-delve/covid19_datasets/master/dataset/combined_dataset_latest.csv', parse_dates=['DATE'])

Or in R:

X = read.csv(url("https://raw.githubusercontent.com/rs-delve/covid19_datasets/master/dataset/combined_dataset_latest.csv")) 
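
Once the data is loaded in Python, a common first step is to restrict it to a date range. The following is a minimal sketch rather than part of the repository's own code: the helper names `load_dataset` and `select_period` are hypothetical, and so are the `ISO` and `cases_new` columns in the in-memory sample, which stands in for the real file so that the snippet runs offline. Only the `DATE` column and the CSV URL are taken from the loading example above; consult the Codebook for the actual field names.

```python
import io

import pandas as pd

# URL taken from the loading example above.
DATASET_URL = (
    "https://raw.githubusercontent.com/rs-delve/covid19_datasets/"
    "master/dataset/combined_dataset_latest.csv"
)


def load_dataset(source=DATASET_URL):
    """Load the combined CSV from a URL, local path, or file-like object.

    Hypothetical helper: it only wraps pd.read_csv with DATE parsed as a date.
    """
    return pd.read_csv(source, parse_dates=["DATE"])


def select_period(df, start, end):
    """Return rows whose DATE lies in [start, end] (inclusive), sorted by DATE."""
    mask = (df["DATE"] >= pd.Timestamp(start)) & (df["DATE"] <= pd.Timestamp(end))
    return df.loc[mask].sort_values("DATE").reset_index(drop=True)


# Tiny in-memory stand-in for the real CSV. The ISO and cases_new columns
# are illustrative assumptions, not confirmed field names.
sample_csv = io.StringIO(
    "DATE,ISO,cases_new\n"
    "2020-03-03,GBR,12\n"
    "2020-03-01,GBR,5\n"
    "2020-03-02,GBR,7\n"
    "2020-03-04,GBR,20\n"
)

sample_df = load_dataset(sample_csv)
subset = select_period(sample_df, "2020-03-02", "2020-03-03")
print(subset)
```

To run the same filter on the published data, call `load_dataset()` with no argument, which requires network access.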

Examples

We provide two Jupyter notebooks with examples.

Codebook

View the Codebook for details of the fields available in the dataset.

Licence

This software is published under the MIT licence. The data generated are available under the Creative Commons Attribution 4.0 International License.

Citation

We recommend citing the combined dataset as follows. Be sure to include an access date, since the data may be updated retroactively.

@misc{DelveCovidDataset,
    title = {DELVE Global COVID-19 Dataset},
    howpublished = {\url{https://github.com/rs-delve/covid19_datasets/blob/master/dataset/combined_dataset_latest.csv}},
    note = {Accessed: <DATE ACCESSED>}
}
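
If you record the access date programmatically, the entry above can be generated with the placeholder filled in. This is a small sketch, not something the repository provides: the function name `delve_bibtex` is hypothetical, and the ISO 8601 date format (`YYYY-MM-DD`) is an assumption, since no particular format is prescribed.

```python
from datetime import date

# URL copied from the citation entry above.
DATASET_PAGE = (
    "https://github.com/rs-delve/covid19_datasets/blob/master/"
    "dataset/combined_dataset_latest.csv"
)


def delve_bibtex(accessed: date) -> str:
    """Return the recommended BibTeX entry with the access date filled in.

    Hypothetical helper; the date is written in ISO 8601 form by assumption.
    """
    return (
        "@misc{DelveCovidDataset,\n"
        "    title = {DELVE Global COVID-19 Dataset},\n"
        "    howpublished = {\\url{" + DATASET_PAGE + "}},\n"
        "    note = {Accessed: " + accessed.isoformat() + "}\n"
        "}"
    )


# Example with a fixed date; use date.today() when you download the data.
entry = delve_bibtex(date(2020, 6, 1))
print(entry)
```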

We also recommend citing the original sources of any fields you use; these sources are listed in the Codebook.

Data sources

A full description of data sources, links to their documentation and update frequencies is available here.