Hello there
This is a DataCamp practice project that I completed while studying the Data Scientist with R track :)
In December 2019, COVID-19 coronavirus was first identified in the Wuhan region of China. By March 11, 2020, the World Health Organization (WHO) categorized the COVID-19 outbreak as a pandemic. A lot has happened in the months in between with major outbreaks in Iran, South Korea, and Italy. We know that COVID-19 spreads through respiratory droplets, such as through coughing, sneezing, or speaking. But, how quickly did the virus spread across the globe? And, can we see any effect from country-wide policies, like shutdowns and quarantines? Fortunately, organizations around the world have been collecting data so that governments can monitor and learn from this pandemic.
1. Visualize COVID-19 data from the first several weeks of the outbreak to see at what point the virus became a global pandemic.
2. Analyze COVID-19 cases throughout the world and compare the spread of the virus in the rest of the world with its spread in China.
3. Analyze events that happened during the COVID-19 outbreak and their impact on the number of COVID-19 cases.
- Setting up our working environment: GitHub repository & Jupyter Notebook
- Data Understanding:
  - Understanding the different datasets we have.
  - Analyzing the data and creating visualizations that compare confirmed COVID-19 cases in China with the rest of the world, to identify how the virus spread throughout the world and which events affected the outbreak.
- Data Exploration:
  - Understanding the information each dataset contains and how it can be used to create insightful visualizations.
  - Analysis and visualization in Jupyter Notebook.
In February, the majority of cases were in China. That changed in March when it really became a global outbreak: around March 14, the total number of cases outside China overtook the cases inside China. This was days after the WHO declared a pandemic.
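The crossover described above can be located programmatically instead of being read off the plot. The following is a minimal sketch in plain Python, not the project's own R code. The function name `first_overtake` and the rows in `toy_rows` are hypothetical, and the numbers are invented purely for illustration; they are not real case counts.

```python
def first_overtake(rows):
    """Return the label of the first row where cases outside China
    exceed cases inside China, or None if that never happens.

    Each row is a tuple: (day_label, china_cases, rest_of_world_cases).
    Rows are assumed to be sorted by date.
    """
    for day_label, china_cases, rest_cases in rows:
        if rest_cases > china_cases:
            return day_label
    return None


# Toy cumulative counts, made up for illustration only (NOT real data).
toy_rows = [
    ("d1", 100, 10),
    ("d2", 110, 40),
    ("d3", 115, 90),
    ("d4", 118, 150),
    ("d5", 120, 260),
]

print(first_overtake(toy_rows))
```

With the real dataset, the same check would be applied to the daily cumulative totals for China and for the rest of the world.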
There were a couple of other landmark events that happened during the outbreak. For example, the huge jump in the China line on February 13, 2020 wasn't just a bad day regarding the outbreak; China changed the way it reported figures on that day (CT scans were accepted as evidence for COVID-19, rather than only lab tests).
By annotating events like this, we can better interpret changes in the plot.
From the plot below, the number of cases in China grows more slowly than a linear trend. That's great news because it indicates China had at least somewhat contained the virus by late February and early March.
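One way to check a "slower than linear" claim numerically is to fit a straight line by least squares and look at the residuals: if the counts flatten out, the first and last points tend to fall below the line while the middle points sit above it. The sketch below uses plain Python and is only an illustration of that idea, not the project's R code. The helper names `fit_line` and `residuals` are hypothetical, and `toy_cases` contains invented numbers, not real data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys):
    """Observed value minus fitted value for each point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Toy cumulative counts that level off, made up for illustration (NOT real data).
days = [0, 1, 2, 3, 4, 5]
toy_cases = [0, 50, 90, 120, 140, 150]

res = residuals(days, toy_cases)
# Negative residuals at both ends and positive ones in the middle
# suggest the series bends below a straight line, i.e. growth is slowing.
print([round(r, 2) for r in res])
```

This is a rough diagnostic under the stated assumptions; a plotted trend line over the real data remains the easier thing to interpret.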