In this chapter, we discuss the process of data manipulation, learn how to explore an API to gather data, and perform data cleaning and reshaping with pandas.
We will work through five notebooks, numbered in the order in which they are used:
1-wide_vs_long.ipynb: discusses wide versus long format data
2-using_the_weather_api.ipynb: walks through collecting daily temperature data from the NCEI API
3-cleaning_data.ipynb: shows how to perform some initial data cleaning
4-reshaping_data.ipynb: illustrates how to reshape data with pandas
5-handling_data_issues.ipynb: showcases strategies for dealing with duplicate, missing, or invalid data
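As a small preview of the wide versus long distinction and the kind of reshaping these notebooks cover, here is a minimal sketch. The temperature values and the column names (date, TMAX, TMIN, datatype, value) are invented for illustration and are not taken from the chapter's datasets; the notebooks themselves may approach this differently.

```python
import pandas as pd

# Hypothetical temperature readings, invented purely for illustration
wide = pd.DataFrame({
    "date": ["2018-10-01", "2018-10-02"],
    "TMAX": [21.1, 23.9],
    "TMIN": [8.9, 13.9],
})

# Wide -> long: one row per (date, measurement) pair
long = wide.melt(id_vars="date", var_name="datatype", value_name="value")

# Long -> wide: turn the measurement names back into columns
wide_again = long.pivot(
    index="date", columns="datatype", values="value"
).reset_index()

print(long)
print(wide_again)
```

In the wide form, each measurement gets its own column; in the long form, a single column names the measurement and another holds its value, which is often the more convenient shape for filtering and plotting.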
All the datasets necessary for the aforementioned notebooks, along with information on them, can be found in the data/ directory. The end-of-chapter exercises will use the datasets in the exercises/ directory; solutions to these exercises can be found in the repository's solutions/ch_03/ directory.