The project uses data about flight arrivals and delays from the US Department of Transportation.
The raw data was not well suited to analysis, so we had to clean and reformat parts of the dataset. This entailed:
- Removing duplicate rows
- Removing null columns
- Removing missing values
- Converting data types
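
The cleaning steps listed above could look roughly like this in pandas. This is a minimal sketch, not the project's actual code: the function name `clean_flights`, the column names (`year`, `month`, `carrier`, `arr_delay`, `unused`), and the sample rows are all made up for illustration. It also assumes that "null columns" means columns that are entirely empty, and that "removing missing values" means dropping rows that still contain a missing value.

```python
import pandas as pd


def clean_flights(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the four cleaning steps to a raw delay table (hypothetical schema)."""
    # 1. Remove duplicate rows.
    df = df.drop_duplicates()
    # 2. Remove columns in which every value is null.
    df = df.dropna(axis="columns", how="all")
    # 3. Remove rows that still contain at least one missing value.
    df = df.dropna(axis="index", how="any")
    # 4. Convert text columns to numeric types (column names are assumptions).
    df = df.astype({"year": "int64", "month": "int64", "arr_delay": "float64"})
    return df.reset_index(drop=True)


# Made-up sample: one duplicated row, one empty column, one missing month.
raw = pd.DataFrame({
    "year": ["2019", "2019", "2020", "2020"],
    "month": ["1", "1", "2", None],
    "carrier": ["AA", "AA", "DL", "UA"],
    "arr_delay": ["12.5", "12.5", "30.0", "4.0"],
    "unused": [None, None, None, None],
})

clean = clean_flights(raw)
print(clean)
print(clean.dtypes)
```

The order matters in this sketch: dropping the all-null columns before dropping incomplete rows keeps an empty column from wiping out every row.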
A data analysis project should start with an objective and a set of questions. These were the questions we wanted to answer:
- What were the major reasons for flight delays in the US?
- Which airports had the most delays?
- How many flights were delayed each year for each airline in our dataset?
- Which months have the most weather delays?
- Which airports receive the most arrivals?
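
Several of these questions reduce to a group-and-aggregate over the cleaned table. The sketch below shows one way to phrase three of them in pandas. The column names (`year`, `month`, `carrier`, `airport`, `arr_flights`, `arr_del15`, `weather_delay`) and all values are assumptions made for illustration, not the project's real schema or results.

```python
import pandas as pd

# Made-up sample; each row is one carrier/airport/month summary.
flights = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020],
    "month": [1, 7, 1, 7],
    "carrier": ["AA", "DL", "AA", "DL"],
    "airport": ["ATL", "ORD", "ATL", "ORD"],
    "arr_flights": [100, 80, 90, 70],      # arriving flights (assumed column)
    "arr_del15": [20, 10, 15, 5],          # delayed arrivals (assumed column)
    "weather_delay": [300, 50, 200, 40],   # weather delay minutes (assumed column)
})

# Delayed flights per airline, per year.
delays_by_airline_year = flights.groupby(["year", "carrier"])["arr_del15"].sum()

# Months ranked by total weather delay, largest first.
weather_by_month = (
    flights.groupby("month")["weather_delay"].sum().sort_values(ascending=False)
)

# Airports ranked by total arrivals, largest first.
arrivals_by_airport = (
    flights.groupby("airport")["arr_flights"].sum().sort_values(ascending=False)
)

print(delays_by_airline_year)
print(weather_by_month)
print(arrivals_by_airport)
```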
Here are some conclusions from our analysis: