This repository presents an in-depth analysis of airline and airport operations for US flights in 2015, based on data sourced from Kaggle. The project covers the full data lifecycle, from extraction through visualization, for more than 5,000,000 commercial airline flight records.
-
Data Extraction 🔄:
- Used a Python script to extract the data automatically, so the dataset can be refreshed for up-to-date analysis. The script is built to handle the large flight operations dataset. 🐍
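
As a rough illustration of how a script can handle a dataset of this size, the sketch below streams CSV records one at a time instead of loading the whole file into memory. It assumes the Kaggle data is already available as a local CSV; the function name `iter_flight_rows` and the column names are hypothetical and are not taken from this project's script.

```python
import csv
import io


def iter_flight_rows(stream):
    """Yield one dict per CSV record without reading the whole file at once."""
    reader = csv.DictReader(stream)
    for row in reader:
        yield row


# In-memory stand-in for the downloaded file (made-up values).
# A real run would use open(path, newline="", encoding="utf-8") instead.
sample = io.StringIO(
    "AIRLINE,ORIGIN_AIRPORT,DEPARTURE_DELAY\n"
    "XA,BOS,12\n"
    "XB,BOS,\n"
)
rows = list(iter_flight_rows(sample))
print(len(rows), rows[0]["ORIGIN_AIRPORT"])
```

Re-running the same function against a newly downloaded file is one simple way to refresh the data.
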
-
Data Transformation and Cleaning ✨:
- Applied rigorous cleaning processes to enhance data quality by addressing missing values and duplicates.
- Transformed the data for analysis by adjusting timestamps and categorizing cancellation reasons.
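
The cleaning and transformation steps above could look something like the following sketch, written with the Python standard library only. The column names (`SCHEDULED_DEPARTURE`, `DEPARTURE_DELAY`, `CANCELLATION_REASON`), the HHMM time encoding, the letter-to-category mapping, and the helper names are all assumptions made for illustration, not a description of the project's actual script.

```python
from datetime import datetime

# Assumed mapping of single-letter cancellation codes to readable categories.
CANCELLATION_REASONS = {
    "A": "Airline/Carrier",
    "B": "Weather",
    "C": "National Air System",
    "D": "Security",
}


def build_timestamp(year, month, day, hhmm):
    """Combine date parts and an HHMM string such as '0005' into a datetime."""
    hhmm = hhmm.strip().zfill(4)
    return datetime(int(year), int(month), int(day), int(hhmm[:2]), int(hhmm[2:]))


def clean_rows(rows):
    """Drop exact duplicates, normalize missing delays, label cancellations."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip exact duplicate records
        seen.add(key)
        out = dict(row)
        # A missing delay becomes None rather than a guessed value.
        delay = (row.get("DEPARTURE_DELAY") or "").strip()
        out["DEPARTURE_DELAY"] = float(delay) if delay else None
        code = (row.get("CANCELLATION_REASON") or "").strip()
        if not code:
            out["CANCELLATION_REASON"] = "Not cancelled"
        else:
            out["CANCELLATION_REASON"] = CANCELLATION_REASONS.get(code, "Unknown")
        out["SCHEDULED_TS"] = build_timestamp(
            row["YEAR"], row["MONTH"], row["DAY"], row["SCHEDULED_DEPARTURE"]
        )
        cleaned.append(out)
    return cleaned


# Made-up sample records: one flight listed twice, plus one cancelled flight.
flight = {"YEAR": "2015", "MONTH": "1", "DAY": "1", "SCHEDULED_DEPARTURE": "5",
          "DEPARTURE_DELAY": "-3", "CANCELLATION_REASON": ""}
cancelled = {"YEAR": "2015", "MONTH": "2", "DAY": "9", "SCHEDULED_DEPARTURE": "1730",
             "DEPARTURE_DELAY": "", "CANCELLATION_REASON": "B"}
cleaned = clean_rows([flight, dict(flight), cancelled])
```

Leaving missing delays as `None` keeps cancelled flights out of later delay averages instead of counting them as zero-minute delays.
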
-
Data Loading 📥:
- Loaded the prepared data into an analytics database to support comprehensive analysis.
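
One common way to load a few million rows efficiently is to insert them in batches, with one transaction per batch. The sketch below shows the idea with SQLite from the Python standard library as a stand-in; the actual analytics database is not named here, and the `flights` table layout, `load_flights` function, and sample values are hypothetical.

```python
import sqlite3


def load_flights(conn, rows, batch_size=10_000):
    """Insert rows in batches and return the number of rows loaded."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS flights (
               flight_date TEXT, airline TEXT, origin TEXT,
               departure_delay REAL, cancelled INTEGER)"""
    )
    sql = "INSERT INTO flights VALUES (?, ?, ?, ?, ?)"
    loaded = 0
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            with conn:  # commits the batch, or rolls it back on error
                conn.executemany(sql, batch)
            loaded += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        with conn:
            conn.executemany(sql, batch)
        loaded += len(batch)
    return loaded


# Made-up rows and an in-memory database, for illustration only.
sample_rows = [
    ("2015-01-01", "XA", "BOS", 12.0, 0),
    ("2015-01-01", "XB", "BOS", None, 1),
    ("2015-01-02", "XA", "BOS", -4.0, 0),
]
conn = sqlite3.connect(":memory:")
print(load_flights(conn, sample_rows, batch_size=2))
```

Because `rows` can be any iterable, the function also works with a generator that streams records from disk.
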
-
Data Modeling 📊:
- Created a star schema to facilitate effective querying and data aggregation necessary for the analysis.
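
For readers unfamiliar with star schemas, here is a minimal sketch: one central fact table of flights surrounded by dimension tables for airline, airport, and date. All table and column names, and the sample rows, are hypothetical; the project's real schema may differ. SQLite is used only so the example runs anywhere.

```python
import sqlite3

SCHEMA = """
CREATE TABLE dim_airline (airline_key INTEGER PRIMARY KEY, iata_code TEXT, name TEXT);
CREATE TABLE dim_airport (airport_key INTEGER PRIMARY KEY, iata_code TEXT, city TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month INTEGER, day_of_week INTEGER);
CREATE TABLE fact_flight (
    flight_id INTEGER PRIMARY KEY,
    airline_key INTEGER REFERENCES dim_airline(airline_key),
    origin_key INTEGER REFERENCES dim_airport(airport_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    departure_delay REAL,
    cancelled INTEGER
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up dimension and fact rows.
conn.executemany("INSERT INTO dim_airline VALUES (?, ?, ?)",
                 [(1, "XA", "Example Air A"), (2, "XB", "Example Air B")])
conn.execute("INSERT INTO dim_airport VALUES (1, 'BOS', 'Boston')")
conn.execute("INSERT INTO dim_date VALUES (20150105, 1, 1)")
conn.executemany("INSERT INTO fact_flight VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 20150105, 10.0, 0),
    (2, 1, 1, 20150105, 20.0, 0),
    (3, 2, 1, 20150105, -5.0, 0),
])

# Aggregations join the fact table to whichever dimension is needed.
per_airline = conn.execute("""
    SELECT a.iata_code, COUNT(*) AS flights, AVG(f.departure_delay) AS avg_delay
    FROM fact_flight AS f
    JOIN dim_airline AS a ON a.airline_key = f.airline_key
    GROUP BY a.iata_code
    ORDER BY a.iata_code
""").fetchall()
print(per_airline)
```

Keeping measures such as delay minutes in the fact table and descriptive attributes in the dimensions means each report query needs only simple joins on the keys.
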
-
Data Analysis 🔍:
- Analyzed 5.82 million rows across 40 fields to determine flight volume variations, departure delay percentages and durations, causes of cancellations, and airline reliability.
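
To make the delay and cancellation metrics concrete, the sketch below computes them on a handful of made-up records. It assumes a flight counts as delayed when its departure delay exceeds a threshold (0 minutes by default; the threshold the project actually used is not stated), and the field names `delay` and `cancel_reason` are hypothetical.

```python
from collections import Counter


def departure_delay_stats(flights, threshold_minutes=0):
    """Return (percent of departed flights delayed, mean delay of those flights)."""
    departed = [f["delay"] for f in flights if f["delay"] is not None]
    if not departed:
        return 0.0, 0.0
    delayed = [d for d in departed if d > threshold_minutes]
    pct = 100.0 * len(delayed) / len(departed)
    avg = sum(delayed) / len(delayed) if delayed else 0.0
    return pct, avg


def cancellation_shares(flights):
    """Return the percentage of cancellations attributed to each reason."""
    reasons = Counter(f["cancel_reason"] for f in flights if f["cancel_reason"])
    total = sum(reasons.values())
    return {reason: 100.0 * n / total for reason, n in reasons.items()}


# Made-up records: four departed flights and four cancellations.
flights = [
    {"delay": 30.0, "cancel_reason": None},
    {"delay": -5.0, "cancel_reason": None},
    {"delay": 10.0, "cancel_reason": None},
    {"delay": 0.0, "cancel_reason": None},
    {"delay": None, "cancel_reason": "Weather"},
    {"delay": None, "cancel_reason": "Weather"},
    {"delay": None, "cancel_reason": "Weather"},
    {"delay": None, "cancel_reason": "Airline/Carrier"},
]
pct_delayed, avg_delay = departure_delay_stats(flights)
shares = cancellation_shares(flights)
```

Passing `threshold_minutes=15` instead would apply a stricter definition of "delayed", which lowers the delayed percentage and raises the average delay among delayed flights.
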
-
Reporting and Visualization 📈:
- Developed intuitive dashboards and visual representations, using bar charts, line graphs, and pie charts to clearly illustrate trends and outliers in flight operations.
- Flight Volume Variations: Analysis of how overall flight volume varies by month and day of the week.
- Departure Delays: Examination of what percentage of flights experienced a departure delay in 2015, including the average delay time in minutes.
- Seasonal Delay Patterns: Insights into how the percentage of delayed flights varies throughout the year, with a specific focus on flights leaving from Boston (BOS).
- Flight Cancellations: Metrics on how many flights were cancelled in 2015, including the percentage due to weather versus airline/carrier issues.
- Airline Reliability: Evaluation of which airlines are the most and least reliable in terms of on-time departures.
- Airline Dashboard: Offers a comparative analysis of airline operations, highlighting cancellations and delays.
- Airport Dashboard: Showcases performance metrics by airport, with a focus on operational challenges.
- Delay Time Dashboard: Provides a detailed view of delay metrics to identify operational inefficiencies across airlines.
- Report 1: Overview containing all the KPIs about the project, including general statistics, key metrics, and high-level insights.
- Report 2: Airline Report showcasing metrics such as the top 5 airlines with the most flights, average delay times by airline, and other relevant airline-specific KPIs.
- Report 3: Airport Report featuring data such as the top 5 airports by flight volume, flight origin and destination patterns, and other key airport-related metrics.
- Report 4: Delay Time Report visualizing average flight delays, delay trends over time, and delays by different factors such as airline and airport.
Reports were created to effectively communicate the insights and findings from the data analysis.
For more information, please contact:
- Email: [[email protected]]
- LinkedIn: [https://www.linkedin.com/in/raghad-el-ghobashy/]