End-of-Semester Big Data Project

Overview

This README gives an overview of the end-of-semester Big Data project by Aziz Zina. The project applies big data techniques to analyze and visualize a real survey dataset about professionals working in computer science and data science.

Project Steps

  1. Data Acquisition: The project started from an Excel file of survey data. The survey asked professionals in the computer science and data science fields a range of questions, and the resulting dataset is the basis for all later analysis and visualization.

  2. Data Cleaning: To make the analysis reliable, the data was cleaned by handling missing values, addressing outliers, and standardizing data formats. All later stages use the cleaned dataset.

  3. Data Analysis and Visualization

    • 3.1 Power BI Visualizations: Power BI was used to build interactive charts, graphs, and dashboards from the cleaned dataset. They show key trends, patterns, and relationships in the data and give a dynamic interface for exploring the survey results.

    • 3.2 Python Visualizations: Python was used alongside Power BI for further analysis. Pandas, Matplotlib, and Seaborn produced additional plots and charts that deepen the analysis and widen the range of visual representations.

  4. Results and Findings: The analysis identified significant findings about the surveyed population. They are presented through a combination of Power BI dashboards and Python-generated visualizations.
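
The cleaning described in step 2 (missing values, outliers, inconsistent formats) could look roughly like the pandas sketch below. The column names (`job_title`, `salary_usd`, `country`) and the sample rows are hypothetical and do not come from the survey file. The 1.5 × IQR rule for outliers is one common choice, not necessarily the one used in the project.

```python
import pandas as pd


def clean_survey(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a survey table (hypothetical columns)."""
    out = df.copy()

    # Standardize text formats: trim whitespace and use consistent casing.
    out["job_title"] = out["job_title"].str.strip().str.title()
    out["country"] = out["country"].str.strip().str.upper()

    # Handle missing values: drop rows with no job title,
    # then fill missing salaries with the median salary.
    out = out.dropna(subset=["job_title"]).copy()
    out["salary_usd"] = out["salary_usd"].fillna(out["salary_usd"].median())

    # Address outliers: keep salaries within 1.5 * IQR of the quartiles.
    q1, q3 = out["salary_usd"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out["salary_usd"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range].reset_index(drop=True)


# Hypothetical sample rows standing in for the real survey data.
raw = pd.DataFrame(
    {
        "job_title": [" data analyst", "Data Scientist ", None, "data engineer", "data analyst"],
        "salary_usd": [50_000, 90_000, 70_000, None, 5_000_000],
        "country": ["tn ", "us", "fr", " us", "tn"],
    }
)
cleaned = clean_survey(raw)
```

On the sample rows, the row without a job title and the row with the extreme salary are dropped, and the missing salary is filled with the median of the remaining values.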

Project Files

The project repository includes the following files:

  • Data Professionals Survey.xlsx: The original Excel file containing the raw survey data.

  • pfs.pbix: Power BI file containing interactive visualizations and Python code for additional visualizations.

  • Data Visualization.odp: OpenDocument presentation file (opens in PowerPoint or LibreOffice Impress) containing a short presentation of the project.

  • README.md: This documentation file.
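
For orientation, a Python visual in Power BI runs a script against a pandas DataFrame that Power BI supplies under the name `dataset`. The sketch below shows the general shape such a script might take. It is not the code stored in pfs.pbix, and the `job_title` column, the helper names, and the sample rows are hypothetical stand-ins.

```python
import pandas as pd


def count_by_category(df: pd.DataFrame, column: str) -> pd.Series:
    """Count respondents per category, largest group first."""
    return df[column].value_counts()


def plot_counts(counts: pd.Series, title: str) -> None:
    """Draw a horizontal bar chart of the counts with Matplotlib."""
    # Imported here so the counting helper works without Matplotlib installed.
    import matplotlib.pyplot as plt

    ax = counts.sort_values().plot(kind="barh")
    ax.set_xlabel("Respondents")
    ax.set_title(title)
    plt.tight_layout()
    plt.show()  # Inside a Power BI Python visual, the shown figure becomes the visual.


# Outside Power BI there is no injected `dataset`, so build a stand-in.
dataset = pd.DataFrame(
    {"job_title": ["Data Analyst", "Data Scientist", "Data Analyst", "Data Engineer"]}
)
counts = count_by_category(dataset, "job_title")
```

Calling `plot_counts(counts, "Respondents by job title")` then draws the chart. Inside Power BI the stand-in `dataset` assignment would be omitted, since Power BI provides it.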

Project Demonstration

For a detailed walkthrough and explanation of the project, please refer to the accompanying YouTube video: Project Demo Video

Instructions for Running the Project

  1. Ensure the necessary tools are installed: Power BI and a Python environment with the required libraries.

  2. Open the Power BI file (pfs.pbix) to explore the interactive visualizations.
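
One way to confirm the Python side is ready is a quick check like the sketch below. It assumes the required libraries are the three named in this README (Pandas, Matplotlib, Seaborn); the `REQUIRED` list and the `missing_packages` helper are illustrative names, so adjust them to your setup.

```python
import importlib.util

# Libraries named in this README; adjust the list to your environment.
REQUIRED = ["pandas", "matplotlib", "seaborn"]


def missing_packages(names: list[str]) -> list[str]:
    """Return the names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any package reported as missing can be installed with your usual package manager, for example `pip install` followed by the package name.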

Acknowledgments

Special thanks to Mr. Riadh Ghlala for guidance and support throughout the duration of the project.

Feel free to reach out for any further clarifications or inquiries.

Happy exploring and analyzing!