
This repository features a Big Data project analyzing a survey of computer science and data science professionals. It includes data cleaning, Power BI dashboards, and Python visualizations to uncover key insights.


aziz-zina/PFS_BigData


End-of-Semester Big Data Project

Overview

This README gives an overview of the end-of-semester Big Data project conducted by Aziz Zina. The project applies big data techniques to analyze and visualize a real survey dataset about professionals working in computer science and data science.


Project Steps

  1. Data Acquisition: The project began with an Excel file containing survey data. The survey asked a range of questions about professionals in the computer science and data science fields. This dataset is the foundation for all subsequent analysis and visualization.

  2. Data Cleaning: To keep the analysis accurate and reliable, the data was cleaned before use. This step involved handling missing values, addressing outliers, and standardizing data formats. All later stages of the project use the cleaned dataset.

  3. Data Analysis and Visualization

    • 3.1 Power BI Visualizations: Power BI was used to build interactive visualizations from the cleaned dataset. Charts, graphs, and dashboards show the key trends, patterns, and relationships in the data, and give a dynamic interface for exploring the survey results.

    • 3.2 Python Visualizations: In addition to Power BI, Python was used for further data analysis and visualization. The Pandas, Matplotlib, and Seaborn libraries were used to create additional plots and charts that complement the Power BI dashboards.

  4. Results and Findings: The analysis of the survey data produced a set of findings about the surveyed population. These findings are presented through a combination of Power BI dashboards and Python-generated visualizations.
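The data cleaning step described above can be illustrated with a minimal Pandas sketch. It is not the project's actual cleaning code: the column names `Job Title` and `Salary`, the sample rows, the choice to drop rows with missing values, and the 1.5 × IQR outlier rule are all assumptions made for illustration.

```python
import pandas as pd


def clean_survey(df: pd.DataFrame, numeric_col: str, text_col: str) -> pd.DataFrame:
    """Return a cleaned copy of a survey table (illustrative steps only)."""
    out = df.copy()

    # Standardize formats: trim whitespace and unify capitalization of text answers.
    out[text_col] = out[text_col].astype("string").str.strip().str.title()

    # Handle missing values: coerce unparseable numbers to NaN, then drop
    # the rows where the numeric answer is missing.
    out[numeric_col] = pd.to_numeric(out[numeric_col], errors="coerce")
    out = out.dropna(subset=[numeric_col])

    # Address outliers: keep rows within 1.5 * IQR of the quartiles
    # (an assumed rule, not necessarily the one used in the project).
    q1, q3 = out[numeric_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out[numeric_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range].reset_index(drop=True)


# Hypothetical sample rows standing in for the real survey data.
sample = pd.DataFrame({
    "Job Title": [" data analyst", "Data Analyst ", "data engineer",
                  "DATA SCIENTIST", "data analyst", "data engineer"],
    "Salary": [50, 55, "n/a", 60, 58, 900],
})
cleaned = clean_survey(sample, numeric_col="Salary", text_col="Job Title")
print(cleaned)
```

To try the same function on the real data, the Excel file could be loaded with `pd.read_excel("Data Professionals Survey.xlsx")`, with the column arguments adjusted to the actual survey headers.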

Project Files

The project repository includes the following files:

  • Data Professionals Survey.xlsx: The original Excel file containing the raw survey data.

  • pfs.pbix: Power BI file containing the interactive visualizations and the Python code for the additional visualizations.

  • Data Visualization.odp: Presentation file (OpenDocument format) containing a short presentation of the project.

  • README.md: This documentation file.

Project Demonstration

For a detailed walkthrough and explanation of the project, please refer to the accompanying YouTube video: Project Demo Video

Instructions for Running the Project

Ensure you have the necessary tools installed: Power BI and a Python environment with the required libraries (Pandas, Matplotlib, and Seaborn). Then open the Power BI file (pfs.pbix) to explore the interactive visualizations.
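As a quick check of the Python environment, the sketch below reports which libraries are missing. Pandas, Matplotlib, and Seaborn are the libraries named in this README; `openpyxl` is an assumption, added because Pandas relies on it to read `.xlsx` files. The helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec

# pandas, matplotlib and seaborn are named in this README; openpyxl is an
# assumption (pandas uses it to read .xlsx files).
REQUIRED = ["pandas", "matplotlib", "seaborn", "openpyxl"]


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any package reported as missing can usually be installed with `pip install` followed by the package name.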

Acknowledgments

Special thanks to Mr. Riadh Ghlala for guidance and support throughout the duration of the project.

Feel free to reach out for any further clarifications or inquiries.

Happy exploring and analyzing!
