Jobs in Data

Summary

The goal of this report is to help students and data professionals gain insight into the data job market so they can make informed job decisions based on location, salary, and job satisfaction. To accomplish this, we found data web-scraped from Glassdoor and cleaned it into a dataset that is easier to work with. We then took a close look at the data using visualization tools such as Matplotlib, Seaborn, and Plotly.

How to read

View Data_Jobs_Cleanup.ipynb for the data-cleaning code

View graphing-project-1.ipynb for the visualization code

DATA

All data was scraped from Glassdoor and published on Kaggle:

Data Analyst: https://www.kaggle.com/andrewmvd/data-analyst-jobs

Data Scientist: https://www.kaggle.com/andrewmvd/data-scientist-jobs

Data Engineer: https://www.kaggle.com/andrewmvd/data-engineer-jobs

Part 1 - Cleaning

Challenges

Combine all three datasets
Use .split to remove unwanted characters ('K', '()', '\n', '$', '-')
Remove outlier values
Convert datatypes
Parse the job title column to extract the job titles needed for analysis
Remove NaNs
Use keywords to consolidate all jobs that had to do with data science/data engineer/data analyst into one universal name
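Two of the steps above, stripping unwanted characters from salary text and consolidating job titles by keyword, can be sketched roughly as follows. This is an illustration only, not the code from Data_Jobs_Cleanup.ipynb: the raw salary format ("$37K-$66K (Glassdoor est.)"), the function names, and the keyword lists are all assumptions made for the example.

```python
from typing import Optional, Tuple

# Hypothetical keyword table: each canonical title is matched by any of its keywords.
# The keywords and their order are illustrative assumptions, not the project's list.
TITLE_KEYWORDS = [
    ("Data Scientist", ("scientist", "data science")),
    ("Data Engineer", ("engineer",)),
    ("Data Analyst", ("analyst", "analytics")),
]


def parse_salary_range(raw: str) -> Optional[Tuple[int, int]]:
    """Return (low, high) in dollars, or None if the text cannot be parsed.

    Assumes input shaped like "$37K-$66K (Glassdoor est.)".
    """
    # Use split to drop the parenthesised note and anything after a newline.
    text = raw.split("(")[0].split("\n")[0]
    # Strip the remaining unwanted characters.
    text = text.replace("$", "").replace("K", "").strip()
    parts = text.split("-")
    if len(parts) != 2:
        return None
    try:
        low, high = (int(p.strip()) * 1000 for p in parts)
    except ValueError:
        # Non-numeric or empty pieces (for example a placeholder like "-1").
        return None
    return low, high


def normalize_title(title: str) -> Optional[str]:
    """Map a free-form job title to one universal name, or None if no keyword matches."""
    lowered = title.lower()
    for canonical, keywords in TITLE_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return canonical
    return None


if __name__ == "__main__":
    print(parse_salary_range("$37K-$66K (Glassdoor est.)"))
    print(normalize_title("Senior Data Analyst"))
```

Rows where either function returns None are candidates for removal in the "Remove NaNs" step. Because the keyword table is checked in order, a title that matches more than one group is assigned to the first match.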

Part 2 - Analysis

After cleaning our data, we wanted to turn it into representative stories that give students clear answers to their most pressing questions about the data job market.

Part 3 - Presentation

Here is our presentation link: https://docs.google.com/presentation/d/1ftooYn1mEywfzkT4kyeiWyUoOGdNpYsHznh5B2TwwXY/edit?usp=sharing
