Deep dive into the differences between common professions in the field of data.

rstrong341/Jobs-in-Data

Jobs in Data

Summary

The goal of this report is to help students and data professionals gain insight into the data job market so they can make informed job decisions based on location, salary, and job satisfaction. To accomplish this, we took data scraped from Glassdoor and cleaned it into a dataset that is easier to work with. We then examined the data closely using visualization tools such as Matplotlib, Seaborn, and Plotly.

How to read

Open Data_Jobs_Cleanup.ipynb for the data-cleaning code

Open graphing-project-1.ipynb for the visualization code

Data

All data was scraped from Glassdoor and published to Kaggle

Data Analyst: https://www.kaggle.com/andrewmvd/data-analyst-jobs

Data Scientist: https://www.kaggle.com/andrewmvd/data-scientist-jobs

Data Engineer: https://www.kaggle.com/andrewmvd/data-engineer-jobs

Part 1 - Cleaning

Challenges

  1. Combine all three datasets
  2. Use .split to remove unwanted characters ('K', '()', '\n', '$', '-')
  3. Remove outlier values
  4. Convert datatypes
  5. Parse the job title column to extract the job titles needed for analysis
  6. Remove NaNs
  7. Use keywords to consolidate all titles related to data science, data engineering, or data analysis under a single standardized name for each role
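
The salary-parsing and title-consolidation steps above can be sketched roughly as follows. This is a minimal illustration, not the code from Data_Jobs_Cleanup.ipynb: it assumes salary strings look like "$37K-$66K (Glassdoor est.)", that missing salaries appear as "-1", and that the columns are named "Salary Estimate" and "Job Title". The function names and keyword lists are hypothetical, and outlier removal is not covered.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical keyword map; the first matching entry wins, so order matters.
TITLE_KEYWORDS = [
    ("Data Scientist", ("scientist", "data science")),
    ("Data Engineer", ("engineer",)),
    ("Data Analyst", ("analyst", "analytics")),
]


def parse_salary_range(raw: str) -> Optional[Tuple[int, int]]:
    """Convert an assumed '$37K-$66K (Glassdoor est.)' string to (37000, 66000)."""
    # Keep only the text before the parenthesised note and before any newline.
    head = raw.split("(")[0].split("\n")[0]
    # Split the range on '-' and strip the '$' and 'K' characters from each bound.
    parts = [p.strip().replace("$", "").replace("K", "") for p in head.split("-")]
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        return None  # placeholder or malformed value; treated as missing
    low, high = (int(p) * 1000 for p in parts)
    return low, high


def normalize_title(title: str) -> Optional[str]:
    """Map a raw job title onto one standard role name using keywords."""
    lowered = title.lower()
    for canonical, keywords in TITLE_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return canonical
    return None  # title is outside the three roles of interest


def clean_rows(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Drop rows with missing salary or unrelated title; convert salaries to int."""
    cleaned = []
    for row in rows:
        salary = parse_salary_range(row["Salary Estimate"])
        title = normalize_title(row["Job Title"])
        if salary is None or title is None:
            continue
        cleaned.append(
            {"title": title, "min_salary": salary[0], "max_salary": salary[1]}
        )
    return cleaned


if __name__ == "__main__":
    sample = [
        {"Job Title": "Senior Data Analyst", "Salary Estimate": "$37K-$66K (Glassdoor est.)"},
        {"Job Title": "Sr. Data Engineer (Remote)", "Salary Estimate": "-1"},
        {"Job Title": "Office Manager", "Salary Estimate": "$40K-$50K (Glassdoor est.)"},
    ]
    print(clean_rows(sample))
```

If the data is held in a pandas DataFrame, the same functions could be applied column-wise (for example with `Series.apply`), but the plain-Python version keeps the logic easy to inspect.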

Part 2 - Analysis

After cleaning our data, we wanted to turn it into representative stories that give students clear answers to their most pressing questions about the data job market.

Part 3 - Presentation

Here is our presentation link: https://docs.google.com/presentation/d/1ftooYn1mEywfzkT4kyeiWyUoOGdNpYsHznh5B2TwwXY/edit?usp=sharing
