russell94paul/salary_estimator_project

salary_estimator_project (In-progress)

Resources

https://github.com/arapfaik/scraping-glassdoor-selenium
https://shandou.medium.com/export-and-create-conda-environment-with-yml-5de619fe5a2

Project Goals:

Goal: Create a tool that estimates data engineer salaries, to help data engineers negotiate their income when they receive a job offer.

Workflow:
Data Collection: Scrape over 1000 job descriptions from Glassdoor using Python and Selenium. (Completed)
Data Cleaning: Engineer features from the text of each job description to quantify the value companies place on Python, Excel, AWS, and Spark. (Completed)
Exploratory Data Analysis: Use a Jupyter notebook with graphing libraries such as matplotlib and seaborn to explore the main characteristics of the data; see the EDA Notebook. (Completed)
Model Selection: Compare and evaluate ML models and choose the one with the best performance. (Completed)
Productionize: Build a client-facing API using Flask. (Completed)
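
To illustrate the Data Cleaning step, here is a minimal sketch of how skill features might be derived from job description text. It is not the project's actual code: the `SKILLS` tuple and the `skill_flags` function are hypothetical names, and whole-word, case-insensitive matching is an assumption about how the skills could be detected.

```python
import re

# Hypothetical keyword list, taken from the skills named in the workflow above.
SKILLS = ("python", "excel", "aws", "spark")


def skill_flags(description: str) -> dict[str, int]:
    """Return a 0/1 flag per skill, matching whole words case-insensitively."""
    text = description.lower()
    return {
        # \b keeps "excel" from matching inside words such as "excellent".
        skill: int(re.search(rf"\b{re.escape(skill)}\b", text) is not None)
        for skill in SKILLS
    }


example = "We need a data engineer with Python and AWS experience; Spark is a plus."
print(skill_flags(example))
```

Each flag can then serve as a binary column in the modelling dataset, so a model can estimate how much each skill is associated with salary.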

Additional Goals:

Cloud Deployment: Migrate this project to Google Cloud Platform.
Scheduling: Automate the whole process using cron or Airflow and deploy this project on a cloud platform (AWS or GCP).
Data Visualization: Load the data into BigQuery in order to create a dashboard using Google Data Studio.
