Skip to content

Build employee Database with PostgreSQL by using data modeling, engineering and analysis

Notifications You must be signed in to change notification settings

susiexia/Pewlett-Hackard-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Pewlett-Hackard-Analysis

Build Employee Database with PostgreSQL and perform SQL queries to explore and analysis data by applying skills of Data Modeling, Data Engineer and Data Analysis.

Challenge

Object

Create a list of candidates for the mentorship program.

  1. the ERD demonstrates relationships between original 6 tables: EmployeeDB.png

  2. Determining the number of individuals retiring:

Queries


SELECT emp_no, birth_date, first_name, last_name, genger AS gender, hire_date
INTO retirement_info
FROM employees
WHERE (birth_date BETWEEN '1952-01-01' AND '1955-12-31')
AND (hire_date BETWEEN '1985-01-01' AND '1988-12-31');

In conclusion, There are 41,380 records of individuals ready to retirement

  1. Determining the number of individuals being hired:

-Current Retirement Eligibility:

Queries

SELECT r.emp_no, r.first_name, r.last_name,d.dep_no, d.to_date
INTO current_emp
FROM retirement_info AS r
LEFT JOIN dept_emp AS d
ON r.emp_no = d.emp_no
WHERE d.to_date = '9999-01-01';

In conclusion, there are 33,118 records of Current Retirement Eligibility

Queries

SELECT ce.emp_no AS Employee_number,ce.first_name, ce.last_name, 
    t.title AS Title, t.from_date, s.salary AS Salary
INTO challenge_emp_info
FROM current_emp AS ce
INNER JOIN titles AS t ON ce.emp_no = t.emp_no
INNER JOIN salaries AS s ON ce.emp_no = s.emp_no;

  1. Each employee ONLY display the most recent title:
  • By using partition by and row_number() function.

Queries

SELECT employee_number, first_name, last_name, title, from_date, salary
INTO current_title_info
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY (cei.employee_number, cei.first_name, cei.last_name)
                ORDER BY cei.from_date DESC) AS emp_row_number
      FROM challenge_emp_info AS cei) AS unique_employee	  
WHERE emp_row_number =1;
  1. The Frequency count of employee titles: challenge_title_info.csv

Queries

SELECT *, count(ct.Employee_number) 
		OVER (PARTITION BY ct.title ORDER BY ct.from_date DESC) AS emp_count
INTO challenge_title_info
FROM current_title_info AS ct;

a summary count of employees for each title:challenge_title_count_info.csv

Queries

SELECT COUNT(employee_number), title
FROM challenge_title_info
GROUP BY title;

Conclusion In the 33118 records of Current Retirement Eligibility, there are 251 Assistant Engineers, 2711 engineers, two managers 2022 staffs,12872 Senior Staffs and 1609 Technique Leaders

  1. Determining the number of individuals available for mentorship role: SQL for eligible for mentor program, challenge_mentor_info.csv

Queries

SELECT em.emp_no,em.first_name, em.last_name, 
    t.title AS Title, t.from_date, t.to_date
INTO challenge_mentor_info
FROM Employees AS em
INNER JOIN titles AS t ON em.emp_no = t.emp_no
WHERE (em.birth_date BETWEEN '1965-01-01' AND '1965-12-31')
AND (t.to_date = '9999-01-01');

In conclusion, there are 1549 active employees eligible for mentor plan.

Limitation and Suggestion

  1. This project assumed retirement years between 1952 and 1955. We need to narrow down period into 3 single year for more accurate estimate and better analysis of potential job opening.

  2. More detail information and analysis are needed for potential mentor table, to compare with the title table of current ready-to-retirement and get a better estimate of outside hiring request.

About

Build employee Database with PostgreSQL by using data modeling, engineering and analysis

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published