Cleaning and analyzing NYC high school SAT data using Python (Pandas, Matplotlib.pyplot)

jenfad/Cleaning_Analyzing_NYC_SAT

Cleaning_Analyzing_NYC_SAT

The following description was provided by Dataquest:

Over the last three missions, we explored relationships between SAT scores and demographic factors in New York City public schools. For a brief bit of background, the SAT, or Scholastic Aptitude Test, is a test that high school seniors in the U.S. take every year. The SAT has three sections, each of which is worth a maximum of 800 points. Colleges use the SAT to determine which students to admit. High average SAT scores are usually indicative of a good school.

New York City has published data on student SAT scores by high school, along with additional demographic data sets. Over the last three missions, we combined the following data sets into a single, clean pandas dataframe:

  • SAT scores by school - SAT scores for each high school in New York City
  • School attendance - Attendance information for each school in New York City
  • Class size - Information on class size for each school
  • AP test results - Advanced Placement (AP) exam results for each high school (passing an optional AP exam in a particular subject can earn a student college credit in that subject)
  • Graduation outcomes - The percentage of students who graduated, and other outcome information
  • Demographics - Demographic information for each school
  • School survey - Surveys of parents, teachers, and students at each school
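Combining per-school tables like the ones listed above can be sketched with repeated pandas merges. This is a minimal illustration, not the project's actual code: the key column `school_id`, the other column names, the helper `combine`, and every value in the toy tables are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical toy tables. The column names ("school_id", "sat_score",
# "avg_class_size") and all values are made up for illustration only.
sat = pd.DataFrame({
    "school_id": ["A1", "B2", "C3"],
    "sat_score": [1200, 1450, 1100],
})
class_size = pd.DataFrame({
    "school_id": ["A1", "B2", "D4"],
    "avg_class_size": [24.0, 28.5, 22.0],
})


def combine(datasets, key="school_id"):
    """Left-join every table onto the first one using a shared key column."""
    combined = datasets[0]
    for df in datasets[1:]:
        # how="left" keeps every school from the first table; schools
        # missing from a later table get NaN in that table's columns.
        combined = combined.merge(df, on=key, how="left")
    return combined


combined = combine([sat, class_size])
print(combined)
```

A left join is only one reasonable choice here: it keeps every school that has SAT data and leaves gaps to be filled or dropped during cleaning, whereas an inner join would silently discard schools missing from any one table.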

New York City has a significant immigrant population and is very diverse, so comparing demographic factors such as race, income, and gender with SAT scores is a good way to determine whether the SAT is a fair test. For example, if certain racial groups consistently perform better on the SAT, we would have some evidence that the SAT is unfair.
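One simple way to start comparing demographic factors with SAT scores is to compute the correlation between each factor and the score. The sketch below assumes a combined dataframe with one row per school; the column names (`sat_score`, `pct_free_lunch`, `pct_female`), the helper `sat_correlations`, and all values are hypothetical and chosen only to make the example runnable.

```python
import pandas as pd

# Made-up values with hypothetical column names, one row per school.
schools = pd.DataFrame({
    "sat_score": [1100, 1200, 1300, 1400, 1500],
    "pct_free_lunch": [80.0, 70.0, 55.0, 40.0, 20.0],
    "pct_female": [50.0, 48.0, 52.0, 49.0, 51.0],
})


def sat_correlations(df, factors, target="sat_score"):
    """Return the Pearson correlation of each factor column with the target."""
    return df[factors].corrwith(df[target]).sort_values()


corrs = sat_correlations(schools, ["pct_free_lunch", "pct_female"])
print(corrs)
```

A strong correlation in real data would be a starting point for further investigation rather than proof of unfairness, since school-level correlations can reflect many confounding factors.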

In the last mission, we began performing some analysis. We'll extend that analysis in this mission.
