GWC-DCMB/CapstoneProject

CapstoneProject

Step-by-step goals for completing the Girls Who Code at UM DCMB Capstone Project.

  • At the end of each meeting, upload the latest .ipynb to a Google Drive folder shared with your project group. Keep track of what you accomplished and what you still need to do, either in comments in the notebook, in a Google Doc, or in a group message on Slack.
  • You can communicate with your group & facilitator on Slack!

Data Analysis Step by Step

Get organized!

  • One partner should make a Google Drive folder and share it with the group mentor and partner(s)
  • Make a new Jupyter Notebook (from an existing notebook, File > New notebook) and move it into the new Drive folder

Read in the data

  • The code below can be used to read in one of the datasets already on GitHub:
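import pandas as pd
# Base URL of the datasets folder in the GitHub repository
url = "https://raw.githubusercontent.com/GWC-DCMB/CapstoneProject/master/datasets/"
# Path to one dataset inside that folder
filepath = "AP_exams/ap_exams_MI_2018.csv"
# Read the CSV file into a data frame and preview the first five rows
df = pd.read_csv(url + filepath)
df.head()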
  • Start familiarizing yourself with your data. What are the data types in each column of the data set (e.g. float, string)?
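
One possible way to start exploring column types is sketched below. To keep the example self-contained it uses a small made-up data frame; the column names (`school`, `num_exams`, `pass_rate`) and values are hypothetical and are not taken from the actual datasets. With your own data you would run the same commands on the `df` you read in.

```python
import pandas as pd

# Hypothetical stand-in data; these column names and values are invented
# for illustration and do not come from the real datasets.
df = pd.DataFrame({
    "school": ["A", "B", "C"],
    "num_exams": [120, 85, 40],
    "pass_rate": [0.71, 0.64, None],
})

# Data type of each column (e.g. integer, float, text)
print(df.dtypes)

# Number of missing values in each column
print(df.isna().sum())

# Summary statistics (count, mean, min, max, ...) for the numeric columns
print(df.describe())
```

Looking at the types and the missing-value counts early can help you decide which columns will need cleaning later.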

Hypothesis generation

  • Refine the question or hypothesis you want to explore in your project
  • Make a plan for what steps you need to take to answer the question
  • Sketch out potential plots including x and y axes (do this on paper with your group)

Data cleaning

  • Start cleaning data programmatically. Add commands to your .ipynb.
  • You should be using pandas; check out the documentation
  • To help with data frame manipulation in pandas check out this Jupyter Notebook
  • What variables do you need? What outliers should you remove? What variable has too much missing data to be reliable?
  • Remember our example project Jupyter Notebook
  • A list of all the functions/methods/packages you've learned can be found here
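
The questions above about variables, missing data, and outliers could be approached along the lines of the sketch below. It is only one option, not the required method: the data frame is made up, the column names are hypothetical, and the cutoffs (dropping columns that are more than half missing, and the 1.5 × IQR rule for outliers) are common conventions assumed here for illustration. Your group should choose cutoffs that make sense for your own data.

```python
import pandas as pd

# Hypothetical data; column names and values are invented for illustration.
df = pd.DataFrame({
    "school": ["A", "B", "C", "D", "E", "F"],
    "num_exams": [120, 85, 40, 95, 5000, 110],
    "pass_rate": [0.71, 0.64, None, 0.58, 0.66, 0.69],
    "notes": [None, None, "check", None, None, None],
})

# 1. Fraction of missing values in each column
missing_frac = df.isna().mean()

# 2. Keep only columns that are at most half missing (assumed cutoff)
keep_cols = missing_frac[missing_frac <= 0.5].index
clean = df[keep_cols]

# 3. Drop rows that are missing the variable we want to analyze
clean = clean.dropna(subset=["pass_rate"])

# 4. Remove outliers in num_exams using the 1.5 * IQR rule (assumed convention)
q1 = clean["num_exams"].quantile(0.25)
q3 = clean["num_exams"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
clean = clean[clean["num_exams"].between(lower, upper)]

print(clean)
```

Before removing any outlier, it is worth checking whether the value is a real observation or a data entry mistake, and writing down your reasoning in the notebook.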

Data visualization

Science communication!
