Repository for the Data Science learning track to host assignments.
- Readings (The Unix Shell)
-
Readings
-
Sign up for HackerRank and complete the following HackerRank challenges.
-
Setup & configure git
- Create a GitHub account.
- Create an SSH key.
- Link your SSH key to your GitHub account
- Verify your SSH connection.
- Readings (Plotting and Programming in Python)
-
Readings
- Loops
- Functions
- Modulo documentation
- Supplemental (not required): Modulo video. You'll need to use the modulo operator for one of your homework assignments. If the documentation isn't enough to help you solve the challenge, see the video as well.
- List comprehensions
- Supplemental (not required): more on list comprehensions
- Lambda functions & map, filter, reduce
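For a quick feel of the modulo operator alongside lambda functions with map, filter, and reduce, here is a small sketch. The numbers and variable names are made up for illustration and are not taken from the homework.

```python
from functools import reduce

numbers = list(range(1, 11))  # 1 through 10, illustrative data only

# The modulo operator (%) returns the remainder of a division.
remainder = 17 % 5

# filter keeps the items for which the lambda returns True (here: even numbers).
evens = list(filter(lambda n: n % 2 == 0, numbers))

# map applies the lambda to every item.
squares = list(map(lambda n: n * n, evens))

# reduce folds the sequence into a single value, starting from 0.
total = reduce(lambda acc, n: acc + n, squares, 0)

print(remainder)  # 2
print(evens)      # [2, 4, 6, 8, 10]
print(squares)    # [4, 16, 36, 64, 100]
print(total)      # 220
```

A common use of modulo, as in the filter step, is testing divisibility: `n % 2 == 0` is True exactly when `n` is even.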
-
Cheat Sheets
-
Hackerrank Python
- Loops
- List comprehensions
- The Problem statement and Tutorial should be sufficient to guide you through this exercise, but the example code using list comprehensions on the Problem page is difficult to read. Here it is line by line (and in Python3):
Code using list comprehensions:
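# Read the three integers, one per line (input() replaces Python 2's raw_input()).
x = int(input())
y = int(input())
n = int(input())
# Build every [i, j] pair whose sum is not n.
print([[i, j] for i in range(x + 1) for j in range(y + 1) if (i + j) != n])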
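The comprehension above packs two loops and a filter into one expression. As a rough sketch, the same result can be built with explicit nested loops. The function name `pairs_not_summing_to` and the sample values are hypothetical and used only for illustration.

```python
def pairs_not_summing_to(x, y, n):
    """Return every [i, j] with 0 <= i <= x and 0 <= j <= y where i + j != n."""
    result = []
    for i in range(x + 1):        # outer loop: first `for` in the comprehension
        for j in range(y + 1):    # inner loop: second `for` in the comprehension
            if i + j != n:        # filter: the trailing `if` in the comprehension
                result.append([i, j])
    return result


# Sample values stand in for the three values read from input.
x, y, n = 1, 1, 2
loop_version = pairs_not_summing_to(x, y, n)
comprehension_version = [[i, j] for i in range(x + 1) for j in range(y + 1) if (i + j) != n]

print(loop_version)                           # [[0, 0], [0, 1], [1, 0]]
print(loop_version == comprehension_version)  # True
```

Reading the comprehension left to right gives the loops from outermost to innermost, with the condition applied last.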
- Your first Python function
- Bonus: Try testing your code against custom input (your name!).
- More functions
- Bonus (not required): Nested lists
- Finish working through the Intro to Python 2 notebook
-
Readings
- Type conversion
- Numpy overview
- Note: You won't need to install numpy (it should be included with your Anaconda installation).
-
Jupyter notebooks
- Tutorial: Computing with NumPy arrays. You can download the data and execute the code in this notebook yourself, or just read through the executed version. Make sure you understand all the numpy methods.
-
Cheat Sheets
-
Hackerrank
- Finish working through the Numerical computing notebook and the associated readings (linked in the notebook).
-
Readings
- Pandas DataFrames
- Time Series tutorial with Pandas
- Matplotlib tutorial
- Notebooks to read through:
- Pulling data & assembling Pandas DataFrame - Chipotle dataset
- Exploratory analysis with matplotlib - Retail dataset
-
Coding challenges:
- Notebooks with fill-in-the-blank code blocks - will be posted by 12 p.m. Friday, Feb 15.
- Reading in & assembling Pandas Data Frames - Occupations dataset
- Executed but no solutions.
- Plotting practice - Titanic dataset
- Executed but no solutions.
- Hackerrank
- Work through ONE notebook:
- More plotting practice - tips notebook
-
Readings
-
Cheat Sheets
-
Hackerrank
-
Jupyter Notebooks
- Read through and run the commands for the following tutorials:
-
Reading
- ThinkStats
- Chapters 3-5. Don't worry about the thinkplot code or the exercises at the end of each chapter. Focus on the content! The goal is to be familiar with different types of distributions.
-
Watch
- Videos on normal distributions.
- Work through practice exercises.
- Understand the review page.
-
Notebooks
- Download and work through the following 3 notebooks. Make sure you understand the concepts as well as the Python code. You should complete the exercises throughout.
- Warm-up.
- Basic Metrics
- Distributions
- Directory with datasets for the above 3 notebooks. You should download these data to the same directory where the notebooks are. You will need to provide the correct path to your data in each Jupyter Notebook.
-
Readings
- Groupby documentation - this entire page is really good, but read at least the first two sections ("Splitting an object into groups" and "Aggregation," up to the "Transformation" section).
- Relevant to class Wednesday: attributes for GroupBy objects (e.g. how we found you can call head on a GroupBy object).
- For in-class assignment: Iterating through groups.
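A minimal sketch of the GroupBy behaviours mentioned above: aggregation, calling head on a GroupBy object, and iterating through groups. The DataFrame, its column names, and its values are made up for illustration and are not from the class notebook.

```python
import pandas as pd

# Hypothetical toy data, not the class dataset.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "score": [1, 2, 3, 4, 5],
})

grouped = df.groupby("team")

# Aggregation: one value per group.
means = grouped["score"].mean()

# head(n) on a GroupBy object returns the first n rows of each group,
# keeping the original row index.
first_rows = grouped.head(1)

# Iterating through groups yields (group key, sub-DataFrame) pairs.
sizes = {name: len(group) for name, group in grouped}

print(means)
print(first_rows)
print(sizes)
```

Iteration is handy for inspecting what each group contains before deciding which aggregation to apply.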
-
Complete the blank code blocks in the groupby notebook.
-
Readings
- ThinkStats
- Chapters 6-8. Don't worry about the thinkplot code or the exercises at the end of each chapter. Focus on the content!
-
Khan Academy
- Probabilities
- Take this quiz. If you have trouble with the answers, please watch the videos for the basic probability section.
- Videos, articles & quizzes for Experimental probability.
- Videos, articles & quizzes for Randomness, probability & simulation.
- Hypothesis testing
- Videos & quizzes for Significance tests.
- Scatterplots & correlations
- Videos, articles & quizzes for Introduction to scatterplots.
- Videos, articles & quizzes for Correlation
-
Notebooks
- Readings & video
-
- Enroll in Coursera's Machine Learning cohort that starts on March 18th: Coursera Machine Learning. You will be able to access lecture material today.
- Finish all Readings, Videos, and Quizzes for the following sections in Week 1: Introduction:
-
Google Machine Learning Crash Course
- Enroll in Google's Machine Learning Crash Course. The material is available anytime: Google ML Crash Course
- Finish all Videos, Readings, Key Terminology, and Check Your Understanding for the following sections:
-
Introduction to Machine Learning with Python
- Clone or download (click the green Clone or Download button) the entire Introduction to Machine Learning with Python GitHub Repo. Ensure that all appropriate packages are installed on your computer before class Wednesday, March 20th. We will be following through these notebooks as our In-Class Assignments.
-
- Solve the following HackerRank Linear Algebra Problems:
- If you have trouble with any of the problems above, check out some of the review material on Khan Academy:
- Introduction to Machine Learning with Python
- Work through Introduction to ML Notebook
- Bootstrapping Followup
- Work through Bootstrapping Notebook
-
Google Machine Learning Crash Course
- If you didn't complete all items in Week 9 (including Playground Exercises) in Introduction to ML, Framing, Descending into ML, or Reducing Loss, go back and finish all items.
- Finish all Videos, Readings, Key Terminology, Playground Exercises, Programming Assignments, and Check Your Understanding for the following sections.
- First Steps with TF: There are 3 Programming Assignments with this section, which run on Google's Colaboratory platform. These are very similar to Jupyter notebooks, but will not run locally.
- Quick Introduction to pandas
- First Steps with TensorFlow
- Synthetic Features and Outliers
- Generalization
- Training and Test Sets
- Introduction to Machine Learning with Python
- Work through the Supervised Learning Notebook only through the section "Linear regression aka ordinary least squares". This is a very long notebook, and we will be working through the ML algorithms week by week.
-
- Finish all Readings, Videos, and Quizzes for the following sections in Week 3:
-
Google Machine Learning Crash Course
- Finish all Videos, Readings, Key Terminology, Playground Exercises, and Check Your Understanding for the following sections:
-
- Work through Bias, Variance, Cross-Validation Notebook
- Harvard CS109
- Work through Sklearn, Regression, PCA Notebook
-
- Finish all Readings, Videos, and Quizzes for the following sections in Week 4:
-
Google Machine Learning Crash Course
- Finish all Videos, Readings, Key Terminology, Playground Exercises, and Check Your Understanding for the following sections:
- Work through Dense NN MNIST Notebook
-
- Finish all Readings, Videos, and Quizzes for the following sections in Week 5:
-
Google Machine Learning Crash Course
- Finish all Videos, Readings, Key Terminology, Playground Exercises, and Check Your Understanding for the following sections:
- No In-Class Assignment due. Make sure to be caught up.
- For those of you interested in learning more in-depth material about neural networks, we highly recommend completing the Deep Learning Specialization. This is a 5-course series from Coursera that covers implementing a set of state-of-the-art neural networks. This is well beyond the scope of CoderGirl - Data Science, but we wanted to keep this here as a reference.
-
Google Machine Learning Crash Course
- Finish all Videos, Readings, Key Terminology, Playground Exercises, and Check Your Understanding for the following sections:
-
Introduction to Convolutional Neural Networks
- Work through Fashion CNN Notebook
- Use Keras Cheat Sheet if needed
- Coursera Machine Learning
- Finish all Readings, Videos, and Quizzes for the following sections in Week 6:
- Work through notebook on Parameter Selection, Validation & Testing
- Read the following post on Decision Trees
- Read the following post on Decision Trees and Random Forest
- Read the following post on Random Forest, AdaBoost, and Gradient Boosted Trees
- Read the following post on AdaBoost
- Read the following post on Gradient Boosting
- Coursera Machine Learning
- Finish all Readings, Videos, and Quizzes for the following sections in Week 8:
- Google Machine Learning Crash Course
- Finish all Videos, Readings, Key Terminology, Playground Exercises, and Check Your Understanding for the following sections:
- Overview
- Clustering Workflow
- Create a Manual Similarity Measure section is optional
- Summary
- k-means notebook up through Example 1. Stop at Example 2.
- Coursera Machine Learning
- Finish all Readings, Videos, and Quizzes for the following sections in Week 8:
- Read survey on Dimensionality Reduction Techniques
- Work through PCA step-by-step
- Git lesson on Software Carpentry.
- Work through the first 10 lessons.
- Additional resources (not homework):
- Read the following material on Scientific Presentation
- Perform Exploratory Data Analysis (EDA) on the Heart Disease Kaggle Project
- Post the link to your GitHub repo for Mini-Project I: EDA
- Post the link to your GitHub repo for Mini-Project II: Modeling
- Post the link to your GitHub repo for Mini-Project III: Presentation