This repository contains supporting material for the Data Mining course taught in the Fall of 2017 at the Seattle campus. Most of the material consists of Jupyter Notebooks that will be covered in class and that illustrate Python implementations of some of the concepts we will discuss. Another set of notebooks will serve as reference for specific homework assignments, and those will be explicitly marked as such.
Any additional information you may need regarding the material distributed here will be provided in class and can also be found on our class website, which you can access here.
Everaldo Aguiar will be your course instructor. If you have any questions as you go over the notebook files available here, feel free to reach out to him directly (refer to the syllabus for contact information).
A large portion of the code snippets provided here were adapted from material that is freely available online. You are strongly encouraged to review the sources listed below (and any other relevant ones you may find) if you wish to improve your Python skills.
- scikit-learn examples: http://scikit-learn.org/stable/auto_examples/
- Pandas tutorials: http://pandas.pydata.org/pandas-docs/version/0.18.1/tutorials.html
- Data Science notebooks (you can find several other great compiled lists of notebooks like this): https://github.com/donnemartin/data-science-ipython-notebooks
- Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney: https://github.com/wesm/pydata-book
- Materials and IPython notebooks for "Programming Collective Intelligence" by Toby Segaran: https://github.com/cataska/programming-collective-intelligence-code