This repository contains files for the STATS 202 Data Mining project.
QUESTION 1: Treatment effect (35 points)
QUESTION 2: Patient segmentation (55 points)
QUESTION 3: Forecasting (55 points)
QUESTION 4: Binary classification (55 points)
datasets.py contains all data transformation, modeling, feature selection, and plotting functions (over 1,000 lines of code).
Finished 1st in the Kaggle competition for Forecasting and 4th for Classification.