- Clone this repo
git clone https://github.com/MichiganDataScienceTeam/googleanalytics.git
- Download the data from Google Drive and place it in ./data
- Unzip the data
cd data
unzip train.csv.zip
unzip test.csv.zip
unzip sample_submission.csv.zip
cd ..
- Check to make sure the dataset is in the correct place
python dataset.py --debug
- Run the exploration code. Note: removing the --debug flag will cause the full dataset to be loaded, which may take a long time on your machine.
python explore.py --debug
- Create an account on GitHub and add an SSH key to your account
- Ask @stroud on Slack to add you to the MDST organization
- Assign yourself to an issue
- Create a branch and write your code
- Submit a pull request when you are done!
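
If the unzip command is not available on your machine (for example, on a stock Windows install), the unzip step above can be approximated with Python's standard zipfile module. This is a minimal sketch, not part of the repository: the helper name extract_archives is hypothetical, and it assumes the three archives listed above have been placed in ./data.

```python
import zipfile
from pathlib import Path

# Archive names taken from the unzip step above.
ARCHIVES = ["train.csv.zip", "test.csv.zip", "sample_submission.csv.zip"]


def extract_archives(data_dir="data", archives=ARCHIVES):
    """Extract each archive found in data_dir and return the extracted member names.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    data_path = Path(data_dir)
    extracted = []
    for name in archives:
        archive = data_path / name
        if not archive.is_file():
            # Report missing archives instead of failing, so one absent file
            # does not block the others.
            print(f"missing: {archive}")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_path)
            extracted.extend(zf.namelist())
    return extracted


if __name__ == "__main__":
    # Run from the repository root so that ./data resolves correctly.
    print(extract_archives())
```

After it finishes, python dataset.py --debug should confirm whether the files ended up in the expected place.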