donnajharris/cp640_project

CP640 Machine Learning project (Spring 2022)

An independently researched and implemented machine learning project in the Master of Computer Science program at Wilfrid Laurier University.

Overview

Using historical Major League Baseball data from a Kaggle dataset (as required by the project), I ran a series of experiments that attempt to predict future batting success.

Kaggle dataset source: https://www.kaggle.com/datasets/darinhawley/mlb-batting-stats-by-game-19012021 (external link)

The project, as submitted, was too large to be included on GitHub. As a result, the project .zip file (which includes the data files described in the Jupyter notebooks on GitHub) is available for download here (external link).

Project Deliverables

The complete set of project deliverables includes:

======================

After Downloading the Project Zip

Pre-conditions:

  • A local Jupyter Notebook server is installed and running

  • harr2890_project.zip is extracted locally, with its folder structure intact, to a location where the notebooks can be run

  • You can create and write to a data subfolder within the harr2890_project folder
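
The file-system pre-conditions above can be checked before opening any notebook. The following is a minimal sketch, not part of the submitted project: the function name `check_preconditions`, the `.ipynb` extension, and the assumption that the notebooks sit at the top level of the extracted folder are all illustrative. It does not verify that a Jupyter server is running.

```python
from pathlib import Path


def check_preconditions(project_dir):
    """Return a list of problems found; an empty list means the folder looks ready.

    Hypothetical helper: assumes the notebooks are .ipynb files located
    directly inside the extracted harr2890_project folder.
    """
    root = Path(project_dir)
    problems = []

    # The zip must be extracted so that the project folder exists.
    if not root.is_dir():
        problems.append(f"project folder not found: {root}")
        return problems

    # Assumed layout: step notebooks at the top level of the project folder.
    if not any(root.glob("harr2890_project_step*.ipynb")):
        problems.append("no harr2890_project_step*.ipynb notebooks found")

    # The notebooks need to create and write to a data subfolder.
    data_dir = root / "data"
    try:
        data_dir.mkdir(exist_ok=True)
        probe = data_dir / ".write_test"
        probe.write_text("ok")
        probe.unlink()
    except OSError as exc:
        problems.append(f"cannot write to {data_dir}: {exc}")

    return problems
```

Calling `check_preconditions("harr2890_project")` from the folder that contains the extracted project returns a list of messages, which is empty when nothing is wrong. As a side effect, the check creates the data subfolder if it does not exist yet.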

Running harr2890_project Notebooks:

Please run the notebooks in sequence, from Step 1 through Step 5.

Step 1 (harr2890_project_step1_data_prep)

  • General data preprocessing; must be run before all other steps

Step 2 (harr2890_project_step2_hof_data_prep)

  • Hall of Fame Approach preprocessing; must be run before Step 3

Step 3 (harr2890_project_step3_hof_modelling)

  • Hall of Fame Approach modelling (selection and evaluation)

Step 4 (harr2890_project_step4_ops_data_prep)

  • OPS Approach preprocessing; must be run before Step 5

Step 5 (harr2890_project_step5_ops_modelling)

  • OPS Approach modelling (selection and evaluation)
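
For anyone who prefers to run the five steps without opening each notebook, the order above can be scripted. This is a sketch under assumptions rather than part of the submitted project: it assumes the notebooks are `.ipynb` files at the top level of the extracted folder and that `jupyter nbconvert` is available on the PATH. The names `NOTEBOOK_ORDER`, `build_commands`, and `run_all` are hypothetical.

```python
import subprocess

# Notebook names as listed in the steps above, in the required order.
NOTEBOOK_ORDER = [
    "harr2890_project_step1_data_prep",
    "harr2890_project_step2_hof_data_prep",
    "harr2890_project_step3_hof_modelling",
    "harr2890_project_step4_ops_data_prep",
    "harr2890_project_step5_ops_modelling",
]


def build_commands(suffix=".ipynb"):
    """Build one nbconvert command per notebook, in step order.

    The .ipynb suffix is an assumption; the source lists names without one.
    """
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
         name + suffix]
        for name in NOTEBOOK_ORDER
    ]


def run_all(project_dir):
    """Execute every notebook in order, stopping at the first failure."""
    for cmd in build_commands():
        # Run from the project folder so relative data paths resolve there.
        subprocess.run(cmd, check=True, cwd=project_dir)
```

Calling `run_all("harr2890_project")` executes each notebook in place. Because `check=True` raises on the first failing step, a later step never runs against missing preprocessed data.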
