Sourcery refactored main branch #1

Open
wants to merge 28 commits into
base: main
588ba07
initial commit
RasmusRynell Jul 12, 2021
dc9a596
Update README.md
RasmusRynell Jul 12, 2021
b73a476
Update README.md
RasmusRynell Jul 12, 2021
0fec009
Update README.md
RasmusRynell Jul 12, 2021
88f9693
small update
RasmusRynell Jul 12, 2021
326aad6
Merge branch 'main' of github.com:RasmusRynell/Predicting-NHL
RasmusRynell Jul 12, 2021
c89c482
Update README.md
RasmusRynell Jul 13, 2021
ed7963d
Update README.md
RasmusRynell Jul 13, 2021
bc804fc
Update README.md
RasmusRynell Jul 13, 2021
58dcceb
Started work on preprocessing
RasmusRynell Jul 13, 2021
1b0f5c8
Merge branch 'main' of github.com:RasmusRynell/Predicting-NHL
RasmusRynell Jul 13, 2021
6b15350
did some tests with preprocessing
RasmusRynell Jul 14, 2021
e4c51be
Research
RasmusRynell Jul 15, 2021
98f5e4a
Working on pipeline, soon done
RasmusRynell Jul 16, 2021
9b31ee8
Update README.md
RasmusRynell Jul 19, 2021
2985e20
Added feature manipulation trough settings and defaults
RasmusRynell Jul 19, 2021
5fd69c9
Added support for column transformer
RasmusRynell Jul 21, 2021
583411f
Fixed readability
RasmusRynell Jul 21, 2021
a255f30
added support for cross validation
RasmusRynell Jul 21, 2021
2ea51a0
cleanup
RasmusRynell Jul 22, 2021
c49d269
Simple hyper parameter optimization added
RasmusRynell Jul 22, 2021
3eecce2
started work on research pipeline
RasmusRynell Jul 22, 2021
6beb1c3
Good progress
RasmusRynell Jul 22, 2021
7bb02d7
Tests with different classifiers has begun
RasmusRynell Jul 23, 2021
7302cc9
Added ROC AUC score
Jul 23, 2021
bc20968
done testing models
RasmusRynell Jul 24, 2021
3316538
Redid data generation, scrapped simple models and started work on tes…
RasmusRynell Jul 25, 2021
2adeac5
'Refactored by Sourcery'
Jul 25, 2021
3 changes: 3 additions & 0 deletions .gitignore
@@ -127,3 +127,6 @@ dmypy.json

# Pyre type checker
.pyre/

.history/
.history
75 changes: 61 additions & 14 deletions README.md
@@ -1,48 +1,66 @@
# Predicting-NHL
# Predicting NHL stats (shots on goal)

## Overview
This project explores the idea of using different machine learning techniques to determine different stats in NHL games. Research about different techniques to be used has already been done in [this](https://github.com/RasmusRynell/sports_betting_test) project.
This project explores the idea of using different machine learning techniques to determine different stats in NHL games. Research and testing of different techniques has previously mainly been done in [this](https://github.com/RasmusRynell/sports_betting_test) project.

### Current features
- About 5000 predictions for “Shots on goal” in NHL games from different betting sites
- A way to add more predictions by copy-pasting them from sites into text files
### Current features / functionality
- A database containing about 5000 predictions for “Shots on goal” in NHL games from different betting sites
- A way to add more "new" predictions by copy-pasting them from sites into text files
- Request data from NHL’s own database to populate your own internal database (in order to use less internet and have a faster running program)
- Produce a CSV file containing all data related to a prediction. This includes player statistics, the player's team statistics, and the opposing team's statistics for as many games back as one wants (typically no more than 5)

### Futures features
### Features in the future
- Load and select features from CSV file to be used in different ML techniques (Data preprocessing)
- Create a pipeline for both using and testing and evaluating ML techniques
- Add more data (maybe xG for both player and teams)
- Create a better way of scraping the web both for data
- Create a better way of scraping the web both for previous bets and their outcomes
- Create a better way of scraping the web for data
- Create a better way of scraping the web for previous bets and their outcomes
- Create a user interface either as an app or on the web

### Long term new features
- Create a more robust framework for different sports with different types of data
- Live predictions


<br><br/>
## Inner workings *(Under construction)*
### Data collection
In order to apply different ML techniques, data is needed. For now we use NHL's own (free and very detailed) database to gather all our data. We store this data in our own [sqlite](https://www.sqlite.org/index.html) database, to be used and updated whenever needed. The reasoning behind having our own database is simple: it is a lot of data to request each time we want to make a prediction. Since we ultimately want this process to run once each day of the season, the number of times we would request a specific game from NHL's database quickly gets out of control.

#### NHL's database
To use the NHL database, [this](https://gitlab.com/dword4/nhlapi/-/blob/master/stats-api.md) incredible documentation was used. Since we know we want all game data for every game x seasons back, all we had to do was loop through each day in that period and request all games that occurred on that day. We then stored that data in our own database for later use. Updating the database then only requires requesting data for games that have not yet (according to our database) been played; this way we don't have to keep requesting data that never changes.
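
The day-by-day loop and the incremental update described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function `update_games`, the `games` and `fetched_days` tables, the `game_id` key, and the caller-supplied `fetch_games_for_day` callable (which would wrap the real request to NHL's database) are all assumptions made for the example.

```python
import json
import sqlite3
from datetime import date, timedelta
from typing import Callable, Iterable


def update_games(
    conn: sqlite3.Connection,
    fetch_games_for_day: Callable[[date], Iterable[dict]],  # hypothetical fetcher
    start: date,
    end: date,
    today: date,
) -> int:
    """Fetch games day by day, skipping days already stored.

    Returns the number of requests that were actually made.
    """
    # Hypothetical schema: one row per game, plus a record of completed days.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS games "
        "(game_id INTEGER PRIMARY KEY, game_date TEXT, payload TEXT)"
    )
    conn.execute("CREATE TABLE IF NOT EXISTS fetched_days (day TEXT PRIMARY KEY)")

    requests_made = 0
    day = start
    while day <= end:
        already = conn.execute(
            "SELECT 1 FROM fetched_days WHERE day = ?", (day.isoformat(),)
        ).fetchone()
        if already is None:
            requests_made += 1
            for game in fetch_games_for_day(day):
                conn.execute(
                    "INSERT OR REPLACE INTO games VALUES (?, ?, ?)",
                    (game["game_id"], day.isoformat(), json.dumps(game)),
                )
            # Only past days are final; today and future days are requested again later.
            if day < today:
                conn.execute(
                    "INSERT INTO fetched_days VALUES (?)", (day.isoformat(),)
                )
        day += timedelta(days=1)
    conn.commit()
    return requests_made
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular endpoint, and running the function a second time over the same period makes no new requests for days that are already stored.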

<br><br/>
### Preprocessing

#### Feature selection

### Machine learning
#### Feature extraction

#### Dimensionality reduction

### Evaluating the results
#### Missing data removal / prediction

#### Transformation

#### Discretization


<br><br/>
### Machine learning

<br><br/>
### Evaluating the results

<br><br/>
## Installation
*Side comment:
Make sure you have at least Python 3.9 installed. If for some reason "python3" does not work, try using "python" instead.*
### Installing the source code
<pre>
git clone git@github.com:RasmusRynell/Predicting-NHL.git
</pre>

### Create environment
*Side comment:
If "python3" does not work, try "python" and if it still does not work, download and install python.*

Navigate into the project and create an environment
<pre>
@@ -57,8 +75,37 @@ On Unix/MacOS:

### Install packages
<pre>python3 -m pip install -r requirements.txt</pre>

<br><br/>
## How to use *(Under construction)*
*Side comment: The application is currently accessed through a terminal; in later builds this can be replaced by a more traditional and easy-to-use UI.*

### Starting the application
Before starting the application, make sure you have followed the instructions under "Installation". When that is done, simply navigate to the "app" folder and run the following:
<pre>python3 main.py</pre>
Once the application has started, enter a command to perform an action.

### Commands

#### General
* "help" (h) *Prints all currently available commands*

* "exit" (e) *Exits the application*

#### Dev
* "eval" (ev) *under construction*

* "und" *Refreshes/Updates the local NHL database*

* "and" *Add nicknames to the database*

* "ubd" *Add "old" bets (from bookies, but stored in a local file) to the database*

* "gen" *Generate a CSV file containing all information for a player going back to 2017/09/15*

* "pre" *Preprocess a csv file according to a configuration file*

<br><br/>
## Contributors
- @RasmusRynell
- [RasmusRynell](https://github.com/RasmusRynell)
- [Awarty](https://github.com/Awarty)
8 changes: 8 additions & 0 deletions app/.gitattributes
@@ -0,0 +1,8 @@
*.unibet filter=lfs diff=lfs merge=lfs -text
*.betway filter=lfs diff=lfs merge=lfs -text
*.betsson filter=lfs diff=lfs merge=lfs -text
*.bethard filter=lfs diff=lfs merge=lfs -text
*.ss filter=lfs diff=lfs merge=lfs -text
*.bet365 filter=lfs diff=lfs merge=lfs -text
*.db filter=lfs diff=lfs merge=lfs -text
*.csv filter=lfs diff=lfs merge=lfs -text
3 changes: 3 additions & 0 deletions app/external/databases/bets.db
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/databases/testing.db
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/nicknames/player_nicknames.csv
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/nicknames/team_nicknames.csv
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/player_data/8471214.csv
Git LFS file not shown
1 change: 1 addition & 0 deletions app/external/predictions/test.json
@@ -0,0 +1 @@
{"8471214": {"2020020458": {"game_date": "2020-03-16", "odds": [{"bet365": {"3.5": {"Over": "1.71", "Under": "2.0"}}}], "predictions": {}}}}
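
The nesting in `test.json` can be easier to follow once flattened into rows. The sketch below is only an illustration: `flatten_odds` is a hypothetical helper, and it assumes the outer keys are a player ID and a game ID and that each `odds` entry maps a betting site to a line with "Over" and "Under" prices, as the single record above appears to show.

```python
import json


def flatten_odds(raw: str) -> list[tuple]:
    """Turn the nested predictions JSON into one row per (player, game, site, line)."""
    rows = []
    data = json.loads(raw)
    for player_id, games in data.items():          # assumed: outer key is a player ID
        for game_id, info in games.items():        # assumed: inner key is a game ID
            for entry in info["odds"]:
                for site, lines in entry.items():
                    for line, prices in lines.items():
                        rows.append(
                            (
                                player_id,
                                game_id,
                                info["game_date"],
                                site,
                                float(line),
                                float(prices["Over"]),
                                float(prices["Under"]),
                            )
                        )
    return rows
```

Applied to the record above, this would yield a single row for bet365 with a line of 3.5 and the Over and Under prices converted from strings to floats.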
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-16.bet365
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-17.bet365
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-18.bet365
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-19.bet365
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-20.bet365
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-20.betsson
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-20.betway
Git LFS file not shown
3 changes: 3 additions & 0 deletions app/external/saved_bets/2021-03-20.unibet
Git LFS file not shown