Author: Matías Gabriel Flores
This repository contains the code developed for the undergraduate thesis in Computer Engineering at the Universidad Nacional de Avellaneda.
The goal of this project was the implementation of various recommender systems using the Surprise library by Nicolas Hug.
- Folder Algorithms: Contains the implementations of the recommender systems.
  - Implemented Models:
    - Memory-Based:
      - Item-Based
      - User-Based
    - Model-Based:
      - SVD_Funk
      - PMF
      - NMF
    - Content-Based:
      - ContentKNN
    - Random Predictor:
      - Normal Predictor
  - Executable Files:
    - KNN_User.py
    - KNN_Item.py
    - Random.py
    - SVD_Funk.py
    - PMF.py
    - NMF.py
    - Content_Based.py
- Folder Results: Contains images of the evaluation results for each algorithm.
- File Exploratory Data Analysis: Jupyter Notebook detailing the exploratory analysis of the datasets and visualizations.
- Default Evaluation:
  - Accuracy Metrics: RMSE, MSE, MAE, and FCP.
  - Generation of the top 10 recommendations for user 500.
- Configurable Options in the evaluate Function:
  - ranking: Enables ranking-based evaluation metrics such as Precision, Recall, and F1-Score. (Default: True)
  - features: Measures additional metrics such as Coverage, Diversity, and Novelty. (Default: True)
- Recommendation: Run evaluations separately if the system has less than 16 GB of RAM.
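
To make the metric names above concrete, here is a minimal pure-Python sketch of RMSE, MAE, and precision/recall at k. It is an illustration under assumptions, not the code used in this repository: the function names, the relevance threshold of 3.5, and the sample predictions are all hypothetical, and the evaluate function may compute these values differently.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    return math.sqrt(sum((t - e) ** 2 for t, e in pairs) / len(pairs))


def mae(pairs):
    """Mean absolute error over (true_rating, estimated_rating) pairs."""
    return sum(abs(t - e) for t, e in pairs) / len(pairs)


def precision_recall_at_k(predictions, k=10, threshold=3.5):
    """Per-user precision and recall at k.

    predictions: iterable of (user, true_rating, estimated_rating).
    An item counts as relevant if its true rating >= threshold and as
    recommended if it is in the user's top k and its estimate >= threshold.
    """
    by_user = defaultdict(list)
    for user, true_r, est in predictions:
        by_user[user].append((est, true_r))

    precisions, recalls = {}, {}
    for user, ratings in by_user.items():
        # Rank this user's items by estimated rating, best first.
        ratings.sort(key=lambda pair: pair[0], reverse=True)
        top_k = ratings[:k]
        n_relevant = sum(true_r >= threshold for _, true_r in ratings)
        n_recommended = sum(est >= threshold for est, _ in top_k)
        n_hits = sum(
            est >= threshold and true_r >= threshold for est, true_r in top_k
        )
        precisions[user] = n_hits / n_recommended if n_recommended else 0.0
        recalls[user] = n_hits / n_relevant if n_relevant else 0.0
    return precisions, recalls


# Hypothetical predictions: (user, true rating, estimated rating).
sample = [
    ("u1", 4.0, 4.5),
    ("u1", 2.0, 4.0),
    ("u1", 5.0, 3.0),
    ("u2", 3.0, 2.0),
    ("u2", 4.0, 4.0),
]
pairs = [(true_r, est) for _, true_r, est in sample]
precisions, recalls = precision_recall_at_k(sample, k=2)
print(rmse(pairs), mae(pairs), precisions, recalls)
```

F1-Score can then be derived per user as the harmonic mean of that user's precision and recall.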
- Dataset: ml-latest-small from GroupLens.
  - Location: Downloadable from the GroupLens website under the datasets section.
  - Instructions: Place the downloaded folder (ml-latest-small) in the repository root directory.
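
As a quick check that the dataset is where the scripts expect it, the following sketch reads ratings.csv from the ml-latest-small folder using only the standard library. The helper name load_ratings and its repo_root parameter are hypothetical; the scripts in this repository may load the data differently (for example through Surprise's own dataset readers).

```python
import csv
from pathlib import Path


def load_ratings(repo_root="."):
    """Read ml-latest-small/ratings.csv into (user, item, rating) tuples.

    Assumes the standard MovieLens header: userId,movieId,rating,timestamp.
    """
    ratings_path = Path(repo_root) / "ml-latest-small" / "ratings.csv"
    if not ratings_path.is_file():
        raise FileNotFoundError(
            f"Expected {ratings_path}; place the ml-latest-small folder "
            "in the repository root."
        )
    with ratings_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # IDs stay as strings; only the rating is converted to a number.
        return [
            (row["userId"], row["movieId"], float(row["rating"]))
            for row in reader
        ]
```

Calling load_ratings() from the repository root raises a FileNotFoundError with a hint if the folder has not been placed there yet.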
- GridSearch grids for hyperparameter tuning are included but commented out in the scripts.
- By default, only the best hyperparameter configurations found during testing are used. (Note: Hyperparameter search is time- and resource-intensive.)
- To enable hyperparameter search: Uncomment the lines labeled "search for best hyperparameters for 'algorithm'", then comment out the subsequent lines where the dictionary is repeated.
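
To see why the search is expensive, the sketch below expands a parameter grid into every combination a grid search would have to train. The parameter names mirror arguments of Surprise's SVD algorithm, but the values, the fold count, and the expand_grid helper are hypothetical and are not the grids shipped in the scripts.

```python
from itertools import product


def expand_grid(param_grid):
    """Return one dict per combination of the values in param_grid."""
    keys = list(param_grid)
    value_lists = (param_grid[key] for key in keys)
    return [dict(zip(keys, values)) for values in product(*value_lists)]


# Hypothetical grid, shaped like the dictionaries a grid search consumes.
param_grid = {
    "n_factors": [50, 100],
    "n_epochs": [20, 40],
    "lr_all": [0.002, 0.005],
    "reg_all": [0.02, 0.1],
}

combinations = expand_grid(param_grid)
n_folds = 5  # assumed number of cross-validation folds
total_fits = len(combinations) * n_folds
print(f"{len(combinations)} combinations, {total_fits} model fits")
```

Every value added to a grid multiplies the number of models to train, which is why the scripts default to the best configuration already found.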
- The Exploratory Data Analysis folder includes a Jupyter Notebook showcasing the dataset analysis and the generated visualizations.
- The Results folder contains images of the evaluation outcomes for each algorithm.