This project implements and evaluates neural recommendation systems for news, with a focus on beyond-accuracy metrics such as diversity and novelty. Using the MIND dataset, it compares traditional and attention-based models, incorporates pre-trained representations (GloVe word embeddings and fine-tuned BERT), and applies custom regularisation layers to encourage diverse and novel recommendations.
- Custom-built bi-encoder architectures
- Integration of GloVe and fine-tuned BERT embeddings
- Training pipeline with hard-negative sampling
- Regularisation for Intra-List Diversity and Novelty
- Evaluation on both accuracy and beyond-accuracy metrics
- Statistical validation with t-tests and visualisations
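The hard-negative sampling referenced above draws negatives from the non-clicked candidates shown in the same impression. As a rough, framework-neutral sketch (the repository's actual implementation lives in `recs/data_loader.py` and may differ; `sample_hard_negatives` is a hypothetical helper):

```python
import random

def sample_hard_negatives(impression, n_neg=4):
    """Pair each clicked item with negatives from the same impression.

    `impression` is a list of (news_id, clicked) pairs, as parsed from
    the MIND behaviours log. Illustrative only.
    """
    positives = [nid for nid, clicked in impression if clicked]
    negatives = [nid for nid, clicked in impression if not clicked]
    return [
        (pos, random.sample(negatives, min(n_neg, len(negatives))))
        for pos in positives
    ]
```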
| Type | Metrics |
|---|---|
| Accuracy | AUC, MRR, nDCG@5, nDCG@10 |
| Beyond-Accuracy | ILD, Surprisal, EPC, Coverage |
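For intuition, Intra-List Diversity (ILD) is the mean pairwise distance between the items in a recommendation list. A minimal sketch, assuming cosine distance over item embeddings (the repository's exact definitions live in `recs/metrics.py` and may normalise differently):

```python
import numpy as np

def intra_list_diversity(item_embeddings: np.ndarray) -> float:
    """Mean pairwise cosine distance over one recommendation list."""
    n = len(item_embeddings)
    if n < 2:
        return 0.0
    normed = item_embeddings / np.linalg.norm(item_embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T              # cosine similarity matrix
    iu = np.triu_indices(n, k=1)          # unique pairs only
    return float(np.mean(1.0 - sims[iu]))
```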
```bash
git clone https://github.com/johannesvc/OptimisingBeyondAccuracy.git
cd OptimisingBeyondAccuracy
pip install -r requirements.txt
```
Download the MIND-small and/or MIND-large datasets from [msnews.github.io](https://msnews.github.io) and place them in the `.data/` directory.
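The MIND behaviours log is tab-separated with a documented column layout. A minimal loading sketch (the path below assumes you unpacked MIND-small into `.data/MINDsmall_train/`; adjust to your layout):

```python
import pandas as pd

# Column names follow the MIND format described at msnews.github.io.
behaviors = pd.read_csv(
    ".data/MINDsmall_train/behaviors.tsv",
    sep="\t",
    names=["impression_id", "user_id", "time", "history", "impressions"],
)
# Each impression is a space-separated list of "newsID-label" pairs,
# where label 1 marks a click.
behaviors["impressions"] = behaviors["impressions"].str.split()
```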
To ensure reproducibility, use the `embeddings.ipynb` notebook to:
- Build the GloVe lookup matrix
- Fine-tune the BERT embeddings (`bert-base-uncased`) on titles
- Save the results in `.npy` format in the `.data/` directory
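As a hypothetical sketch of the GloVe lookup-matrix step (the actual code is in `embeddings.ipynb`; `build_glove_matrix`, the GloVe file path, and the `vocab` dict mapping token to row index are assumptions):

```python
import numpy as np

def build_glove_matrix(vocab, glove_path=".data/glove.6B.300d.txt", dim=300):
    """Map each vocabulary token to its GloVe vector; random init for OOV."""
    matrix = np.random.normal(scale=0.1, size=(len(vocab), dim)).astype("float32")
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            token, *vec = line.rstrip().split(" ")
            if token in vocab:
                matrix[vocab[token]] = np.asarray(vec, dtype="float32")
    return matrix

# np.save(".data/glove_matrix.npy", build_glove_matrix(vocab))
```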
All models are run from notebooks: open the relevant notebook, follow the steps, and run the cells in order.
After training, results (metrics and hyperparameters) are saved to `results.json`. Evaluation scripts are provided in the `evaluation.ipynb` notebook. These include:
- Accuracy vs diversity trade-offs
- Metric comparisons across model types
- All plots used in the main report
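A minimal sketch of reading the saved results back for the trade-off plot. The keys (`"model"`, `"auc"`, `"ild"`) and the list-of-runs structure of `results.json` are assumptions here, not the file's confirmed schema:

```python
import json
import matplotlib.pyplot as plt

with open("results.json") as f:
    results = json.load(f)

# One point per trained model: beyond-accuracy (x) vs accuracy (y).
for run in results:
    plt.scatter(run["ild"], run["auc"], label=run["model"])
plt.xlabel("Intra-List Diversity (ILD)")
plt.ylabel("AUC")
plt.legend()
plt.show()
```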
```
.
├── baselines.ipynb      # Main baselines script
├── bi-bert.ipynb        # Documented training script using BERT embeddings
├── bi-glove.ipynb       # Documented training script using GloVe embeddings
├── bi-encoder.ipynb     # Main training script for hyperparameter tuning
├── CB.ipynb             # Content-based filtering baseline script
├── EDA.ipynb            # Exploratory Data Analysis and plotting
├── embeddings.ipynb     # Fine-tuning BERT + saving embeddings
├── evaluation.ipynb     # Evaluation scripts and plotting
├── README.md            # This file
├── model.png            # Diagram of the model
├── recs
│   ├── bi_encoder.py    # Main training script
│   ├── data_loader.py   # Preprocessing and sequence generation
│   ├── __init__.py
│   ├── metrics.py       # Evaluation and plotting
│   ├── subclasses.py    # Model architectures and custom layers
│   └── utils.py         # Logging utilities
├── requirements.txt
└── results.json         # Output results for all models
```
MIT License.
If you reference or build upon this work, please cite it using the following:
```bibtex
@software{Van_Cauwenberghe_Optimising_Beyond_Accuracy_2025,
  author  = {Van Cauwenberghe, Johannes},
  doi     = {10.5281/zenodo.15322092},
  license = {MIT},
  month   = apr,
  title   = {{Optimising Beyond Accuracy: Tuning for Diversity and Novelty in Attention-based News Recommenders}},
  url     = {https://github.com/JohannesVC/OptimisingBeyondAccuracy},
  version = {1.0},
  year    = {2025}
}
```
- MIND dataset by Microsoft Research
- GloVe embeddings by Stanford NLP
- BERT models via HuggingFace
- Inspired by Microsoft Recommenders and Transformers4Rec (Nvidia)