Fraud Detection Project

This is the final project for a Big Data course in the Master 2 SISE program at the Université Lumière Lyon 2, led by Guillaume METZLER. The aim of the project was to detect and predict fraud from transaction features using machine learning algorithms.

We worked with over 11 million real transactions from the Fichier National des Chèques Irréguliers (FNCI) and the Banque de France.

The original project, written in French, can be found here.

Introduction 📚

Fraud detection is a challenging machine learning problem because the classes (fraud vs. non-fraud) are heavily imbalanced. We aimed to build effective predictive models with algorithms suited to this setting. We first investigated resampling techniques such as SMOTEENN and Tomek links, then ran several machine learning algorithms on the data.
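To make the Tomek link idea concrete, here is a minimal pure-Python sketch on toy data. A Tomek link is a pair of points from opposite classes that are each other's nearest neighbour; undersampling then drops the majority-class member of each pair. The function names and the toy points are illustrative assumptions and are not taken from the project notebooks.

```python
from math import dist


def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: dist(points[i], points[j]),
    )


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    links = []
    for i, j in enumerate(nn):
        # Mutual nearest neighbours with different class labels.
        if i < j and nn[j] == i and labels[i] != labels[j]:
            links.append((i, j))
    return links


def remove_majority_in_links(points, labels, majority=0):
    """Drop the majority-class point of every Tomek link."""
    drop = {
        k
        for pair in tomek_links(points, labels)
        for k in pair
        if labels[k] == majority
    }
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data (hypothetical): label 0 = non-fraud (majority), 1 = fraud.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0), (5.0, 5.2)]
labels = [0, 0, 0, 1, 0, 0]

print(tomek_links(points, labels))  # [(2, 3)]
new_points, new_labels = remove_majority_in_links(points, labels)
print(new_labels)  # [0, 0, 1, 0, 0]
```

This brute-force version compares every pair of points, so it is only suitable for illustration. On a dataset of millions of transactions, a library implementation such as `TomekLinks` or `SMOTEENN` from imbalanced-learn would be the more realistic choice.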

Methods 📊

Resampling techniques: SMOTEENN and Tomek links, used to rebalance the classes and improve the representation of fraud cases.
Data analysis: several machine learning algorithms to detect and predict fraud from the available features, including decision trees, random forests, balanced random forests, XGBoost, logistic regression, basic artificial neural networks, an autoencoder, ensemble models, and k-means.
Model evaluation: the F1-score, which is better suited than accuracy to class imbalance problems.
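The following sketch illustrates why the F1-score is preferred over accuracy when classes are imbalanced. It uses made-up labels (1,000 transactions, 10 of them fraudulent), not the project data, and the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_from_labels(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels: 10 frauds (1) followed by 990 legitimate transactions (0).
y_true = [1] * 10 + [0] * 990

# Model A never flags anything as fraud.
always_legit = [0] * 1000

# Model B flags 20 transactions: 5 real frauds and 15 false alarms.
flagger = [1] * 5 + [0] * 5 + [1] * 15 + [0] * 975

print(f"A: accuracy={accuracy(y_true, always_legit):.3f}, "
      f"F1={f1_from_labels(y_true, always_legit):.3f}")  # A: accuracy=0.990, F1=0.000
print(f"B: accuracy={accuracy(y_true, flagger):.3f}, "
      f"F1={f1_from_labels(y_true, flagger):.3f}")  # B: accuracy=0.980, F1=0.333
```

Model A looks better on accuracy even though it detects no fraud at all, while the F1-score ranks the two models the other way round.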

NB: Only the Tomek links, k-means, logistic regression, and autoencoder algorithms are available in this repository. The other algorithms are available in the original repository.

Results 📍

The best F1-score obtained was about 0.06.
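Precision and recall are not reported separately here, but the definition of the F1-score gives a rough idea of what a value of about 0.06 implies. Writing P for precision and R for recall, and assuming both are non-zero, the F1-score is their harmonic mean, which always lies between the smaller of the two values and twice that value:

```latex
F_1 = \frac{2PR}{P + R},
\qquad
\min(P, R) \;\le\; F_1 \;\le\; 2\,\min(P, R)
\quad\Longrightarrow\quad
F_1 \approx 0.06 \;\Rightarrow\; 0.03 \lesssim \min(P, R) \lesssim 0.06
```

In other words, at least one of precision and recall was at most about 6% for the best model.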

Conclusion 📎

Fraud detection under class imbalance remains a significant challenge in machine learning. This project highlights the need for more advanced methods to improve model performance in such situations.

Authors ✏️

Adrien CASTEX, Célia MAURIN, Annabelle NARSAMA
