Spam filter for SMS messages using the multinomial naive bayes algorithm

jessedeans/Naive_Bayes_Spam_Filter


Naive Bayes Based Spam Filter

This repository contains a notebook and dataset used to build an SMS spam filter using the multinomial Naive Bayes algorithm.

In machine learning, Naive Bayes classifiers are a family of simple probabilistic classifiers that apply Bayes' theorem with strong (naive) independence assumptions between the features. In probability theory and statistics, Bayes' theorem (also called Bayes' law or Bayes' rule) describes the probability of an event based on prior knowledge of conditions that might be related to the event. For a spam filter, this means estimating how likely a message is to be spam given the words it contains, treating each word as independent of the others once the class is known.
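To make the idea concrete, here is a minimal from-scratch sketch of a multinomial Naive Bayes classifier with additive (Laplace) smoothing. It is not the notebook's code: the function names `tokenize`, `train`, and `classify`, the smoothing parameter `alpha`, the tokenization rule, and the toy messages are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the message and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(messages, alpha=1.0):
    """Count words per class from (label, text) pairs.

    alpha is the additive smoothing constant (an assumed default of 1).
    """
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of messages with that label
    vocabulary = set()
    for label, text in messages:
        tokens = tokenize(text)
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocabulary.update(tokens)
    return {
        "word_counts": word_counts,
        "doc_counts": doc_counts,
        "vocabulary": vocabulary,
        "alpha": alpha,
    }


def classify(model, text):
    """Return the label with the highest log posterior score."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocabulary"])
    alpha = model["alpha"]
    scores = {}
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        n_words = sum(counts.values())
        # Start from the log prior P(label).
        score = math.log(n_docs / total_docs)
        for token in tokenize(text):
            if token not in model["vocabulary"]:
                continue  # ignore words never seen in training
            # Add the smoothed log likelihood P(token | label).
            score += math.log((counts[token] + alpha) / (n_words + alpha * vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up toy messages, not taken from the real dataset.
toy_messages = [
    ("spam", "win a free prize now"),
    ("spam", "free cash prize"),
    ("ham", "are we meeting for lunch"),
    ("ham", "see you at lunch tomorrow"),
]
model = train(toy_messages)
print(classify(model, "free prize"))
print(classify(model, "lunch tomorrow"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a word that never appeared in one class from forcing that class's probability to zero.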

To train the algorithm, I'll use a dataset of 5,572 SMS messages that have already been classified by humans. The dataset was put together by Tiago A. Almeida and José María Gómez Hidalgo, and it can be downloaded from the UCI Machine Learning Repository. The data collection process is described in more detail on this page, where you can also find some of the authors' papers.
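As a rough sketch of how such a dataset might be prepared, the snippet below parses labeled lines and splits them into training and test sets. It assumes each line holds a label and a message separated by a tab, and that the downloaded file is named `SMSSpamCollection`; both are assumptions to check against the actual download. The helper names `parse_sms_lines` and `split_rows`, the 80/20 split, and the sample lines are hypothetical.

```python
import random


def parse_sms_lines(lines):
    """Turn 'label<TAB>message' lines into (label, message) pairs."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        label, _, message = line.partition("\t")
        rows.append((label, message))
    return rows


def split_rows(rows, train_fraction=0.8, seed=1):
    """Shuffle a copy of the rows and split it into train and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up sample lines standing in for the real file contents.
sample_lines = [
    "ham\tsee you at lunch\n",
    "spam\tfree prize, reply now\n",
    "ham\trunning late, sorry\n",
    "ham\tcall me when you can\n",
    "spam\tyou have won cash\n",
]
rows = parse_sms_lines(sample_lines)
train_rows, test_rows = split_rows(rows)
print(len(train_rows), len(test_rows))

# With the real data (file name assumed), the same parser could be used as:
# with open("SMSSpamCollection", encoding="utf-8") as f:
#     rows = parse_sms_lines(f)
```

Fixing the shuffle seed keeps the split reproducible, so accuracy measured on the held-out messages can be compared between runs.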

The notebook is based on a guided project from Dataquest, an online data science bootcamp. The learning goal of the project was to test understanding of probability, conditional probability, and Bayes' theorem.
