This repository contains code and a blog post on how to build a Naive Bayes classifier (NBC) for text data from scratch in Python, without using any external modules. Given a new sentence, the classifier uses Bayes' theorem to predict which category the sentence belongs to.
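
The idea can be sketched with the standard library alone. The block below is a minimal illustration, not the notebook's actual code: the function names (`tokenize`, `train`, `predict`), the whitespace tokenizer, the Laplace (add-one) smoothing, and the toy sentences are all assumptions made for this example. It scores each category with log P(category) plus the sum of log P(word | category) and returns the highest-scoring category.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    # Lowercase and split on whitespace; a deliberately simple tokenizer.
    return sentence.lower().split()


def train(samples):
    """Count labels and per-label words from (sentence, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for sentence, label in samples:
        label_counts[label] += 1
        for word in tokenize(sentence):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(sentence, model):
    """Return the label with the highest log-posterior score."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # Start from the log prior, log P(label).
        score = math.log(count / total)
        # Denominator for add-one (Laplace) smoothing.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(sentence):
            # Unseen words get a count of 0, so smoothing keeps the
            # probability non-zero and the logarithm defined.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the Kaggle dataset.
toy_samples = [
    ("bachelor degree in computer science", "Education"),
    ("master degree in statistics", "Education"),
    ("five years of experience in software development", "Experience"),
    ("three years of experience as a data analyst", "Experience"),
    ("proficient in python and sql", "Skill"),
    ("knowledge of python and java", "Skill"),
]
model = train(toy_samples)
prediction = predict("degree in mathematics", model)
print(prediction)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and add-one smoothing prevents a single unseen word from ruling out a category.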


The dataset can be downloaded from Kaggle by clicking here. You need a Kaggle account to download it. The dataset consists of a corpus of text data covering 6 categories: Responsibility, Requirement, Soft Skill, Experience, Skill, and Education.


The code is in the form of a Jupyter notebook. You can start Jupyter with the following command:

jupyter notebook

This opens Jupyter Notebook in your browser, where you can open the notebook and run its code cells.


You can read the blog post on this project here.

For any queries, feel free to open an issue.