pruthvikp/YELP_REVIEWS_CLASSIFICATION_USING_NLP

📚 Yelp Reviews Classification with Multinomial Naive Bayes and NLP 🤖

This repository houses a Machine Learning project that classifies Yelp reviews using Multinomial Naive Bayes and Natural Language Processing (NLP) techniques. Built with the scikit-learn and NLTK libraries, it covers sentiment analysis and text classification.

ML project abstract: Yelp, a prominent platform for business reviews, provides invaluable insights into customer experiences across various industries, with restaurants being one of the most reviewed categories. This project aims to harness the power of Natural Language Processing (NLP) and machine learning to analyse Yelp restaurant reviews comprehensively. By leveraging NLP techniques, we can extract meaningful information from textual data, such as sentiment, topics, and key insights, to understand customer preferences and sentiments towards restaurants. The goal is to develop machine learning models capable of predicting star ratings based on review text, providing valuable feedback for both restaurants and Yelp analysts.

Web Interface: We built a web interface with Flask and HTML for the model that predicts star ratings from Yelp restaurant reviews. Users enter review text in a form; the text passes through the NLP preprocessing and the trained model, which returns the star rating the review would likely receive. This gives users immediate feedback on the potential rating of a review, gives restaurant owners a way to gauge customer sentiment, and gives Yelp analysts a tool for checking how well the platform captures genuine customer experiences.
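As a rough illustration of how such an interface could be wired up, here is a minimal sketch. It is not the repository's actual app: the `/predict` route, the `review` form field, the JSON response, and the toy training data are all assumptions made for the example. A tiny stand-in model is trained inline and round-tripped through `pickle` so the snippet runs on its own; a real app would load a model file saved after training and would likely render an HTML template instead of returning JSON.

```python
import pickle

from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the real Yelp reviews.
reviews = [
    "terrible food and rude staff",
    "awful cold meal",
    "amazing food and great service",
    "great friendly staff",
]
stars = [1, 1, 5, 5]

# Train a tiny stand-in model, then serialize and restore it with pickle.
# In a real app the bytes would come from a file written after training.
trained = make_pipeline(CountVectorizer(), MultinomialNB()).fit(reviews, stars)
model = pickle.loads(pickle.dumps(trained))

app = Flask(__name__)


@app.route("/predict", methods=["POST"])  # hypothetical route name
def predict():
    # "review" is an assumed form field name.
    text = request.form.get("review", "").strip()
    if not text:
        return jsonify(error="review text is required"), 400
    predicted = int(model.predict([text])[0])
    return jsonify(stars=predicted)


# Exercise the endpoint in-process with Flask's built-in test client.
client = app.test_client()
response = client.post("/predict", data={"review": "great food"})
print(response.get_json())
```

To serve the endpoint for real, the app could be started with `flask run` or `app.run()` instead of the test client.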

Features:

🔍 Exploratory Data Analysis (EDA): Dive deep into the dataset, understanding the distribution of classes, and exploring key features that impact classification.

🔧 Preprocessing: Implement robust preprocessing steps including tokenization, stopwords removal, and stemming/lemmatization to prepare text data for modeling.

🧠 Modeling: Build, train, and fine-tune a Multinomial Naive Bayes classifier to predict sentiment labels for Yelp reviews.

📊 Evaluation: Assess the classification model with accuracy, precision, recall, and F1-score.

📈 Results Interpretation: Analyze model outputs, visualize key metrics, and gain insights into the effectiveness of the classification approach.
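The preprocessing, modeling, and evaluation steps listed above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's notebook code: the reviews and star labels are invented toy data, the stopword set is a small hand-written stand-in (the project may use NLTK's stopword corpus, which requires a separate download), tokenization is a plain regular expression, and `preprocess` is a hypothetical helper name.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.naive_bayes import MultinomialNB

# Assumption: a tiny hand-written stopword set instead of NLTK's corpus.
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "it", "to", "of"}
stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, tokenize with a regex, drop stopwords, and stem."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return " ".join(stemmer.stem(t) for t in tokens if t not in STOPWORDS)


# Invented toy data; labels are star ratings.
train_texts = [
    "The food was terrible and the staff were rude",
    "Cold meal, awful service",
    "Amazing food and great service",
    "Friendly staff, great atmosphere",
]
train_labels = [1, 1, 5, 5]
test_texts = ["Great friendly staff", "Rude staff and cold food"]
test_labels = [5, 1]

# Bag-of-words counts are the features Multinomial Naive Bayes expects.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform([preprocess(t) for t in train_texts])
X_test = vectorizer.transform([preprocess(t) for t in test_texts])

classifier = MultinomialNB()  # default Laplace smoothing, alpha=1.0
classifier.fit(X_train, train_labels)
predictions = classifier.predict(X_test)

# Macro-averaged precision, recall, and F1 across the star classes.
accuracy = accuracy_score(test_labels, predictions)
precision, recall, f1, _ = precision_recall_fscore_support(
    test_labels, predictions, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

On real data the split would come from something like `train_test_split`, and the scores on a toy set this small say nothing about how the model performs on actual Yelp reviews.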

Requirements:

  • Python 3.x
  • scikit-learn
  • NLTK
  • Jupyter Notebook (optional, for running the notebooks)
  • Flask
  • pickle (part of the Python standard library; no separate installation needed)

How to Use:

  1. Clone the repository to your local machine.
  2. Install the required dependencies.
  3. Navigate and explore the files for a detailed walkthrough of the project.
  4. Experiment with different parameters, preprocessing techniques, or even try different classification algorithms for further exploration.
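One possible way to approach the parameter experiments in step 4 is a small grid search over the Naive Bayes smoothing strength and the n-gram range of the vectorizer. The sketch below is only a starting point: the data is invented, the step names `vec` and `nb` are arbitrary, and the candidate values are assumptions rather than settings taken from this project.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Invented toy reviews with star labels; replace with the real dataset.
texts = [
    "terrible food and rude staff",
    "awful cold meal",
    "slow service and bland food",
    "amazing food and great service",
    "great friendly staff",
    "delicious meal and lovely atmosphere",
]
labels = [1, 1, 1, 5, 5, 5]

pipeline = Pipeline([
    ("vec", CountVectorizer()),
    ("nb", MultinomialNB()),
])

# Candidate settings to compare (assumed values for illustration).
param_grid = {
    "vec__ngram_range": [(1, 1), (1, 2)],
    "nb__alpha": [0.1, 0.5, 1.0],
}

# cv=2 only because the toy set is tiny; use more folds on real data.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(texts, labels)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```

Swapping `MultinomialNB` for another scikit-learn classifier in the `nb` step (and adjusting the grid keys to that estimator's parameters) is one way to try a different algorithm with the same setup.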

Contributing:

Contributions are welcome! If you have any ideas for improvements or find any issues, feel free to open an issue or submit a pull request.
