GitHub - Soumayan-pal01/Disneyland-Reviews-Analysis

PROJECT TITLE - Disneyland Reviews Analysis

GOAL - This project analyses reviews written by Disneyland visitors from different countries, using NLP to understand the sentiment of each review and to classify it with sentiment analysis metrics such as Sentiment Polarity and VADER Polarity. The processed data is then fed to several classifier models, which are trained to predict the sentiment of the test reviews.

WHAT I HAVE DONE

Loading datasets
Dealing with Null Values
Preprocessing the 'Review_ID' column
Removing duplicate labels
Preprocessing the 'Year_Month' column
Visualization of the 'Year' column

Visualization of the 'Month' column

Preprocessing the 'Reviewer_Location' column
Visualization on the 'Reviewer_Location' column

Preprocessing 'Review_Text' column and extracting the useful words
Preprocessing the 'Branch' column
Visualization of the labels in the 'Branch' column
Performing One Hot Encoding on the 'Branch' column
Visualization and data analysis of the 'Rating' column

Creating WordCloud of Rating = 1

Creating WordCloud of Rating = 2

Creating WordCloud of Rating = 3

Creating WordCloud of Rating = 4

Creating WordCloud of Rating = 5

Visualization of Rating with respect to different Branches of Disneyland

Creating WordCloud of 'California' Branch

Creating WordCloud of 'HongKong' Branch

Creating WordCloud of 'Paris' Branch

Visualization of the Correlation between different Branches and Reviewers Location

Creating WordCloud of Positive Sentiments

Creating WordCloud of Negative Sentiments

Creating WordCloud of Neutral Sentiments

Finding the Sentiment Polarity of the reviews

Performing Lexicon based approach of Sentiment Analysis using the VADER Polarity

Performing Label Encoding on the 'Reviewer_Location', 'Year', 'S_Polarity' and 'V_Polarity' columns
Converting the data of 'Month' column into numeric type
Review Analysis on the basis of Sentiment Polarity
Splitting the data
Using TF-IDF Vectorizer
Using Decision Tree classifier
Using Random Forest classifier
Using XGBoost classifier
Performing all these analysis steps again with VADER Polarity
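
The text preprocessing and polarity labelling steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the notebook's actual code: the `clean_review` and `polarity_label` helpers, the tiny stop-word set, and the 0.05 threshold are all assumptions. The threshold follows the convention commonly suggested for VADER's compound score; the project may use a different cut-off.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word set; a real run would
# presumably use a fuller list, for example the one shipped with nltk.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "to", "of", "it", "we", "in"}


def clean_review(text: str) -> list:
    """Lower-case a review, strip punctuation and digits, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)
    return [word for word in text.split() if word not in STOP_WORDS]


def polarity_label(score: float, threshold: float = 0.05) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold is an assumed value, not taken from the project.
    """
    if score >= threshold:
        return "Positive"
    if score <= -threshold:
        return "Negative"
    return "Neutral"


tokens = clean_review("The park was AMAZING, and we queued 45 minutes!")
print(tokens)
print(polarity_label(0.62), polarity_label(-0.4), polarity_label(0.0))
```

The same labelling function can be applied to either polarity column, which is what makes it possible to repeat the whole analysis once per metric.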

MODELS USED

XGBoost - The Extreme Gradient Boosting algorithm is based on the gradient boosting model, which uses the boosting technique of ensemble learning: weak learners are trained one after another, each fitted to the errors left by the learners before it, so that the combined model becomes stronger and more accurate.
Decision Tree - This algorithm builds a tree structure in which each internal node tests a feature and each leaf assigns a class, and it classifies a sample by following the tests from the root to a leaf.
Random Forest - This algorithm works on the concept of ensemble learning. It uses the bagging technique to train multiple decision trees on different random samples of the training instances (drawn with replacement) and combines their predictions to achieve a higher degree of accuracy.
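
As a rough illustration of how such classifiers can be trained on TF-IDF features and compared, here is a minimal scikit-learn sketch. The toy reviews and labels are made up for the example and do not come from the dataset, and the hyperparameters are arbitrary. XGBoost is left out so the sketch depends on scikit-learn only; assuming the `xgboost` package is installed, its `XGBClassifier` could be added to the `models` dictionary in a similar way.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Made-up toy data for illustration only (not taken from the project).
reviews = [
    "amazing rides and friendly staff",
    "wonderful day, the kids loved it",
    "great parade and lovely fireworks",
    "fantastic food and magical castle",
    "loved every ride, great value",
    "beautiful park, wonderful characters",
    "terrible queues and rude staff",
    "overpriced food and dirty toilets",
    "awful experience, rides were closed",
    "horrible crowds, waste of money",
    "rude staff and terrible service",
    "dirty park, awful food",
]
labels = ["Positive"] * 6 + ["Negative"] * 6

# Hold out a quarter of the reviews, keeping the class balance.
X_train_text, X_test_text, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.25, random_state=42, stratify=labels
)

# Fit the vocabulary on the training split only, then reuse it for the test split.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=42),
}

# Record (train accuracy, test accuracy) for each model.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name}: train={scores[name][0]:.2f}, test={scores[name][1]:.2f}")
```

Reporting train and test accuracy side by side makes it easier to see when a model fits the training reviews far better than unseen ones.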

LIBRARIES NEEDED

numpy
pandas
matplotlib
seaborn
scikit-learn
nltk
wordcloud
PIL
string
re
xgboost

CONCLUSION

After a comparative analysis of the different classifier models (Decision Tree, Random Forest, XGBoost), we can conclude that:

VADER Polarity is a better metric than Sentiment Polarity for analysing the sentiment of the extracted review texts.
XGBoost performs better than the other two models whether it is fed Sentiment Polarity or VADER Polarity. It achieves its best Train Accuracy (100%) and Test Accuracy (93%) when trained with VADER Polarity.
