The dataset contains several variables. In this project, I developed an NLP application with machine learning. The challenge is to build a model that predicts the sentiment of tweets.
The tweets contain characters and words that I do not want to feed into the model. For instance, I remove punctuation and digits such as ?, !, 3, and 5, along with stop words. I also want to reduce each word to its root, because words vary through prefixes and suffixes. Therefore, I used stemming.
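The cleaning step can be sketched as follows. This is a minimal illustration, not the project's actual code: the stop-word list is a tiny hand-written sample, and `naive_stem` is a toy suffix stripper standing in for a real stemmer (for example, NLTK's `PorterStemmer`). The names `STOP_WORDS`, `naive_stem`, and `clean_tweet` are hypothetical.

```python
import re

# Small illustrative stop-word list (an assumption; a real project would
# likely use a fuller list from an NLP library).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "i", "to", "and", "of", "in", "it", "this"}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def naive_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_tweet(text):
    """Lowercase, drop punctuation and digits, remove stop words, then stem."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]


print(clean_tweet("I loved the 3 movies!!! Watching them was amazing?"))
# ['lov', 'movi', 'watch', 'them', 'amaz']
```

The stems are not always dictionary words ("movi", "amaz"). That is normal for stemming, since the goal is only to map related word forms to the same token.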
A word cloud is a quick way to visualize the vocabulary of an NLP dataset. Hence, I extracted the 20 most frequently used words. In the deployed app, I generate this for each label.
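Counting the most frequent words per label might look like the sketch below. It assumes the tweets are already cleaned and tokenized; the function name `top_words_per_label` and the sample data are made up for illustration.

```python
from collections import Counter, defaultdict


def top_words_per_label(token_lists, labels, n=20):
    """Return {label: [(word, count), ...]} with the n most frequent words."""
    counts = defaultdict(Counter)
    for tokens, label in zip(token_lists, labels):
        counts[label].update(tokens)
    return {label: counter.most_common(n) for label, counter in counts.items()}


# Tiny made-up sample; tokens are assumed to be cleaned already.
tokens = [["love", "movi"], ["love", "day"], ["hate", "rain"], ["hate", "hate", "mud"]]
labels = ["positive", "positive", "negative", "negative"]

top = top_words_per_label(tokens, labels, n=2)
print(top["positive"][0])  # ('love', 2)
print(top["negative"][0])  # ('hate', 3)
```

If the `wordcloud` package is used for drawing (an assumption about the project), `dict(top[label])` can be passed to `WordCloud().generate_from_frequencies(...)` to render one cloud per label.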
There are many models that could be used. Therefore, I wrote a function that fits several models to my data. I then kept the one with the highest accuracy.
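A function of this kind could be sketched with scikit-learn as shown here. The choice of library, the TF-IDF features, the three candidate models, and the name `compare_models` are all assumptions for illustration; the toy sentences are made up and far too few for a meaningful score.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_models(texts, labels, candidates, test_size=0.25, seed=0):
    """Fit each candidate on the same split and return the most accurate one."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    scores, fitted = {}, {}
    for name, estimator in candidates.items():
        # Vectorizer and classifier are bundled so they are fitted together.
        pipeline = make_pipeline(TfidfVectorizer(), estimator)
        pipeline.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, pipeline.predict(X_test))
        fitted[name] = pipeline
    best_name = max(scores, key=scores.get)
    return best_name, fitted[best_name], scores


# Made-up toy data, only to show the call.
texts = [
    "love this", "great day", "so happy", "really good", "nice work", "best ever",
    "hate this", "awful day", "so sad", "really bad", "poor work", "worst ever",
]
labels = ["positive"] * 6 + ["negative"] * 6

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": MultinomialNB(),
    "linear_svc": LinearSVC(),
}

best_name, best_model, scores = compare_models(texts, labels, candidates)
print(best_name, scores)
```

Accuracy on a single held-out split is the simplest criterion. With imbalanced labels, a metric such as F1 or cross-validated accuracy may be a safer basis for the choice.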
I saved the selected model to disk with joblib.
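Saving and reloading with joblib could look like the following sketch. The small scikit-learn pipeline, its training sentences, and the file name `sentiment_model.joblib` are placeholders, not the project's actual model or path.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Placeholder model trained on made-up sentences.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(
    ["love it", "great day", "hate it", "awful day"],
    ["positive", "positive", "negative", "negative"],
)

# Hypothetical file name, written to a temporary directory for the example.
model_path = os.path.join(tempfile.mkdtemp(), "sentiment_model.joblib")
joblib.dump(model, model_path)

# Later, for example inside the deployed app, the model is loaded back.
restored = joblib.load(model_path)
sample = ["great day", "hate it"]
print(list(restored.predict(sample)) == list(model.predict(sample)))  # True
```

Dumping the whole pipeline, rather than the classifier alone, keeps the fitted vectorizer with the model, so the app can call `predict` on raw text.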
I designed simple, easy-to-understand pages for users. There are three pages, which show the three most important parts of the project.