This repo applies several regression models to learn a model for sentiment analysis of company headlines. There are two .py files:
1- main.py: runs the whole pipeline: loading the data, preprocessing the training and test sets, fitting the regression model on the training set, and predicting the sentiment scores for the test set.
2- Word2VecUtility.py: contains functions that preprocess the headlines and output a list of words.
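As a rough illustration of the preprocessing step described above, the sketch below turns a headline into a list of lowercase words with stop words removed. It is not the code from Word2VecUtility.py: the function name `headline_to_words`, the regex tokenizer and the tiny `STOPWORDS` set are assumptions chosen to keep the example self-contained, whereas the requirements of this repo point to NLTK's punkt model and stopwords corpus.

```python
import re

# Hypothetical stand-in for NLTK's English stopwords corpus (assumption).
STOPWORDS = frozenset(
    {"a", "an", "the", "of", "in", "on", "to", "for", "and", "is", "as"}
)


def headline_to_words(headline, stopwords=STOPWORDS):
    """Return the lowercase alphabetic words of a headline, minus stop words."""
    # Keep only runs of letters; digits and punctuation act as separators.
    tokens = re.findall(r"[a-z]+", headline.lower())
    return [token for token in tokens if token not in stopwords]


# Made-up headline, for illustration only.
print(headline_to_words("Shares of Acme Corp rise 5% on strong earnings"))
```

Passing the stop word set as a parameter makes it easy to swap in a fuller list, such as the one shipped with NLTK, without changing the function.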
- Run main.py for a quick test.
Requirements:
- Python 3
- NLTK
- the punkt model from NLTK
- the stopwords corpus from NLTK
You can download both by calling nltk.download() in your Python 3 console.
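If you prefer to skip the interactive downloader, the two resources can probably be fetched by name instead. This is a minimal sketch: the helper `download_nltk_resources` is a hypothetical name, and the resource identifiers "punkt" and "stopwords" are taken from the requirements above. If your NLTK version reports a missing resource under a different name, pass that name instead.

```python
def download_nltk_resources(resources=("punkt", "stopwords"), quiet=True):
    """Download the given NLTK resources and return {name: success flag}."""
    # Imported here so this helper can be defined before NLTK is installed.
    import nltk

    # nltk.download returns True when the download succeeded or the
    # resource was already up to date, and False otherwise.
    return {name: nltk.download(name, quiet=quiet) for name in resources}


# Usage (needs NLTK installed and network access):
#     download_nltk_resources()
```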
To do:
- Test other bag-of-words models
- Test other supervised learning models (linear and non-linear)
- Evaluate the results (accuracy, recall and so on)
- Determine which model gives the best accuracy (and why!). See https://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/
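
The evaluation item in the list above could start from something like the sketch below. It rests on assumptions that the repo does not state: sentiment scores are real numbers, and a headline counts as positive when its score is above a threshold of 0. The function names and the sample scores are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_and_recall(y_true, y_pred, threshold=0.0):
    """Accuracy and positive-class recall after thresholding the scores."""
    # Assumption: a score above the threshold means "positive sentiment".
    true_labels = [t > threshold for t in y_true]
    pred_labels = [p > threshold for p in y_pred]
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    true_positives = sum(t and p for t, p in zip(true_labels, pred_labels))
    positives = sum(true_labels)
    accuracy = correct / len(true_labels)
    # Recall is undefined without positive examples, so report None then.
    recall = true_positives / positives if positives else None
    return accuracy, recall


# Made-up scores, for illustration only.
y_true = [0.5, -0.25, 0.75, -0.5]
y_pred = [0.25, 0.25, 0.5, -0.75]
print(mean_squared_error(y_true, y_pred))
print(accuracy_and_recall(y_true, y_pred))
```

Mean squared error judges the predicted scores directly, while accuracy and recall only make sense once the scores are turned into labels, which is why the threshold is an explicit parameter.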