GitHub - DiaaMohsen/sentiment_analysis-on_movie_reviews_kaggle

Sentiment Analysis on Movie Reviews Classify the sentiment of sentences from the Rotten Tomatoes dataset on kaggle

-On test.tsv attached in competition, i got overall accuracy 0.61835

I got overall accuracy: 0.686616258704 on testset extracted from splitting train.tsv precision recall f1-score support NEGATIVE 0.78 0.54 0.63 3011 SOMEWHAT NEGATIVE 0.48 0.61 0.54 6466 NEUTRAL 0.81 0.75 0.78 25868 SOMEWHAT POSITIVE 0.52 0.66 0.58 7755 POSITIVE 0.79 0.58 0.67 3718
```
avg / total       0.71      0.69      0.69     46818
```

I built 3 classifers using [LinearSVC with CountVectorizer] as follows:

Classifier-1: classify {NEG, NEU, POS} on normalized sentiment values: {-1, 0, 1}
Classifier-2: classify {NEG, SMW NEG} on sentiment values: {0, 1}
Classifier-3: classify {SMW POS, POS} on sentiment values: {4, 5}

How does it work?

classify data with classifier-1 if data labeled with -1: classify it with classifier-2 if data labeled with 1: classify it with classifier-3 if data labeled with 0: no thing to run

Which models i used?

ToDO:

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.ipynb_checkpoints		.ipynb_checkpoints
data		data
pickled_files		pickled_files
.gitignore		.gitignore
3_classifiers.ipynb		3_classifiers.ipynb
3classes_sent.csv		3classes_sent.csv
README.md		README.md
utils.py		utils.py
utils.pyc		utils.pyc

Provide feedback