Luis Rei [email protected] @lmrei http://luisrei.com
Learns a multiclass classifier (OneVsRest) based on word n-grams.
Uses scikit-learn. Reads input from TSV files. Text is expected to be tokenized already.
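A minimal sketch of the input side, using only the standard library rather than the script's own code. The column layout (label first, text second), the n-gram orders, and the names `read_tsv` and `word_ngrams` are assumptions made for illustration; the source does not specify them.

```python
import csv
import io


def word_ngrams(tokens, orders=(1, 2)):
    """Return word n-grams (joined by spaces) for each order in `orders`."""
    grams = []
    for n in orders:
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def read_tsv(handle, label_col=0, text_col=1):
    """Yield (label, tokens) pairs from a TSV stream.

    The column positions are assumptions. The text is split on whitespace
    only, because it is expected to be tokenized already.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= max(label_col, text_col):
            continue  # skip blank or malformed lines
        yield row[label_col], row[text_col].split()


# Hypothetical two-line input standing in for a TSV file on disk.
sample = io.StringIO("positive\tgreat movie !\nneutral\tit was a movie .\n")
rows = list(read_tsv(sample))
grams = word_ngrams(rows[0][1])
```

Splitting on whitespace instead of running a tokenizer keeps punctuation such as `!` as its own token, which only works if the input was tokenized beforehand.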
Original paper (binary classifier): Sida Wang and Christopher D. Manning: Baselines and Bigrams: Simple, Good Sentiment and Topic Classification; ACL 2012. http://nlp.stanford.edu/pubs/sidaw12_simple_sentiment.pdf
Based on a work at https://github.com/mesnilgr/nbsvm: Naive Bayes SVM by Grégoire Mesnil
The second version is an initial attempt to use numpy/scipy/sklearn more.
No guarantee that this works or is correctly implemented.
- I modified Mesnil's code to handle multiple classes because my data included a neutral sentiment class.
- This does not use the MNB + SVM ensemble from the original paper, only the SVM part (same as Mesnil).
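The one-vs-rest extension described above can be sketched as follows: for each class, the log-count ratio from Wang and Manning is computed with that class as the positive side and all other classes as the negative side. This is a pure-Python illustration under stated assumptions, not the script's implementation. The function name `log_count_ratios`, the smoothing value `alpha=1.0`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def log_count_ratios(docs, labels, alpha=1.0):
    """Return {class: {ngram: r}} with one-vs-rest log-count ratios.

    For each class, p counts n-grams in documents of that class and q
    counts n-grams in all other documents, both smoothed by `alpha`:
        r = log((p / ||p||_1) / (q / ||q||_1))
    """
    vocab = sorted({g for doc in docs for g in doc})
    ratios = {}
    for cls in sorted(set(labels)):
        pos, neg = Counter(), Counter()
        for doc, label in zip(docs, labels):
            (pos if label == cls else neg).update(doc)
        p_total = sum(pos[g] + alpha for g in vocab)
        q_total = sum(neg[g] + alpha for g in vocab)
        ratios[cls] = {
            g: math.log(((pos[g] + alpha) / p_total)
                        / ((neg[g] + alpha) / q_total))
            for g in vocab
        }
    return ratios


# Toy corpus: each document is already a list of n-grams.
docs = [["good", "film"], ["bad", "film"], ["a", "film"]]
labels = ["positive", "negative", "neutral"]
ratios = log_count_ratios(docs, labels)
```

In an NB-SVM setup, each class's ratios would then weight the n-gram features fed to a linear SVM for that class, for example scikit-learn's `LinearSVC` wrapped in `OneVsRestClassifier`. How the script itself wires this together is not stated in the text above.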
Multiclass Naive Bayes SVM (NB-SVM) by Luis Rei is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.