- In sqlscript.py (line 60), edit the connection line db_con = MySQLdb.Connect(, , , 'tweets'), filling in the three blank arguments (MySQL host, user, and password)
- Run sqlscript.py. This creates a database named tweets with several tables
- Run utility/create_stop_words.py standalone to generate stop words using IDF (inverse document frequency)
- The training dataset is hard-coded in this script; change it there to use a new dataset
- It produces the file Lists/StopWordsIDF.txt (no need to regenerate it if the file already exists)
- Edit collector.config, which also holds the query terms. Currently the query has to be run manually by running Collector.py
- Once the query is done, run sentinal.py (with updated SQL info)
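
The IDF-based stop word step mentioned above can be illustrated with a minimal sketch. This is not the code in utility/create_stop_words.py; the function name `idf_stop_words`, the `max_idf` threshold, and the sample tweets are all assumptions made for illustration. The idea is that words appearing in almost every document have a low IDF and are treated as stop words.

```python
import math
from collections import Counter


def idf_stop_words(documents, max_idf=0.5):
    """Return words whose IDF is at or below max_idf (hypothetical threshold)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        # Count each word at most once per document.
        doc_freq.update(set(text.lower().split()))
    # IDF = log(N / document frequency); common words score close to 0.
    idf = {word: math.log(n_docs / df) for word, df in doc_freq.items()}
    return sorted(word for word, score in idf.items() if score <= max_idf)


# Made-up sample data, standing in for the hard-coded training dataset.
tweets = [
    "the movie was great",
    "the food was awful",
    "i love the weather",
    "was it the best day",
]
stop_words = idf_stop_words(tweets, max_idf=0.5)
print(stop_words)
```

In a real run the resulting list would be written out once (for example to Lists/StopWordsIDF.txt) and reused, which matches the note that the file does not have to be recreated if it exists.
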
- Stemming
- Pivoting
- Bi-grams
- Confidence of prediction
- Neutral class
- Max entropy classifier
- MI of feature (and feature selection based on this)
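
As an illustration of the last item, here is a hedged sketch of mutual information (MI) between a word's presence and the class label, followed by a simple top-k feature selection. The function name `mutual_information`, the toy documents, and the choice of k are assumptions for illustration, not taken from this project's code.

```python
import math


def mutual_information(docs, labels, word):
    """MI in bits between presence of `word` and the class label."""
    n = len(docs)
    present = [word in doc.lower().split() for doc in docs]
    mi = 0.0
    for f in (True, False):
        p_f = sum(1 for p in present if p == f) / n
        for c in set(labels):
            p_c = labels.count(c) / n
            joint = sum(1 for p, l in zip(present, labels) if p == f and l == c) / n
            if joint > 0:
                # Zero-probability cells contribute nothing to the sum.
                mi += joint * math.log2(joint / (p_f * p_c))
    return mi


# Made-up toy data with two classes.
docs = ["good movie", "good food", "bad movie", "bad food"]
labels = ["pos", "pos", "neg", "neg"]

mi_good = mutual_information(docs, labels, "good")
mi_movie = mutual_information(docs, labels, "movie")

# Feature selection: keep the k words with the highest MI (k is arbitrary here).
vocabulary = sorted({w for d in docs for w in d.split()})
top_features = sorted(
    vocabulary, key=lambda w: mutual_information(docs, labels, w), reverse=True
)[:2]
print(mi_good, mi_movie, top_features)
```

Here "good" and "bad" fully determine the label and so carry the most information, while "movie" and "food" are spread evenly across both classes and carry none.
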