Goal:
- Match news articles covering the same event
- Find the text common to those articles and remove it
- Cluster the "residual" words via topic modeling (sketched below)
- Apply the learned "topics" to all residuals from a news source to estimate its "bias"
Setup & Installation:
pip install -r requirements.txt
python -c "import nltk; nltk.download()"
Then choose the "popular" collection in the downloader.
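If you prefer to skip the interactive downloader, the same collection can be fetched in one line:

    import nltk
    nltk.download("popular")  # non-interactive download of the "popular" collection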
NLP Demo
This demo shows stemming, lemmatization, and word counting (including tf-idf).
python nlp_demo.py
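For reference, the techniques in the demo look roughly like this (an illustrative sketch, not the contents of nlp_demo.py; tf-idf here uses scikit-learn):

    from nltk.stem import PorterStemmer, WordNetLemmatizer
    from sklearn.feature_extraction.text import TfidfVectorizer

    print(PorterStemmer().stem("running"))         # 'run'
    print(WordNetLemmatizer().lemmatize("geese"))  # 'goose' (needs the wordnet corpus)

    docs = ["the markets fell today", "markets rallied after the news"]
    tfidf = TfidfVectorizer()
    matrix = tfidf.fit_transform(docs)   # rows = documents, columns = tf-idf weights
    print(tfidf.get_feature_names_out())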
Downloading data
Run
python registry_data.py
You can tweak parameters, such as the minimum number of articles per event or the API key, inside the script; a sketch of what these look like follows.
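The tunable values look roughly like this (names are hypothetical; check the constants actually defined in registry_data.py):

    # Hypothetical parameter names -- edit the real ones inside registry_data.py.
    MIN_ARTICLES_PER_EVENT = 5         # skip events with too little coverage
    API_KEY = "your-api-key-here"      # key for the news API the script queries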
Modeling
python main.py
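The clustering step mentioned under Results uses LDA; a minimal standalone sketch of that idea (not main.py itself) looks like:

    # Toy LDA over residual-word "documents" using scikit-learn.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    residual_docs = ["sharply panic fear", "triumphant celebration joy", "panic dread fear"]
    counts = CountVectorizer().fit_transform(residual_docs)
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(counts)   # per-document topic mixtures
    print(doc_topics.round(2))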
Viewing Results
At the end of the modeling process, a 3D graph is generated to visualize the results.
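A 3D scatter of topic mixtures can be produced with matplotlib; this is a generic sketch (assuming three topic dimensions), not the repo's plotting code:

    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    xs, ys, zs = [0.1, 0.7], [0.8, 0.2], [0.1, 0.1]   # toy per-source topic mixtures
    ax.scatter(xs, ys, zs)
    ax.set_xlabel("topic 1"); ax.set_ylabel("topic 2"); ax.set_zlabel("topic 3")
    plt.show()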
Results
- Found common words across news articles within an event.
- When clustering "residual" words via LDA, many emotion words appear.
- Sources did not separate by topic.
- Possible interpretation: sources use emotional words to describe the news, but not consistently across events.
Future Directions
- Model news-source bias within a particular topic
- Boost / attenuate emotion words via sentiment analysis
- See if there’s bias by author
- Include & apply fake news dataset.