
NLP on deep-learning-papers

This is an attempt at topic modelling on the top 100 papers from the GitHub repo awesome-deep-learning-papers.

The repo lists 100 papers, but during crawling with the script, access to one of them (Human-level control through deep reinforcement learning) was blocked, so only 99 PDFs were downloaded.

A script then uses pdftotext to convert the PDFs to plain text.
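A minimal sketch of that conversion step, assuming the pdftotext CLI (from poppler-utils) is installed; the function names and the texts/ output directory are illustrative, not the repo's actual script:

```python
import subprocess
from pathlib import Path


def pdftotext_cmd(pdf_path: str, out_dir: str = "texts") -> list[str]:
    # Build the pdftotext command line: input PDF, output .txt with the same stem.
    out = Path(out_dir) / (Path(pdf_path).stem + ".txt")
    return ["pdftotext", pdf_path, str(out)]


def convert(pdf_path: str, out_dir: str = "texts") -> None:
    # Run the conversion, raising if pdftotext exits with an error.
    Path(out_dir).mkdir(exist_ok=True)
    subprocess.run(pdftotext_cmd(pdf_path, out_dir), check=True)
```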
In find_topics.py, all plain-text files are concatenated into papers.txt, which is about 4 MB, i.e. roughly 4,000,000 characters of data.
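The concatenation step could be sketched like this; the helper name and the .txt glob pattern are assumptions, not taken from find_topics.py:

```python
from pathlib import Path


def concatenate_texts(text_dir: str, out_file: str = "papers.txt") -> int:
    # Read every .txt file in text_dir (in sorted order for reproducibility),
    # join them into one corpus file, and return the total character count.
    parts = [p.read_text(encoding="utf-8") for p in sorted(Path(text_dir).glob("*.txt"))]
    corpus = "\n".join(parts)
    Path(out_file).write_text(corpus, encoding="utf-8")
    return len(corpus)
```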
The gensim library is used because it is tailored for topic modelling. The resulting topics are visualized with the pyLDAvis library and saved as an .html file.