An analysis of COVID-19 abstracts using NLP
An analysis of approximately 45,750 journal articles (metadata.csv) pertaining to different coronaviruses and respiratory diseases. The abstract data is cleaned and analyzed using TF-IDF, k-means clustering, and hierarchical clustering. Trends in the abstracts are then investigated further, and the insights gained are used to draw conclusions about policy and health care decisions.
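
To make the pipeline described above more concrete, here is a minimal sketch of the TF-IDF and k-means steps using only the Python standard library. It is an illustration under stated assumptions, not the project's actual code: the four toy abstracts, the function names (`tokenize`, `tfidf`, `kmeans`), the smoothed IDF formula, and the deterministic "first k documents" initialization are all hypothetical choices made for this example. The project itself may well use a library implementation, and hierarchical clustering is not shown here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simplistic cleaning step)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return (vocab, vectors), where each vector is an L2-normalized TF-IDF row."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k vectors (assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy stand-ins for cleaned abstracts (hypothetical, not taken from metadata.csv).
docs = [
    "coronavirus spike protein binding receptor",
    "hospital policy ventilator capacity planning",
    "spike protein receptor structure coronavirus",
    "ventilator capacity hospital staffing policy",
]

vocab, vectors = tfidf(docs)
labels = kmeans(vectors, k=2)
print(labels)
```

On real data, the abstracts would be read from the abstract column of metadata.csv, and a random or k-means++ initialization would normally replace the fixed one used here for reproducibility.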
This project was completed as a requirement for a University of Toronto Data Science and Analytics course.