rachan1637/detective

Text Data Mining for Detective Stories

This repo was created for the STA490 (Statistical Consulting) course project. In this project, we work as statistical consultants, carrying out the analysis for our collaborators and submitting a final report at the end of the course.

My collaborators are from the English Literature Department. They provided an annotated dataset with many categorical variables, together with the text of hundreds of detective stories, and hoped we could find some interesting facts in the data.

I focus on text data mining to discover interesting facts in the text of the detective stories.

The final report focuses on the evolution of detective stories. We try to uncover this evolution by clustering the stories on their tf-idf embeddings, by lexicon-based sentiment analysis, and by hypothesis testing.
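As a rough illustration of two of the techniques named above, here is a minimal sketch in plain Python. It is not the code used for the report: the tokenizer, the three toy sentences, and the tiny sentiment lexicon are all made up for this example, and a real analysis would more likely rely on an established library and a published lexicon. It computes tf-idf weights as term frequency times `log(N / df)`, compares stories by cosine similarity (the kind of distance a clustering step could build on), and scores sentiment as the average lexicon value per token.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter (crude on purpose)."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sentiment_score(text, lexicon):
    """Average lexicon value per token; words missing from the lexicon count as 0."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(lexicon.get(t, 0) for t in tokens) / len(tokens)


# Toy inputs, invented for this sketch (not taken from the project dataset).
docs = [
    "The detective found the stolen jewels in the library.",
    "The inspector found the stolen necklace in the study.",
    "A cheerful wedding party danced happily in the garden.",
]
lexicon = {"stolen": -1, "cheerful": 1, "happily": 1}  # hypothetical lexicon

vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # two crime scenes share informative terms
sim_02 = cosine(vecs[0], vecs[2])  # only share words that occur in every story

print(sim_01 > sim_02)
print([round(sentiment_score(d, lexicon), 3) for d in docs])
```

Words that occur in every story (here "the" and "in") get an idf of zero, so they do not contribute to the similarity; only terms that distinguish some stories from others do.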

There is more I would like to do after the project ends. In particular, I am strongly interested in topic modeling for this data, and I may implement the idea in summer 2022.
