A Sentiment Analysis of COVID-19 Tweets and its effects on Likeability and Retweet-ability

Data Analytics (Capstone) Project for Ryerson University CIND 820, Fall 2020

Overview

This capstone project was created for Ryerson University course CIND 820 - Big Data Analyst Project.

In this project, we used the NLTK library to create new sentiment features and then built regression models to test whether a tweet's sentiment score contributes to its retweets or likes. The goal was to better understand whether a tweet's sentimentality influences how it resonates with other readers.
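As a rough illustration of what a sentiment feature looks like, the sketch below attaches a score to each tweet. It is not the project's code: the README does not say which NLTK analyzer was used, so a tiny hand-made lexicon (`TOY_LEXICON`) stands in for it, and the names `sentiment_score` and `add_sentiment_feature` are hypothetical.

```python
import re

# Hypothetical toy lexicon, used only for illustration.
# The project itself relied on NLTK rather than a hand-made word list.
TOY_LEXICON = {
    "good": 1.0, "great": 2.0, "safe": 1.0,
    "bad": -1.0, "terrible": -2.0, "sick": -1.0,
}

def sentiment_score(text):
    """Average lexicon value over the matched words; 0.0 when nothing matches."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def add_sentiment_feature(tweets):
    """Return copies of the tweet dicts with an added 'sentiment' key."""
    return [dict(t, sentiment=sentiment_score(t["text"])) for t in tweets]

# Made-up example rows, not taken from the dataset.
rows = add_sentiment_feature([
    {"text": "Great news, stay safe", "favorite_count": 12},
    {"text": "Terrible day", "favorite_count": 3},
])
```

The resulting `sentiment` value can then sit next to `favorite_count` and `retweet_count` as an input to the regression models.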

Approach

The approach is summarized as follows:

Download and Prepare Dataset
Hydrate Dataset
Preload Setup
Import Dataset
Perform Data Cleaning
Conduct Sentiment Analysis
Conduct Basic Analysis
Build Prediction Model for "Favourite Count" with Linear Regression
Build Prediction Model for "Retweet Count" with Linear Regression
Build Linear Regression with k-Fold
Build Polynomial Regression
Build Polynomial Regression with k-Fold
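
The cleaning and k-fold linear regression steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the names `clean_tweet`, `fit_line`, and `k_fold_mse` are hypothetical, the cleaning rules are assumed, and the single-feature least-squares fit stands in for whatever regression library the project used.

```python
import re

def clean_tweet(text):
    """Strip URLs, @mentions and '#' signs, then collapse whitespace (assumed rules)."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = text.replace("#", " ")
    return re.sub(r"\s+", " ", text).strip()

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def k_fold_mse(xs, ys, k=5):
    """Mean squared error on each of k held-out folds (assumes len(xs) >= k)."""
    n = len(xs)
    pairs = list(zip(xs, ys))
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th row is held out in this fold
        train = [p for i, p in enumerate(pairs) if i not in test_idx]
        test = [p for i, p in enumerate(pairs) if i in test_idx]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        errors = [(slope * x + intercept - y) ** 2 for x, y in test]
        scores.append(sum(errors) / len(errors))
    return scores

# Made-up data: a perfectly linear relation, so every fold error is near zero.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
fold_errors = k_fold_mse(xs, ys, k=5)
```

Averaging `fold_errors` gives a single cross-validated error that can be compared between the linear and polynomial models.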

Results

Our regression models were able to predict favorite_count and retweet_count to some extent. However, based on a sample of the scatterplots, there is reasonable doubt that the models yield meaningful predictions when favorite_count is below 20 or retweet_count is below 25. favorite_count and retweet_count are highly correlated with each other: each has the highest coefficient in, and contributes significantly to, the other's prediction model.
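
The correlation between the two counts can be checked with a Pearson coefficient. The sketch below is an assumption about how such a check might look, not the report's method: `pearson` is a hypothetical helper and the numbers are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up counts for illustration only, not values from the dataset.
favorite_count = [1, 2, 3, 4]
retweet_count = [2, 4, 6, 8]
r = pearson(favorite_count, retweet_count)
```

A value of `r` close to 1 indicates that the two counts rise together, which is consistent with each being a strong predictor of the other.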

For more details, read AndyLee-FinalReport.docx.

.gitattributes
AndyLee-FinalReport.docx
AndyLee-FinalReport.pdf
AndyLee-NLP.html
AndyLee_NLP.ipynb
PPT-AndyLee.pdf
README.md

danieljai/CIND820-Capstone-SentimentalAnalysis
