victoryg739/Data-Science-SC1015-MiniProject

SC1015-MiniProject

Contributors

  • @victoryg739 - Classification Model, EDA, Data Preparation and Data Cleaning, Data Extraction and Slides

  • @Nivlek06 - Practical Motivation, EDA, Slides and Conclusion

  • @Roseus9 - Practical Motivation, EDA, Slides and Conclusion


Problem Definition

  • Can we predict stock prices with online comments/trends on social media?

Solution Approach

  • Use a classification model to check whether r/WallStreetBets posts can predict stock price movements.

Models Used

  1. VADER Sentiment Analysis
  2. Gradient Boosted Tree
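
This README does not show how the two models are wired together, so the following is only a minimal sketch under stated assumptions. It assumes the `vaderSentiment` package for scoring and scikit-learn's `GradientBoostingClassifier` for the tree model; the project may have used different libraries. The function names and the 0.05 cutoff (a common convention with VADER compound scores) are illustrative and not taken from the project.

```python
def compound_score(text):
    """Return VADER's compound sentiment score in [-1, 1] for one post."""
    # Assumption: the vaderSentiment package is installed. It is imported
    # lazily so the pure helper below can be used without it.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


def sentiment_label(compound, threshold=0.05):
    """Map a compound score to a coarse label using a symmetric cutoff."""
    # Assumption: 0.05 is a conventional cutoff, not the project's setting.
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def fit_direction_classifier(features, went_up):
    """Fit a gradient boosted tree classifier on per-day feature rows.

    features: list of numeric feature lists, e.g. [[mean_compound, n_posts]]
    went_up:  list of 0/1 labels, 1 if the price rose on the following day
    """
    # Assumption: scikit-learn is installed and stands in for whichever
    # gradient boosting implementation the project actually used.
    from sklearn.ensemble import GradientBoostingClassifier
    model = GradientBoostingClassifier(random_state=0)
    model.fit(features, went_up)
    return model
```

In this sketch the sentiment scores become input features and the price direction becomes the class label, which is one plausible reading of the approach described above.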

Conclusion

We took a closer look at how r/WallStreetBets interacts with the stock market. We examined activity on the subreddit and analysed whether there was any significant correlation between the sentiment scores of stocks and their price movements. We found that the classification model does provide some predictive power on our dataset.

To test the model further, we recommend forward testing it on real-time market data and evaluating it on more recent r/WallStreetBets data.
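
One way to approximate forward testing on historical data is to split by date, so that the model is fitted on earlier posts and evaluated only on later ones. The sketch below is an assumption about how that could look, not the project's code; `chronological_split`, the row layout, and the sample values are all hypothetical.

```python
from datetime import date


def chronological_split(rows, cutoff):
    """Split (day, features, label) rows into before-cutoff and on-or-after-cutoff sets."""
    train = [row for row in rows if row[0] < cutoff]
    test = [row for row in rows if row[0] >= cutoff]
    return train, test


# Made-up rows for illustration only: (day, [mean_compound], went_up)
rows = [
    (date(2021, 1, 4), [0.21], 1),
    (date(2021, 1, 5), [-0.10], 0),
    (date(2021, 2, 1), [0.35], 1),
    (date(2021, 2, 2), [0.02], 0),
]

train, test = chronological_split(rows, cutoff=date(2021, 2, 1))
```

Unlike a random split, this keeps information from later dates out of the training set, which is closer to how the model would be used on live data.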


What did we learn from this project?

  • Natural Language Processing (NLP) based on VADER (Valence Aware Dictionary and sEntiment Reasoner)
  • Classification model using a gradient boosted tree
  • Pearson correlation (r-value and p-value)
  • Data cleaning for sentiment analysis (lemmatization, stopword removal, etc.)
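
As a reminder of what the r-value measures, here is a minimal pure-Python sketch of the Pearson correlation coefficient. The function name `pearson_r` is hypothetical and this is not the project's code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

In practice, `scipy.stats.pearsonr` returns both the r-value and a two-sided p-value, so it is the more convenient choice when significance is also needed.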
