Home
Joseph Lai edited this page May 20, 2021 · 21 revisions
 __  __  _ __   ____
/\ \/\ \/\`'__\/',__\
\ \ \_\ \ \ \//\__, `\
 \ \____/\ \_\\/\____/
  \/___/  \/_/ \/___/... Universal Reddit Scraper
This is a comprehensive Reddit scraping tool that integrates multiple features:
- Scrape Reddit via PRAW (the official Python Reddit API Wrapper)
  - Scrape Subreddits
  - Scrape Redditors
  - Scrape submission comments
- Livestream Reddit via PRAW
  - Livestream submissions submitted within Subreddits or by Redditors
  - Livestream comments submitted within Subreddits or by Redditors
  - Livestream trending submissions within Subreddits
- Scrape Reddit via the Pushshift API
  - Search for keywords in all publicly available submissions
  - Search for keywords in all publicly available comments
- Analytical tools for scraped data
  - Generate frequencies for words that are found in submission titles, bodies, and/or comments
  - Generate a wordcloud from scrape results
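To give a feel for what the word-frequency tool produces, here is a minimal sketch in plain Python. It is not URS's own implementation; the function name `word_frequencies` and the sample strings are made up for illustration, and the tokenization rule (lowercase runs of letters, digits, and apostrophes) is an assumption.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count case-insensitive word occurrences across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and treat runs of letters, digits, and
        # apostrophes as words (an assumed tokenization rule).
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts


# Hypothetical sample data standing in for scraped titles and comments.
sample = ["Reddit scraper released", "A scraper for Reddit comments"]
freqs = word_frequencies(sample)

# Show each word with its count, most frequent first.
for word, count in freqs.most_common():
    print(f"{word}: {count}")
```

A `Counter` keeps the result easy to sort or export, which is roughly the shape of data a wordcloud generator would also consume.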
You can scrape Reddit with or without API credentials; however, I strongly advise taking some time to get your credentials in order to take advantage of the full suite of tools available within URS.
Here is a table describing which tools do or do not require API credentials:
Requires Credentials (PRAW) | Does Not Require Credentials (Pushshift) |
---|---|
Scrape Subreddits | Search for keywords in submissions |
Scrape Redditors | Search for keywords in comments |
Scrape submission comments | |
Livestream Subreddits | |
Livestream Redditors | |
Livestream trending submissions within Subreddits | |
See the Getting Started section to get your API credentials.
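The credential requirements in the table can also be read as a simple lookup. The sketch below transcribes the table into a dictionary and lists which tools are usable with or without credentials. The names `TOOLS`, `REQUIRES_CREDENTIALS`, and `available_tools` are hypothetical and do not come from URS itself.

```python
# Tools and their backends, transcribed from the table above.
TOOLS = {
    "Scrape Subreddits": "PRAW",
    "Scrape Redditors": "PRAW",
    "Scrape submission comments": "PRAW",
    "Livestream Subreddits": "PRAW",
    "Livestream Redditors": "PRAW",
    "Livestream trending submissions within Subreddits": "PRAW",
    "Search for keywords in submissions": "Pushshift",
    "Search for keywords in comments": "Pushshift",
}

# Per the table: PRAW tools need API credentials, Pushshift tools do not.
REQUIRES_CREDENTIALS = {"PRAW": True, "Pushshift": False}


def available_tools(has_credentials):
    """Return the tool names usable given whether credentials are set up."""
    return [
        name
        for name, backend in TOOLS.items()
        if has_credentials or not REQUIRES_CREDENTIALS[backend]
    ]


# Without credentials, only the Pushshift-backed searches remain.
for tool in available_tools(has_credentials=False):
    print(tool)
```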
NOTE: Requires Python 3.7+
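If you are unsure which interpreter `pip3` and `python3` point to, a quick check like the following can confirm that the version requirement is met before installing. This is only an illustrative sketch; the helper name `meets_minimum` is hypothetical and not part of URS.

```python
import sys

# Minimum version stated in the note above.
MINIMUM_VERSION = (3, 7)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if meets_minimum():
    print("Python version is new enough for URS.")
else:
    print("URS requires Python 3.7 or newer.")
```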
git clone --depth=1 https://github.com/JosephLai241/URS.git
cd URS
pip3 install . -r requirements.txt
You may run into an error that looks like this:
Traceback (most recent call last):
  File "/home/joseph/URS/urs/./Urs.py", line 30, in <module>
    from urs.utils.Logger import LogMain
ModuleNotFoundError: No module named 'urs'
This means you will need to add the `URS` directory to your `PYTHONPATH`. Here is a link that explains how to do so for each operating system.
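To see why the import fails, it can help to check whether the cloned directory is actually on Python's module search path. The sketch below does that and, if needed, adds the directory for the current process only. The clone location (`~/URS`) and the helper name `on_search_path` are assumptions for illustration; setting `PYTHONPATH` in your shell remains the persistent fix.

```python
import sys
from pathlib import Path


def on_search_path(directory, search_path=None):
    """Return True if `directory` is an entry Python searches for imports."""
    entries = sys.path if search_path is None else search_path
    target = Path(directory).resolve()
    # An empty entry in sys.path stands for the current working directory.
    return any(Path(entry or ".").resolve() == target for entry in entries)


# Hypothetical clone location; substitute wherever you ran `git clone`.
urs_root = Path.home() / "URS"

if not on_search_path(urs_root):
    # This only affects the running interpreter, not future sessions.
    sys.path.insert(0, str(urs_root))

print(on_search_path(urs_root))
```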