
How to use this package? 👀

  • Install the package using pip:
pip install scrape-up
  • Scrape the required information; for example, extract the number of followers of a GitHub user:
# Import the required module
from scrape_up import github

# Instantiate an object with the username provided.
user = github.Users(username="nikhil25803")

# Call the followers function
print(user.followers())

# Output - '59'

The platforms and methods we cover 💫

  • GitHub
  • GitLab
  • Instagram
  • Internshala
  • KooApp
  • Medium
  • Hacker News
  • Twitter
  • Leetcode
  • Finance
  • IMDb
  • Coursera
  • Wikipedia
  • Amazon
  • Amazon-Kindle Bookstore
  • Flipkart
  • Spotify
  • Ask Ubuntu
  • EazyDiner
  • Stack Overflow
  • Tech Crunch
  • YouTube
  • Google News
  • TimesJobs
  • Dev Community
  • Goodreads

GitHub

from scrape_up import github

Scrape User details

First, create an object of class Users

user = github.Users(username="nikhil25803")
| Methods | Details |
| ------- | ------- |
| `.followers()` | Returns the number of followers of a user. |
| `.following()` | Returns the number of users a user is following. |
| `.get_avatar()` | Returns the avatar URL of a user. |
| `.get_bio()` | Returns the bio of a user. |
| `.get_repo()` | Returns the list of pinned repositories of a user. |
| `.repo_count()` | Returns the number of repositories of a user. |
| `.star_count()` | Returns the number of stars of a user. |
| `.get_yearly_contributions()` | Returns the number of contributions made in the last 365 days. |
| `.get_repositories()` | Returns the list of repositories of a user. |
| `.get_starred_repos()` | Returns the list of starred repositories of a user. |
| `.pul_requests()` | Returns the number of pull requests opened in a repository. |
| `.get_followers()` | Returns the list of followers of a user. |
| `.get_following_users()` | Returns the list of users followed by a user. |
| `.get_achievements()` | Returns the list of achievements of a user. |
| `.get_status()` | Returns the status of a user. |
| `.get_contribution_streak()` | Returns the maximum contribution streak of a user in the past year, starting from the current date. |
| `.get_repository_details()` | Returns the list of repositories with their details. |
| `.get_branch()` | Returns the list of branches in a repository. |

Example:

bio = user.get_bio() #user var taken from above example
print(bio)

Scrape Repository details

First, create an object of class Repository

repository = github.Repository(username="nikhil25803", repository_name="scrape-up")
| Methods | Details |
| ------- | ------- |
| `.fork_count()` | Returns the number of forks of a repository. |
| `.get_contributors()` | Returns the number of contributors of a repository. |
| `.topics()` | Returns the topics of a repository. |
| `.pull_requests()` | Returns the number of pull requests opened in a repository. |
| `.last_updated_at()` | Returns the last updated date of a repository. |
| `.tags()` | Returns the last ten tags of a repository. |
| `.releases()` | Returns the last ten releases of a repository. |
| `.issues_count()` | Returns the number of issues in a repository. |
| `.readme` | Saves the readme.md file of the given user to the current working directory. To view readme.md with a live server, change ".md" to ".html" in "readme.md". |
| `.get_pull_requests_ids()` | Returns the ids of all open pull requests in a repository. |
| `.get_issues()` | Returns the list of all open issues in a repository. |
| `.commits()` | Returns the number of commits in a repository. |
| `.get_readme()` | Returns and saves the README.md file of the specified repository (if it exists). |
| `.get_environment()` | Returns the latest deployed link of a repository (if it exists). |
| `.watch_count()` | Returns the number of watchers of a repository. |
| `.all_watchers()` | Returns the usernames of all watchers of a repository. |

Example:

fork_count = repository.fork_count() #repository var taken from above example
print(fork_count)

Scrape issue details

First, create an object of class Issue

repository = github.Issue(username="nikhil25803", repository_name="scrape-up", issue_number=59)
| Methods | Details |
| ------- | ------- |
| `.assignees()` | Returns the assignees of an issue. |
| `.labels()` | Returns the labels of an issue. |
| `.opened_by()` | Returns the name of the user who opened the issue. |
| `.title()` | Returns the title of an issue. |
| `.is_milestone()` | Returns the milestone if the issue is part of one, or 'No milestone' if it's not. |
| `.opened_at()` | Returns a string containing the time when the issue was opened, in ISO format. |

Example:

assigned = repository.assignees() # repository var taken from above example
print(assigned)

Scrape pull request details

First, create an object of class PullRequest

repository = github.PullRequest(username="nikhil25803", repository_name="scrape-up", pull_request_number=30)
| Methods | Details |
| ------- | ------- |
| `.commits()` | Returns the number of commits made in a pull request. |
| `.title()` | Returns the title of a pull request. |
| `.labels()` | Returns all the labels of a pull request, or an empty list if there are none. |
| `.files_changed()` | Returns the number of files changed in a pull request. |
| `.reviewers()` | Returns the list of reviewers assigned to a pull request. |

Example:

files_changed = repository.files_changed() # repository var taken from above example
print(files_changed)

Scrape the details of an organization

First, create an object of class Organization

repository = github.Organization(organization_name="Clueless-Community")
| Methods | Details |
| ------- | ------- |
| `.top_topics()` | Returns a list of the most used topics in an organization. |
| `.followers()` | Returns the number of followers of an organization. |
| `.top_languages()` | Returns the top languages used in an organization. |
| `.avatar()` | Returns the avatar URL of an organization. |
| `.repositories()` | Returns the list of repositories of an organization. |
| `.people()` | Returns the list of people in an organization. |
| `.peoples()` | Returns the number of people in an organization. |
| `.get_location()` | Returns the location of an organization. |
| `.repository_details()` | Returns the list of repositories with their details. |
| `.pinned_repository()` | Returns the list of pinned repositories with their details. |
| `.get_organization_links()` | Returns a dictionary of important website links of a community. |

Example:

top = repository.top_topics() # repository var taken from above example
print(top)

GitLab

from scrape_up import gitlab

Scrape user details

First, create an object of the Users class:

user = gitlab.Users(username="example_user")
| Methods | Details |
| ------- | ------- |
| `.get_name()` | Returns the name of the user. |
| `.get_bio()` | Returns the bio of the user. |
| `.get_avatar_url()` | Returns the avatar URL of the user. |
| `.get_repositories()` | Returns a list of repositories owned by the user. |
| `.get_project_details(project_id)` | Returns the details of a specific project owned by the user. |

Example:

name_result = user.get_name()
print("Name:", name_result["data"])
print("Status:", name_result["message"])

Scrape Repository Details

First, create an object of the Repository class:

repository = gitlab.Repository(username="example_user", repository_name="example_repository")
| Methods | Details |
| ------- | ------- |
| `.get_name()` | Returns the name of the repository. |
| `.get_description()` | Returns the description of the repository. |

Example:

name_result = repository.get_name()
print("Repository Name:", name_result["data"])

Scrape Organization Members

First, create an object of the Organization class:

organization = gitlab.Organization(organization_name="example_organization")
| Methods | Details |
| ------- | ------- |
| `.get_members()` | Returns a list of usernames of the members in the organization. |
| `.get_projects()` | Returns a list of project names associated with the organization. |

Example:

members = organization.get_members()
print("Organization Members:", members)

projects = organization.get_projects()
print("Organization Projects:", projects)

Scrape Issues

To scrape information about an issue on GitLab, create an object of the Issue class by providing the following parameters:

  • username: The GitLab username of the repository owner.
  • repository: The name of the repository.
  • issue_number: The number of the issue.

Here's an example of creating an object of the Issue class:

issue = gitlab.Issue(username="example_user", repository="example_repository", issue_number=123)
| Methods | Details |
| ------- | ------- |
| `.get_title()` | Returns the title of the issue. |
| `.get_description()` | Returns the description of the issue. |
| `.get_author()` | Returns the author of the issue. |

Example:

title = issue.get_title()
print("Issue Title:", title["data"])

description = issue.get_description()
print("Issue Description:", description["data"])

author = issue.get_author()
print("Issue Author:", author["data"])

Scrape Pull Requests

To scrape pull request details from GitLab, create an object of the PullRequest class:

pull_request = gitlab.PullRequest(username="example_user", repository="example_repository", pull_request_number=123)
| Methods | Details |
| ------- | ------- |
| `.get_title()` | Returns the title of the pull request. |
| `.get_description()` | Returns the description of the pull request. |
| `.get_author()` | Returns the author of the pull request. |

Example:

title = pull_request.get_title()
print("Pull Request Title:", title)

description = pull_request.get_description()
print("Pull Request Description:", description)

author = pull_request.get_author()
print("Pull Request Author:", author)

Instagram

from scrape_up import instagram

Scrape User details

First, create an object of the class User

user = instagram.User(username="nikhil25803")
| Methods | Details |
| ------- | ------- |
| `.user_details()` | Returns the number of followers of a user. |

Example:

print(user.user_details()) #user var taken from above

Internshala

from scrape_up.internshala.internships import Internships

Scrape Internship details

Create an object of the Internships class:

scraper = Internships()
| Methods | Details |
| ------- | ------- |
| `.internships()` | Scrapes and returns a list of dictionaries representing internships. |

Example:

scraper = Internships()
internships = scraper.internships()
for internship in internships:
    print(internship)

KooApp

from scrape_up import kooapp

Scrape a Koo user's details

Create an instance of the KooUser class.

user = kooapp.KooUser('krvishal')
| Methods | Details |
| ------- | ------- |
| `.get_name()` | Returns the name of the user. |
| `.get_bio()` | Returns the bio of the user. |
| `.get_avatar_url()` | Returns the URL of the first avatar of the user. |
| `.followers()` | Returns the number of followers of a user. |
| `.following()` | Returns the number of people the user is following. |
| `.get_social_profiles()` | Returns all the connected social media profiles of the user. |
| `.get_profession()` | Returns the title/profession of the user. |

Example:

name = user.get_name() # user variable is taken from above example
print(name)

Medium

from scrape_up import medium

Scrape user details

First, create an object of class Users

user = medium.Users(username="nikhil25803")
| Methods | Details |
| ------- | ------- |
| `.get_articles()` | Returns the article titles of the user. |

Example

articles = user.get_articles() #user var taken from above
for article in articles:
    print(article) #For better visibility/readability

Scrape trending articles

| Methods | Details |
| ------- | ------- |
| `.get_trending()` | Returns the trending article titles on Medium. |

Example

trending = medium.Trending()  # Instantiate the Trending class first
print(trending.get_trending())  # Prints the trending titles

Scrape publication details

First, create an object of class Publication

publication = medium.Publication(link="https://....")
| Methods | Details |
| ------- | ------- |
| `.get_articles()` | Returns a list of articles of the given publication. |

Example

articles = publication.get_articles() #publication var taken from above
for article in articles:
    print(article) #For better visibility/readability

Hacker News

from scrape_up import hacker_news

Scrape Hacker News's latest articles

Create an instance of the Article class.

articles = hacker_news.Article()
| Methods | Details |
| ------- | ------- |
| `.articles_list()` | Returns the latest articles along with their links in JSON format. |

Example:

print(articles.articles_list())  # articles var taken from above example

Twitter

from scrape_up import twitter

Scrape user IDs

First, create an object of class TwitterScraper

twitter_scraper = TwitterScraper()
| Methods | Details |
| ------- | ------- |
| `.unametoid(username)` | Returns the numerical_id for a given username. |
| `.idtouname(numerical_id)` | Returns the username for a given numerical_id. |
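
Example (a minimal sketch using the object created above; the username "github" is illustrative):

numerical_id = twitter_scraper.unametoid("github")  # username -> numerical id
print(numerical_id)
print(twitter_scraper.idtouname(numerical_id))  # numerical id -> username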

Leetcode

from scrape_up import leetcode

Scrape user details

First, create an object of class LeetCodeScraper

leetcode_scraper = LeetCodeScraper(username="nikhil25803")

User-Specific Methods - Require a Username

| Methods | Details |
| ------- | ------- |
| `.scrape_rank()` | Scrapes the rank of a user on LeetCode. |
| `.scrape_rating()` | Scrapes the rating of a user on LeetCode. |
| `.get_problems_solved()` | Scrapes the total number of problems solved by a user on LeetCode. |
| `.get_solved_by_difficulty()` | Scrapes the difficulty-wise problems solved by a user on LeetCode. |
| `.get_github_link()` | Scrapes the GitHub link of a user on LeetCode. |
| `.get_linkedin_link()` | Scrapes the LinkedIn link of a user on LeetCode. |
| `.get_community_stats()` | Scrapes the community stats of a user on LeetCode. |

General-Purpose Methods - Do Not Require a Username

| Methods | Details |
| ------- | ------- |
| `.get_problems(difficulty, tags_list, search_key)` | Scrapes the top LeetCode problems matching the given filters. difficulty is a string from ("easy", "medium", "hard"), tags_list is a list of tags, and search_key is a search string. All parameters are optional. |
| `.get_contests()` | Scrapes details of upcoming LeetCode contests. |
| `.get_daily_challenge()` | Scrapes the LeetCode Daily Challenge details. |
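
Example (a minimal sketch; the filter values are illustrative and assume the parameters are accepted as keyword arguments):

# leetcode_scraper var taken from above example
print(leetcode_scraper.scrape_rank())
print(leetcode_scraper.get_problems_solved())

# The general-purpose methods below do not need a username
print(leetcode_scraper.get_problems(difficulty="easy", tags_list=["array"], search_key="sum"))
print(leetcode_scraper.get_daily_challenge())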

Finance

from scrape_up import StockPrice

Scrape stock data

First, create an instance of class StockPrice with stock name and index name.

infosys = StockPrice('infosys','nse')
| Methods | Details |
| ------- | ------- |
| `.get_latest_price()` | Returns the latest stock price of the given stock. |
| `.get_historical_data(from_date, to_date)` | Returns stock prices from from_date to to_date (dates in dd-mm-yyyy format). |

Example

# all data returned in dictionary format
latest_info = infosys.get_latest_price() # infosys var taken from above
historical_data = infosys.get_historical_data('02-05-2023', '31-05-2023')

IMDb

Scrape IMDb Top 250 details

Create an instance of the IMDB class.

top_250 = IMDB()
| Methods | Details |
| ------- | ------- |
| `.top_rated()` | Returns the top-rated movies listed on IMDb. |
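
Example (using the object created above):

print(top_250.top_rated())  # top_250 var taken from above example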

Coursera

from scrape_up import Coursera

Scrape Course Details

Create an object of the Courses class:

scraper = Courses("keyword", page_count)
| Methods | Details |
| ------- | ------- |
| `.titles()` | Returns the titles of the courses. |

Example

# All data returned in dictionary format
javaCourses = Courses("java", 4)  # Keyword, page count
print(javaCourses.titles())

Wikipedia

from scrape_up import Wikipedia

Scrape Wikipedia Details

Create an object of the WikipediaScraper class:

scraper = WikipediaScraper(url)
| Methods | Details |
| ------- | ------- |
| `.scrape()` | Returns the scraped data from Wikipedia. |

Example

# Returning the data
scraped_data = scraper.scrape()
print(scraped_data)

Amazon

Scrape details about a product

Create an instance of the Product class with a product_name property.

product = Product(product_name="watch")
| Methods | Details |
| ------- | ------- |
| `.get_product()` | Returns product data (links). |
| `.get_product_details()` | Returns product details. |
| `.get_product_image()` | Returns the product image. |
| `.customer_review()` | Returns product reviews. |
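
Example (a minimal sketch using the object created above):

# product var taken from above example
print(product.get_product())          # product links
print(product.get_product_details())  # product details
print(product.customer_review())      # product reviews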

Amazon-Kindle Bookstore

Scrape details of a book

Create an instance of the AmazonKindle class.

books = AmazonKindle()
| Methods | Details |
| ------- | ------- |
| `.bestsellers()` | Returns the list of best-selling books on Amazon Kindle. |
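
Example (using the object created above):

print(books.bestsellers())  # books var taken from above example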

Flipkart

Scrape details of products

Create an instance of the Flipkart class.

item = Flipkart()
| Methods | Details |
| ------- | ------- |
| `.TVs()` | Returns the list of TV sets on Flipkart. |
| `.BestsellersBooks()` | Returns the list of bestseller books on Flipkart. |
| `.scrapdatamobiles()` | Returns the list of mobile phones under 50k on Flipkart. |
| `.SportsShoes()` | Returns the list of sports shoes listed on Flipkart. |
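
Example (a minimal sketch using the object created above):

# item var taken from above example
print(item.TVs())
print(item.BestsellersBooks())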

Spotify

Scrape songs

Create an instance of the Spotify class.

scraper = Spotify()
| Methods | Details |
| ------- | ------- |
| `.scrape_songs_by_keyword()` | Returns the list of songs related to the given keyword. |
| `.scrape_homepage()` | Returns the list of playlists on the homepage. |
| `.close()` | Closes the Chrome tab showing the results. |
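
Example (a minimal sketch using the object created above; only the documented methods are called):

# scraper var taken from above example
print(scraper.scrape_homepage())
scraper.close()  # Close the Chrome tab once done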

Ask Ubuntu

Scrape questions, views, votes, answer counts, and descriptions from the Ask Ubuntu website for a given topic

Create an instance of the AskUbuntu class.

questions = AskUbuntu("topic")
| Methods | Details |
| ------- | ------- |
| `.getNewQuestions()` | Returns the new questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getActiveQuestions()` | Returns the active questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getUnansweredQuestions()` | Returns the unanswered questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getBountiedQuestions()` | Returns the bountied questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getFrequentQuestions()` | Returns the frequently asked questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getHighScoredQuestions()` | Returns the most voted questions, views, votes, answer counts, and descriptions in JSON format. |

Example

que = AskUbuntu("github")
scrape = que.getNewQuestions()

EazyDiner

Scrape restaurant names, locations, ratings, cuisines, and prices from the EazyDiner website for a given city

Create an instance of the EazyDiner class.

restaurants = EazyDiner(location="city-name")
| Methods | Details |
| ------- | ------- |
| `.getRestaurants()` | Returns the restaurant names, locations, ratings, cuisines, and prices in JSON format. Check the EazyDiner website for supported cities. |
| `.getBreakfast()` | Returns the restaurant names, locations, ratings, cuisines, and prices in JSON format for breakfast. |
| `.getLunch()` | Returns the restaurant names, locations, ratings, cuisines, and prices in JSON format for lunch. |
| `.getDinner()` | Returns the restaurant names, locations, ratings, cuisines, and prices in JSON format for dinner. |

Example

blr = EazyDiner("south-bengaluru")
scrape = blr.getRestaurants()

Stack Overflow

Scrape questions, views, votes, answer counts, and descriptions from the Stack Overflow website for a given topic

Create an instance of the StackOverflow class.

questions = StackOverflow("topic")
| Methods | Details |
| ------- | ------- |
| `.getNewQuestions()` | Returns the new questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getActiveQuestions()` | Returns the active questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getUnansweredQuestions()` | Returns the unanswered questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getBountiedQuestions()` | Returns the bountied questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getFrequentQuestions()` | Returns the frequently asked questions, views, votes, answer counts, and descriptions in JSON format. |
| `.getHighScoredQuestions()` | Returns the most voted questions, views, votes, answer counts, and descriptions in JSON format. |

Example

que = StackOverflow("github")
scrape = que.getNewQuestions()

Tech Crunch

Scrape articles with title, description, image, date, and link for a given category

Create an instance of the TechCrunch class.

articles = TechCrunch("category")
| Methods | Details |
| ------- | ------- |
| `.getArticles()` | Returns the articles with title, description, image, date, and link in JSON format. |

Example

art = TechCrunch("fintech")
scrape = art.getArticles()

YouTube

Scrape Video

Scrape video details: title, description, view count, upload date, comment count, channel name, channel avatar, subscriber count, and channel URL

Create an instance of the YouTube class.

vid = YouTube("video_url")
| Methods | Details |
| ------- | ------- |
| `.getDetails()` | Returns the video details (title, description, view count, upload date, comment count, channel name, channel avatar, subscriber count, and channel URL) in JSON format. |

Example

git = YouTube("https://www.youtube.com/watch?v=pBy1zgt0XPc")
scrape = git.getDetails()

Google News

Scrape articles with title, description, news source, date, and link for a given topic

Create an instance of the GoogleNews class.

articles = GoogleNews("topic")
| Methods | Details |
| ------- | ------- |
| `.getArticles()` | Returns the articles with title, description, news source, date, and link in JSON format. |

Example

art = GoogleNews("github")
scrape = art.getArticles()

TimesJobs

from timesjobs_scraper import TimesJobs

Scrape Job Details

First, create an object of the TimesJobs class and specify the domain you want to search.

jobs = TimesJobs('example')
| Methods | Details |
| ------- | ------- |
| `.scrape()` | Returns various details about companies for the given job role as JSON data. |

Example:

jobs = TimesJobs('Python')
job_data = jobs.scrape()
if job_data:
    print(job_data)


Dev Community

Scrape:

  • the latest articles from the home page
  • the latest articles based on a tag
  • user data, all articles written by a user, and pinned articles written by a user

Create an instance of the DevCommunity class.

dev = DevCommunity('francescoxx')
| Methods | Details |
| ------- | ------- |
| `.all_articles()` | Returns the latest articles from the DevCommunity home page. |
| `.__strTag__()` | Returns the name of the specified tag whose articles we want returned. |
| `.tag_articles()` | Returns the latest DevCommunity articles with the specified tag. |
| `.__strUser__()` | Returns the username of the user. |
| `.user_details()` | Returns the user details. |
| `.pinned_articles()` | Returns all pinned articles written by the user. |
| `.user_articles()` | Returns all articles written by the user. |

Example

from pprint import pprint

dev = DevCommunity('francescoxx')
articles = dev.all_articles()
pprint(articles)

tag_name = dev.__strTag__(tag='python')
tagged_articles = dev.tag_articles(tag='python')
print(tag_name)
pprint(tagged_articles)

user = dev.__strUser__()
user_detail = dev.user_details()
pin_articles = dev.pinned_articles()
user_article = dev.user_articles()
print(user)
print(user_detail)
pprint(pin_articles)
pprint(user_article)


Goodreads

Scrape all details of a book

Create an instance of the BookScraper class.

# Using book_title and author
book = BookScraper(book_title="The Hunger Games",author="Suzanne Collins")
# or using book_id
book = BookScraper(book_id=2767052)
| Methods | Details |
| ------- | ------- |
| `.get_title()` | Returns the book title. |
| `.get_author()` | Returns the name of the author. |
| `.get_description()` | Returns the description of the book. |
| `.get_genres()` | Returns a list containing the genres of the book. |
| `.get_edition_details()` | Returns all edition details, such as the number of pages, ISBN, etc. |
| `.get_all_details()` | Returns a dictionary containing all the above details. |

Example:

print(book.get_description())
print(book.get_all_details())

Getting the book id of a book from Goodreads:

  1. Go to the web page of the book on Goodreads.
  2. Copy the number from the URL of the book's page, i.e., the number after https://www.goodreads.com/book/show/ and before the book title.
     Example: url -> https://www.goodreads.com/book/show/2767052-the-hunger-games, book id -> 2767052