
Financial Data Scraper: Archiving Yahoo Finance Data for Enhanced Access and Analysis


ahnazary/Finance


Code style: black Imports: isort

Finance



For more details about the project, please refer to the documentation.

Description

This project scrapes stock financial data from Yahoo Finance for a wide range of companies (mostly US- and EU-based) and stores it in a cloud-based database. The database is private and cannot be accessed publicly; scheduled tasks extract financial data through the Yahoo API and load it into the database.

What problem does it solve?

Access to More Data:

  • Yahoo Finance only provides the last 4 quarters or years of financial data for a company. This project solves this by scraping the data from Yahoo Finance every quarter and storing the old records in a database alongside the new ones. The database therefore contains all the financial data for a company since scraping started, which is more than the last 4 quarters or years.
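The accumulation described above can be sketched as an insert that skips periods already archived. This is a minimal illustration, not the project's actual code: it uses in-memory SQLite from the standard library as a stand-in for the Postgres database, and the `financials` table, its columns, the `archive_snapshot` helper, the `XYZ` ticker, and all values are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per (ticker, period_end, metric).
SCHEMA = """
CREATE TABLE IF NOT EXISTS financials (
    ticker     TEXT NOT NULL,
    period_end TEXT NOT NULL,
    metric     TEXT NOT NULL,
    value      REAL,
    PRIMARY KEY (ticker, period_end, metric)
)
"""


def archive_snapshot(conn, rows):
    """Store scraped rows, skip periods already archived, return the row total."""
    conn.execute(SCHEMA)
    # The primary key makes re-inserting an already archived period a no-op.
    conn.executemany("INSERT OR IGNORE INTO financials VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM financials").fetchone()[0]


conn = sqlite3.connect(":memory:")

# Made-up values: each scrape only sees the latest four quarters.
first_scrape = [
    ("XYZ", "2023-03-31", "TotalRevenue", 1.0),
    ("XYZ", "2023-06-30", "TotalRevenue", 2.0),
    ("XYZ", "2023-09-30", "TotalRevenue", 3.0),
    ("XYZ", "2023-12-31", "TotalRevenue", 4.0),
]
# One quarter later the window has moved forward by one period.
next_scrape = first_scrape[1:] + [("XYZ", "2024-03-31", "TotalRevenue", 5.0)]

rows_after_first = archive_snapshot(conn, first_scrape)
rows_after_next = archive_snapshot(conn, next_scrape)
```

After the second scrape the table holds five quarters even though each scrape returned only four. In PostgreSQL the comparable statement would be `INSERT ... ON CONFLICT DO NOTHING`.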

Easy Access to Data via SQL Queries:

  • Yahoo Finance does not provide a way to download all the financial data for a wide range of companies at once. This project solves this by scraping the data from Yahoo Finance and storing it in a Postgres database, where it can be accessed quickly through SQL queries.

Ability to Filter Companies Based on Their Financial Data:

  • Yahoo Finance does not provide a way to filter companies based on their financial data. This project solves this by enabling SQL queries that filter companies on their financial data.
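As an illustration of this kind of filter, the sketch below selects companies whose net income was positive in every archived quarter. It is an assumption-laden example rather than the project's real schema: in-memory SQLite stands in for Postgres, and the `financials` table, the `NetIncome` metric name, the tickers, and the values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical long-format table: one row per ticker, period, and metric.
conn.execute(
    "CREATE TABLE financials (ticker TEXT, period_end TEXT, metric TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO financials VALUES (?, ?, ?, ?)",
    [
        ("AAA", "2023-09-30", "NetIncome", 5.0),
        ("AAA", "2023-12-31", "NetIncome", 7.0),
        ("BBB", "2023-09-30", "NetIncome", -2.0),
        ("BBB", "2023-12-31", "NetIncome", 3.0),
    ],
)

# Keep tickers whose lowest recorded net income is still above zero,
# i.e. companies that were profitable in every archived quarter.
query = """
SELECT ticker
FROM financials
WHERE metric = ?
GROUP BY ticker
HAVING MIN(value) > 0
ORDER BY ticker
"""
profitable = [row[0] for row in conn.execute(query, ("NetIncome",))]
```

Here `profitable` contains only `AAA`, because `BBB` had one loss-making quarter. The same `GROUP BY ... HAVING` query is valid in PostgreSQL; with a driver such as psycopg2 the placeholder would be `%s` instead of `?`.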

Backups

Once a month, the database is backed up and stored as Parquet files in an S3 bucket. The backup job is scheduled using GitHub Actions.
