About this project

The main idea of this project is to build an efficient fundamental data scraper that provides accurately sorted financial information.

In its current state, the scraper is fully functional and is written using the Scrapy library. Data is scraped only from 2011 onward.

The labeling script determines the type of each document and stores the document in the parsed folder.
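As a rough illustration of the labeling step, here is a minimal sketch, not the actual logic of Label.py. The label names, the keyword rules, and the helper names `guess_label` and `label_file` are all assumptions made up for this example; the real script may recognize different document types and decide in a different way.

```python
import shutil
from pathlib import Path

# Hypothetical keyword rules; the real Label.py may use other types and logic.
KEYWORDS = {
    "balance_sheet": ("total assets", "total liabilities"),
    "income_statement": ("net income", "revenues"),
    "cash_flow": ("cash flows", "operating activities"),
}


def guess_label(text: str) -> str:
    """Return the label whose keywords occur most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        label: sum(lowered.count(word) for word in words)
        for label, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def label_file(path: Path, parsed_dir: Path) -> Path:
    """Copy one scraped file into parsed_dir/<label>/ and return the new path."""
    label = guess_label(path.read_text(encoding="utf-8", errors="ignore"))
    target_dir = parsed_dir / label
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(path, target_dir / path.name))
```

Copying rather than moving keeps the scraped originals untouched, so a labeling run can be repeated if the rules change.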

The aggregation script is at a very early stage, and I will be working on it in the upcoming month.

Getting Started

Clone the repository and install the packages listed in requirements.txt using pip.

To scrape data, run scrape.py and pass the symbols of the companies you want to scrape and the year: python scrape.py.
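Since data is only available from 2011 onward, it can help to check the symbols and year before starting a run. The following is a minimal sketch of such a check. It does not describe how scrape.py actually parses its arguments; the names `FIRST_YEAR`, `normalize_symbols`, and `check_year` are hypothetical.

```python
from datetime import date

FIRST_YEAR = 2011  # the README states data is scraped only from 2011 onward


def normalize_symbols(raw_symbols):
    """Strip, upper-case, and de-duplicate ticker symbols, keeping their order."""
    seen = []
    for symbol in raw_symbols:
        cleaned = symbol.strip().upper()
        if cleaned and cleaned not in seen:
            seen.append(cleaned)
    return seen


def check_year(year, today=None):
    """Return year as an int, or raise ValueError if it is out of range."""
    year = int(year)
    current = (today or date.today()).year
    if not FIRST_YEAR <= year <= current:
        raise ValueError(
            f"year must be between {FIRST_YEAR} and {current}, got {year}"
        )
    return year
```

For example, `normalize_symbols([" aapl", "MSFT", "aapl"])` gives `["AAPL", "MSFT"]`, and `check_year(2010)` raises a `ValueError`.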

The scraped files are stored in the scraped folder.

To label all scraped files, run Label.py.

Contribution

I am actively seeking contributors to improve efficiency, structure and functionality.

License

This project is licensed under the terms of the MIT license.

"# SEC-EDGAR-python-scraper"

A note

I am a third-year finance major and have been learning programming for less than a year, so the code may be inefficient and its structure may look out of place, as I am not yet familiar with many programming conventions.

jcwill415/SEC-EDGAR-python-scraper

Folders and files

Latest commit

History

Repository files navigation

About this project

Getting Started

Contribution

License

A note

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages