Scraping Banglapedia Data
In this repository I have built a crawler for extracting all the data from the Banglapedia website.
I have extracted the following details:
- Title of the article.
- Main text body.
- Image URLs, if there are any.
- Source URL.
- Published date of the article.
I have also assigned each record an ID number, which is only used for numbering the data I accessed.
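The fields listed above could be represented roughly as follows. This is only a minimal sketch using a standard-library dataclass (Scrapy 2.2 and later accept dataclass objects as items); the class name `ArticleRecord`, the field names, and the example URL are hypothetical and are not taken from this repository's spider.

```python
from dataclasses import asdict, dataclass, field
from itertools import count
from typing import List, Optional

# Hypothetical running counter, used only to number the extracted records.
_next_id = count(1)


@dataclass
class ArticleRecord:
    """One scraped article (field names are assumptions, not the repo's)."""

    title: str
    body: str
    source_url: str
    published_date: Optional[str] = None  # kept as text, exactly as scraped
    image_urls: List[str] = field(default_factory=list)  # empty if no images
    article_id: int = field(default_factory=lambda: next(_next_id))


# Placeholder values for illustration only.
record = ArticleRecord(
    title="Example title",
    body="Example body text.",
    source_url="https://example.org/article",
)
print(asdict(record))
```

Each new record takes the next number from the counter, so the ID reflects only the order in which the articles were accessed.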
After extracting the information, I saved it to a CSV file; you can also save it to a JSON file. To save the output, run one of the following commands in your terminal:
scrapy crawl bangla -o bangladata.csv (for saving as a CSV file)
or
scrapy crawl bangla -o bangladata.json (for saving as a JSON file)
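As an alternative to passing -o each time, the same outputs can be declared in the project's settings.py through Scrapy's FEEDS setting (available in Scrapy 2.1 and later). The fragment below is a sketch under that assumption; whether this repository's settings.py already defines FEEDS is not stated here. The explicit utf8 encoding is a suggestion for Bangla text, since Scrapy's JSON exporter otherwise writes non-ASCII characters as \uXXXX escapes when no export encoding is configured.

```python
# settings.py fragment (assumed location; file names match the commands above)
FEEDS = {
    "bangladata.csv": {
        "format": "csv",
        "encoding": "utf8",
    },
    "bangladata.json": {
        "format": "json",
        "encoding": "utf8",  # keep Bangla characters readable in the output
    },
}
```

With this in place, running scrapy crawl bangla without -o should write both files in one crawl.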
Requirements: