Booker Scraper

Run pipenv install to install the dependencies.
Run pipenv shell to enter the virtual environment.

Process

1. Login

.env

BOOKER_ACCOUNT=
BOOKER_EMAIL=
BOOKER_PASSWORD=
ASP_NET_SESSION=
ASPXAUTH=

Run login.py whenever the spiders return a non-200 response, then copy the printed values into .env.
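As a rough sketch of how the values in .env might be turned into request cookies: the snippet below parses KEY=VALUE lines and builds a cookie dictionary. It assumes that ASP_NET_SESSION and ASPXAUTH hold the ASP.NET session and auth cookies under their default names (ASP.NET_SessionId and .ASPXAUTH). The helper names parse_env and session_cookies are hypothetical and are not taken from this repository.

```python
# Assumed mapping from the .env keys to ASP.NET's default cookie names.
# The cookie names the site actually expects may differ.
COOKIE_NAMES = {
    "ASP_NET_SESSION": "ASP.NET_SessionId",
    "ASPXAUTH": ".ASPXAUTH",
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def session_cookies(env):
    """Build the cookie dict, failing loudly if a value was never filled in."""
    missing = [key for key in COOKIE_NAMES if not env.get(key)]
    if missing:
        raise ValueError("Run login.py first; missing: " + ", ".join(missing))
    return {cookie: env[key] for key, cookie in COOKIE_NAMES.items()}
```

Raising on an empty value makes a stale or unfilled .env visible before a crawl starts, rather than after the first non-200 response.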

2. Sitemap

The sitemap is currently copied by hand from the side navigation pane. This step could be automated.
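One possible way to automate this, sketched with the standard library only: feed the saved HTML of the navigation pane to a parser and collect each link once, in order. The markup of the real pane is not documented here, so this simply gathers every href on an a tag. NavLinkCollector and extract_nav_links are hypothetical names.

```python
from html.parser import HTMLParser


class NavLinkCollector(HTMLParser):
    """Collect the href of every <a> tag, keeping first-seen order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors without a target and links already recorded.
        if href and href not in self.links:
            self.links.append(href)


def extract_nav_links(html):
    """Return the unique link targets found in an HTML fragment."""
    collector = NavLinkCollector()
    collector.feed(html)
    return collector.links
```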

3. Product List

From the list view, scrape every product code and any other information available: scrapy crawl product_list

Load the output CSV file into the database before moving on to the next step.
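The load could look something like the sketch below, assuming the database is SQLite and the CSV has a header row. The table is created from the header with untyped columns, so no column names are assumed. load_csv is a hypothetical helper, not a script in this repository.

```python
import csv
import sqlite3


def load_csv(conn, table, lines):
    """Insert CSV rows into `table`, creating it from the header if needed.

    `lines` is any iterable of CSV text lines, such as an open file.
    Returns the number of rows inserted.
    """
    reader = csv.reader(lines)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    # Drop malformed rows whose field count does not match the header.
    rows = [row for row in reader if len(row) == len(header)]
    conn.executemany(
        f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})', rows
    )
    conn.commit()
    return len(rows)
```

With a real file this might be called as load_csv(sqlite3.connect("stores.db"), "product_list", open("product_list.csv", newline="")), where the CSV file name is an assumption.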

4. Product Detail

The previous step gives us the product_list table, which we now use to scrape each product page by its code: scrapy crawl product_detail
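To illustrate the lookup described above, here is a minimal sketch that reads the codes from product_list and turns each into a page URL. It assumes the table has a column named code. The URL template is passed in by the caller because the real product URL pattern is not stated in this README. product_urls is a hypothetical helper.

```python
import sqlite3


def product_urls(conn, url_template):
    """Return one URL per distinct product code, in code order.

    `url_template` must contain a `{code}` placeholder.
    """
    rows = conn.execute(
        "SELECT DISTINCT code FROM product_list "
        "WHERE code IS NOT NULL ORDER BY code"
    )
    return [url_template.format(code=code) for (code,) in rows]
```

A spider could feed a list like this into its start requests; de-duplicating in SQL keeps the crawl from fetching the same page twice.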

Load the scraped data into the database.

5. Export the products

Run the barcode.py script to generate a CSV file of all the products in the database.

6. View the database

SQLite views collate data which can be exported to CSV.
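As a sketch of that export, the function below selects everything from a view (or table) and renders it as CSV text with a header row taken from the cursor description. view_to_csv_text is a hypothetical name, and the view names in this repository's database are not listed here, so the caller supplies one.

```python
import csv
import io
import sqlite3


def view_to_csv_text(conn, name):
    """Render every row of a view or table as CSV text, header first."""
    cursor = conn.execute(f'SELECT * FROM "{name}"')
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()
```

Writing the returned text to a file opened with newline="" preserves the line endings produced by the csv module.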



james-innes/booker-scraper
