Uses Playwright and Python to automate access to the VTOP portal, including login, CAPTCHA solving, and data extraction.
main.py: Orchestrates the automation, login, CAPTCHA solving, and data extraction.
captcha_solver.py: Handles CAPTCHA solving (Gemini API and local OCR).
data_processor.py: Cleans HTML and converts it to CSV.
data/: Stores cleaned HTML and final CSV output.
temp/: Stores temporary files (e.g., raw HTML, CAPTCHA images).
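The two CAPTCHA backends named above (Gemini API and local OCR) could be combined as a fallback chain. The following is a minimal sketch of that idea, not the actual contents of captcha_solver.py: the function names, the solver signature, and the order in which the backends are tried are all assumptions, and the two solvers shown are hypothetical stand-ins that never call a real API or OCR engine.

```python
from typing import Callable, Optional, Sequence

# Assumed interface: a solver takes raw image bytes and returns the
# guessed text, or None when it cannot read the image.
Solver = Callable[[bytes], Optional[str]]


def solve_captcha(image: bytes, solvers: Sequence[Solver]) -> Optional[str]:
    """Try each solver in order and return the first non-empty answer."""
    for solver in solvers:
        try:
            text = solver(image)
        except Exception:
            # A failing backend (network error, missing model) should
            # not abort the run; move on to the next solver instead.
            continue
        if text and text.strip():
            return text.strip()
    return None


# Hypothetical stand-ins for the real backends.
def remote_solver(image: bytes) -> Optional[str]:
    raise ConnectionError("API unreachable")


def local_ocr_solver(image: bytes) -> Optional[str]:
    return " AB12CD \n"


result = solve_captcha(b"fake-image-bytes", [remote_solver, local_ocr_solver])
print(result)
```

Passing the solvers in as a list keeps the fallback order explicit and makes it easy to test the chain without network access.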
- Install dependencies:
  pip install -r requirements.txt
  Or use pyproject.toml with your preferred tool (e.g., pip, poetry).
- Create a .env file with your Gemini API key and VTOP credentials:
  GOOGLE_API_KEY=gemini_api_key
  VTOP_USERNAME=your_username
  VTOP_PASSWORD=your_password
- Run the main script:
  python main.py
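For reference, the three settings in the .env file could be read with nothing but the standard library. This is a sketch under assumptions: the repository may well use a package such as python-dotenv instead, and `parse_env`, `load_settings`, and `REQUIRED_KEYS` are illustrative names that do not come from the project.

```python
import os
from pathlib import Path
from typing import Dict

# The three keys listed in the setup steps.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "VTOP_USERNAME", "VTOP_PASSWORD")


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(path: str = ".env") -> Dict[str, str]:
    """Read settings from the file; real environment variables take priority."""
    env_file = Path(path)
    values = parse_env(env_file.read_text(encoding="utf-8")) if env_file.exists() else {}
    for key in REQUIRED_KEYS:
        if key in os.environ:
            values[key] = os.environ[key]
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return values


sample = (
    "GOOGLE_API_KEY=gemini_api_key\n"
    "VTOP_USERNAME=your_username\n"
    "VTOP_PASSWORD=your_password\n"
)
print(parse_env(sample))
```

Failing early with a list of missing keys avoids starting a browser session that cannot log in.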
This repository includes an example that extracts the academic calendar from VTOP and saves it as a CSV file. You can adapt the code to automate other tasks on VTOP as needed.
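To give a feel for the HTML-to-CSV step in such an example, here is a minimal standard-library sketch. It is not the code in data_processor.py: `TableParser` and `html_table_to_csv` are hypothetical names, and the sample table is made-up data, not real VTOP markup or a real calendar.

```python
import csv
import io
from html.parser import HTMLParser
from typing import List


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: List[str] = []
        self._cell: List[str] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []
            self._in_cell = True

    def handle_data(self, data):
        # Text outside cells (indentation, newlines) is ignored.
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            # Collapse runs of whitespace so each cell is a single line.
            self._row.append(" ".join("".join(self._cell).split()))
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = []


def html_table_to_csv(html: str) -> str:
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Made-up sample data for illustration only.
sample_html = """
<table>
  <tr><th>Date</th><th>Event</th></tr>
  <tr><td>01-Jan</td><td>Classes  begin</td></tr>
  <tr><td>15-Jan</td><td>Holiday, campus closed</td></tr>
</table>
"""
csv_text = html_table_to_csv(sample_html)
print(csv_text)
```

Adapting this to another VTOP page would mostly mean changing which table is fed to the parser; the CSV writer already quotes cells that contain commas.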