Keats Crawler

This is a Python module for downloading videos and other resources from KCL's Keats e-learning platform.

By default, it remembers the content you have already downloaded and skips it on later runs, so only newer files are fetched. It works on most module pages, but there are some edge cases where it might run into issues.
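The skip-on-rerun behaviour described above can be pictured with a minimal sketch. This is not the package's actual implementation: the class name DownloadHistory, the JSON history file, and the use of URLs as keys are all assumptions made for illustration.

```python
import json
from pathlib import Path


class DownloadHistory:
    """Hypothetical duplicate filter that persists seen URLs in a JSON file."""

    def __init__(self, history_file):
        self.history_file = Path(history_file)
        # Load URLs recorded by earlier runs, if the history file exists.
        if self.history_file.exists():
            self.seen = set(json.loads(self.history_file.read_text()))
        else:
            self.seen = set()

    def is_duplicate(self, url):
        # True if this URL was recorded by this or an earlier run.
        return url in self.seen

    def remember(self, url):
        # Record the URL and write the whole history back to disk.
        self.seen.add(url)
        self.history_file.write_text(json.dumps(sorted(self.seen)))
```

In a sketch like this, a crawl would call is_duplicate before each download and remember after it, which is roughly what the SKIP_DUPLICATES and REMEMBER_DOWNLOADS settings suggest.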

NEW: Now with the ability to download videos from Microsoft Stream.

Requirements

Python 3
Works on Linux, Windows (via WSL), and macOS

Installation

Clone this repository: git clone https://github.com/mannmann2/keats_crawler.git
cd keats_crawler
pip install -r requirements.txt
Download the chromedriver release that matches your installed version of Chrome
Install ffmpeg (on Debian/Ubuntu: sudo apt install ffmpeg)
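After installing, a quick pre-flight check can confirm that the external tools are reachable. The snippet below is a small sketch, not part of the package: the function name missing_tools is hypothetical, and it only looks on the system PATH, so a chromedriver stored elsewhere and referenced through PATH_TO_CHROMEDRIVER would be reported as missing even though it is usable.

```python
import shutil


def missing_tools(tools=("chromedriver", "ffmpeg")):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("chromedriver and ffmpeg were both found on PATH")
```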

Usage

Self-enrol in the Keats module if you are not already enrolled
Update config.py (see below; cookies must be updated each time your session expires)
Run: python crawl.py

Usage for downloading videos from Microsoft Stream

Get video links and access token from https://web.microsoftstream.com
Run: python msstream.py

Config settings

MODULE: Name of the module, also used as the name of the folder into which files are downloaded
URLS: Mapping between module names and their Keats URLs
PATH: Location in which to create the module folder
PATH_TO_CHROMEDRIVER: Location of the chromedriver executable
COOKIES: Cookies copied from your browser after logging into Keats. These can be found in the Network tab of the browser inspector.
DOWNLOAD_RESOURCES: True/False - Download the non-video resources (ppt, pdf, py, etc.)
DOWNLOAD_VIDEOS: True/False - Download videos embedded in Keats (does not work for videos hosted on other websites)

VIDEO_PROMPT: True/False - Prompt before extracting each video for download (if disabled, all extracted videos are downloaded automatically)
VIDEO_LIMIT: Integer or None - Limit the number of videos extracted
SKIP_DUPLICATES: True/False - Skip files that have already been downloaded (only works if the earlier downloads were made with this package)
REMEMBER_DOWNLOADS: True/False - Add the files downloaded in the current crawl to the duplicate filter, which is used to detect duplicates

MS_STREAMS_LINKS: Links to videos on Microsoft Stream
MS_STREAMS_ACCESS_TOKEN: Authorization token to Microsoft Stream internal API
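For orientation, a config.py using the settings above might look roughly like the sketch below. Only the setting names come from this README. Every value is a placeholder, the data types (a dict for COOKIES, a list for MS_STREAMS_LINKS) are assumptions, and parse_cookie_header is a hypothetical helper that is not part of the package.

```python
# config.py (illustrative sketch; all values are placeholders)


def parse_cookie_header(header):
    """Hypothetical helper: turn 'a=1; b=2' (as copied from the browser's
    Network tab) into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in header.split(";"):
        name, _, value = pair.strip().partition("=")
        if name:
            cookies[name] = value
    return cookies


MODULE = "Example Module"
# Placeholder URL; substitute the real address of your module page.
URLS = {"Example Module": "https://keats.example/course/view.php?id=0"}
PATH = "/home/user/kcl"
PATH_TO_CHROMEDRIVER = "/usr/local/bin/chromedriver"

# Assumed format: a dict of cookie names to values (names here are made up).
COOKIES = parse_cookie_header("session_cookie_name=PASTE_VALUE_HERE")

DOWNLOAD_RESOURCES = True
DOWNLOAD_VIDEOS = True

VIDEO_PROMPT = True
VIDEO_LIMIT = None  # or an integer such as 5
SKIP_DUPLICATES = True
REMEMBER_DOWNLOADS = True

# Assumed format: a list of video page links and a token string.
MS_STREAMS_LINKS = []
MS_STREAMS_ACCESS_TOKEN = "PASTE_TOKEN_HERE"
```

Check the config.py shipped with the repository for the formats it actually expects before copying anything from this sketch.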

Free software: MIT license
