🕷️ SpiderSel 🕷️

Python 3 script to crawl and spider websites for keywords via selenium

💎 Features

SpiderSel provides the following features:

  • Crawling of HTTP and HTTPS websites for keywords via Selenium (native JS support)
  • Spidering of new URLs found within source code (adjustable depth, stays on the same site)
  • Filtering keywords by length and removing nonsense (paths, emails, protocol handlers etc.)
  • Storing keywords and ignored strings into a separate results directory (txt files)

SpiderSel is similar to CeWL or CeWLeR, but it also supports websites that require JavaScript.
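The filtering and same-site behaviour listed above can be illustrated with a minimal sketch. This is not SpiderSel's actual code: the function names `split_keywords` and `is_same_site`, the regular expressions, and the host-based same-site check are all assumptions made for illustration only.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns for strings that should not count as keywords.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
PROTOCOL_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")  # e.g. mailto:, https:


def split_keywords(text, min_length=4, lowercase=False, include_emails=False):
    """Split text into (keywords, ignored) sets using simple heuristics."""
    keywords, ignored = set(), set()
    for token in text.split():
        # Trim surrounding punctuation so "info." becomes "info".
        token = token.strip(".,:;!?()[]{}\"'")
        if not token:
            continue
        if lowercase:
            token = token.lower()
        if EMAIL_RE.match(token):
            # Emails are only kept when explicitly requested.
            (keywords if include_emails else ignored).add(token)
        elif PROTOCOL_RE.match(token) or "/" in token or "\\" in token:
            # Protocol handlers and path-like strings are treated as noise.
            ignored.add(token)
        elif len(token) < min_length:
            ignored.add(token)
        else:
            keywords.add(token)
    return keywords, ignored


def is_same_site(base_url, candidate_url):
    """Return True if both URLs share the same host (a simplistic check)."""
    base_host = urlparse(base_url).netloc.lower()
    return base_host == urlparse(candidate_url).netloc.lower()
```

With `min_length=4`, short tokens such as "or" land in the ignored set, while emails are only kept as keywords when `include_emails=True`. A spider would call something like `is_same_site` before queueing a newly found URL.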

🎓 Usage

usage: [-h] --url URL [--depth DEPTH] [--min-length MIN_LENGTH] [--lowercase] [--include-emails]

Web Crawler and Keyword Extractor

  -h, --help                  show this help message and exit
  --url URL                   URL of the website to crawl
  --depth DEPTH               Depth of subpage spidering (default: 1)
  --min-length MIN_LENGTH     Minimum keyword length (default: 4)
  --lowercase                 Convert all keywords to lowercase
  --include-emails            Include emails as keywords
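For readers who want to see how such an interface is typically declared, here is a minimal `argparse` sketch reconstructed from the help text above. It is an assumption about how the options could be wired up, not the script's actual source; the `build_parser` helper name and the `https://example.com` URL are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options shown in the help text."""
    parser = argparse.ArgumentParser(description="Web Crawler and Keyword Extractor")
    parser.add_argument("--url", required=True,
                        help="URL of the website to crawl")
    parser.add_argument("--depth", type=int, default=1,
                        help="Depth of subpage spidering (default: 1)")
    parser.add_argument("--min-length", type=int, default=4,
                        help="Minimum keyword length (default: 4)")
    parser.add_argument("--lowercase", action="store_true",
                        help="Convert all keywords to lowercase")
    parser.add_argument("--include-emails", action="store_true",
                        help="Include emails as keywords")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--url", "https://example.com", "--lowercase"])
```

In this sketch `--url` is the only required option; the flags `--lowercase` and `--include-emails` default to off, and `--min-length` is exposed as `args.min_length`.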

🐳 Example 1 - Docker Run

External Dockerhub Image

docker run -v ${PWD}:/app/results --rm l4rm4nd/spidersel:latest --url --lowercase --include-emails

You will find your scan results in the current directory.

Local Docker Build Image

If you don't trust my image on Dockerhub, please go ahead and build the image yourself:

git clone && cd SpiderSel
docker build -t spidersel .
docker run -v ${PWD}:/app/results --rm spidersel --url https:/ --lowercase --include-emails

🐍 Example 2 - Native Python


# clone repository and change directory
git clone && cd SpiderSel

# optionally install google-chrome if not available yet (assumes the .deb package has already been downloaded)
sudo dpkg -i google-chrome-stable_current_amd64.deb

# install python dependencies; optionally use a virtual environment (e.g. virtualenv, pipenv, etc.)
pip3 install -r requirements.txt


python3 --url --lowercase --include-emails

The extracted keywords will be stored in an output file within the results folder.