urlcrawler.py is a Python script that performs a web crawl for a specific domain or a list of domains. This script finds all URLs under the domains.

Mr0Wido/urlcrawler.py

urlcrawler.py

urlcrawler.py is a Python script that crawls a single domain or a list of domains and collects all URLs it finds under them.

Installation

git clone https://github.com/Mr0Wido/urlcrawler.py.git
cd urlcrawler.py
python3 urlcrawler.py

Usage

python3 urlcrawler.py -d test.com
python3 urlcrawler.py -d test.com -o urls.txt
python3 urlcrawler.py -l domains.txt

Options

Flags           Description
-h, --help      Show this help message and exit.
-d, --domain    The domain to crawl. Example: https://test.com
-l, --list      File containing a list of domains to crawl.
-o, --output    The output file where the found URLs will be saved.
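
For illustration, the flags above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the script's actual code: the function names `build_parser` and `load_targets`, the `domain_list` attribute, and the choice to make `-d` and `-l` mutually exclusive are all assumptions. It also assumes the list file holds one domain per line.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="urlcrawler.py",
        description="Crawl a domain or a list of domains and collect URLs.",
    )
    # Assumption: exactly one of -d / -l must be given.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("-d", "--domain", help="The domain to crawl.")
    target.add_argument(
        "-l", "--list", dest="domain_list",
        help="File containing a list of domains to crawl.",
    )
    parser.add_argument(
        "-o", "--output",
        help="The output file where the found URLs will be saved.",
    )
    return parser


def load_targets(args):
    """Return the domains to crawl, from -d or from the -l file."""
    if args.domain:
        return [args.domain]
    # Assumption: one domain per line, blank lines ignored.
    with open(args.domain_list, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


if __name__ == "__main__":
    cli_args = build_parser().parse_args()
    print(load_targets(cli_args))
```

With this layout, `-h` and `--help` come from `argparse` itself, so they need no explicit declaration.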

Requirements

requests
BeautifulSoup4

Notes

This script tries to find all URLs under a specific domain. However, URLs that are generated by JavaScript or other dynamic content may not be found, because the script only sees the HTML returned by the server. The script also sends a large number of requests, which can put a high load on the target server. Therefore, it should only be used on your own sites or on sites where you have explicit permission.
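
The two points in the notes can be sketched in code. The example below is an illustration under stated assumptions, not the script's implementation: it uses the standard library's `html.parser` as a stand-in for BeautifulSoup so that it runs without extra packages, and the names `LinkCollector`, `same_domain_links`, `crawl`, `fetch`, `delay_seconds`, and `max_pages` are hypothetical. It only reads static `<a href>` attributes, which is why links created by JavaScript go unseen, and it pauses between requests to limit load on the server.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from static <a> tags only."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(page_url, html):
    """Return sorted absolute http(s) links on the same host as page_url."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in collector.hrefs:
        # Resolve relative links and drop the #fragment part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            found.add(absolute)
    return sorted(found)


def crawl(start_url, fetch, delay_seconds=1.0, max_pages=50):
    """Breadth-first crawl; fetch is any callable mapping a URL to HTML text."""
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        for link in same_domain_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
        if queue:
            # Pause between requests to limit load on the target server.
            time.sleep(delay_seconds)
    return sorted(seen)
```

In a real run, `fetch` could be a small wrapper around `requests.get(url, timeout=10).text`. Passing it in as a parameter keeps the crawl logic testable without network access, and `max_pages` puts an upper bound on how many requests a single run can send.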
