urlcrawler.py

urlcrawler.py is a Python script that crawls a single domain or a list of domains and collects the URLs it finds under them.

Installation

git clone https://github.com/Mr0Wido/urlcrawler.py.git
cd urlcrawler.py
python3 urlcrawler.py

Usage

python3 urlcrawler.py -d test.com
python3 urlcrawler.py -d test.com -o urls.txt
python3 urlcrawler.py -l domains.txt

Options

Flags		Description
-h	--help	Show this help message and exit.
-d	--domain	The domain to crawl. Example: https://test.com
-l	--list	File containing a list of domains to crawl.
-o	--output	The output file where the found URLs will be saved.
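
The flags above could be wired up with argparse roughly as follows. This is a minimal sketch, not the script's actual code: the function names build_parser and load_targets, the dest name domain_list, and the choice to make -d and -l mutually exclusive are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="urlcrawler.py",
        description="Crawl a domain or a list of domains and collect URLs.",
    )
    # Assumption: exactly one of -d / -l must be given.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("-d", "--domain", help="The domain to crawl.")
    target.add_argument(
        "-l", "--list", dest="domain_list",
        help="File containing a list of domains to crawl.",
    )
    parser.add_argument(
        "-o", "--output",
        help="The output file where the found URLs will be saved.",
    )
    return parser


def load_targets(args):
    """Return the domains to crawl, from -d or from the file given to -l."""
    if args.domain:
        return [args.domain]
    with open(args.domain_list, encoding="utf-8") as handle:
        # Skip blank lines so a trailing newline does not yield an empty domain.
        return [line.strip() for line in handle if line.strip()]


if __name__ == "__main__":
    print(load_targets(build_parser().parse_args()))
```

-h / --help is not declared explicitly because argparse adds it automatically.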

Requirements

requests
BeautifulSoup4

Notes

This script tries to find all URLs under a specific domain. However, URLs generated by JavaScript or other dynamic content may not be found. The script also sends a large number of requests, which can put a high load on the target server, so use it only on your own sites or on sites where you have explicit permission.
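
To illustrate the JavaScript limitation, here is a minimal sketch of static link extraction. It is not the script's own code: it uses only the Python standard library (html.parser and urllib.parse) so that it runs without dependencies, whereas the script itself lists requests and BeautifulSoup4 as requirements. The names LinkParser, same_domain_links, and SAMPLE are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in the static HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(html, base_url):
    """Return sorted absolute URLs from html that share base_url's host."""
    parser = LinkParser()
    parser.feed(html)
    base_host = urlparse(base_url).netloc
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and drop #fragments to avoid duplicates.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == base_host:
            found.add(absolute)
    return sorted(found)


# Assumed sample page: the last link only exists after JavaScript runs.
SAMPLE = """
<a href="/about">About</a>
<a href="https://test.com/contact#form">Contact</a>
<a href="https://other.example/page">External</a>
<script>document.write('<a href="/hidden">Hidden</a>');</script>
"""

if __name__ == "__main__":
    for url in same_domain_links(SAMPLE, "https://test.com/"):
        print(url)
```

Running the sketch prints only https://test.com/about and https://test.com/contact. The /hidden link is missed because it sits inside a script body and never appears as a real tag in the static HTML, and the external link is filtered out by the host check. A real crawl would also pause between requests (for example with time.sleep) to limit the load on the target server.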
