Waybackurls is a Python script that retrieves URLs from the Wayback Machine for one or more hosts. It can search with or without subdomains and saves the results as JSON files. 📝
This script is based on the work of mhmdiaa. The original version of the script can be found in their GitHub Gists.
- 🔍 Retrieve URLs from the Wayback Machine for multiple hosts concurrently
- 🌐 Option to include or exclude subdomains in the search
- 📁 Customizable output directory for the JSON files
- 🚀 Adjustable number of concurrent threads for faster processing
- ℹ️ Detailed output and timing information
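The features above can be illustrated with a minimal sketch. It is not the script's actual code: the helper names (`build_cdx_url`, `fetch_urls`, `fetch_all`) are hypothetical, and the use of the Wayback Machine CDX endpoint with these query parameters is an assumption about how the lookup is performed. The network call is isolated in `fetch_urls`, so the concurrency logic can be exercised with any callable.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

# Assumed endpoint: the Wayback Machine CDX server.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(host, with_subdomains=False):
    """Build a CDX query URL for one host (hypothetical helper)."""
    # "*.host/*" also matches subdomains; "host/*" matches the host only.
    pattern = f"*.{host}/*" if with_subdomains else f"{host}/*"
    query = urlencode({
        "url": pattern,
        "output": "json",
        "fl": "original",       # return only the original URL field
        "collapse": "urlkey",   # collapse repeated captures of the same URL
    })
    return f"{CDX_ENDPOINT}?{query}"


def fetch_urls(host, with_subdomains=False):
    """Fetch archived URLs for one host; requires the requests library."""
    import requests  # imported lazily so the rest of the sketch runs without it

    response = requests.get(build_cdx_url(host, with_subdomains), timeout=60)
    response.raise_for_status()
    rows = response.json()
    # Assumption: the first row of the JSON output is a header row.
    return [row[0] for row in rows[1:]]


def fetch_all(hosts, fetch_one, threads=5):
    """Run fetch_one for every host using a pool of worker threads."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = pool.map(fetch_one, hosts)  # results keep the order of hosts
        return dict(zip(hosts, results))
```

Calling `fetch_all(["example.com", "example.org"], fetch_urls, threads=5)` would query both hosts in parallel. Passing the fetch function as an argument keeps the thread pool independent of the HTTP client.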
- Python 3.6 or higher 🐍
- requests library 🔗
- Clone the repository:
  git clone https://github.com/your-username/waybackurls.git
- Change to the project directory:
  cd waybackurls
- Install the required dependencies:
  pip install -r requirements.txt
python waybackurls.py [-h] [-s] [-o OUTPUT] [-t THREADS] hosts [hosts ...]
- hosts: One or more hosts to retrieve URLs for (required) 🌐
- -s, --subdomains: Include subdomains in the search (optional) 🔍
- -o OUTPUT, --output OUTPUT: Output directory for the JSON files (default: results) 📁
- -t THREADS, --threads THREADS: Number of concurrent threads (default: 5) 🚀
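For readers who want to see how such an interface can be declared, here is a minimal `argparse` sketch that produces the same usage line and defaults. It is an illustration only: the function name `build_parser` and the help strings are assumptions, and the script's real parser may be written differently.

```python
import argparse


def build_parser():
    """Declare a command-line interface matching the documented usage line."""
    parser = argparse.ArgumentParser(
        prog="waybackurls.py",
        description="Retrieve URLs from the Wayback Machine for one or more hosts.",
    )
    # One or more positional hosts (required).
    parser.add_argument("hosts", nargs="+", help="hosts to retrieve URLs for")
    # Boolean flag: present means subdomains are included.
    parser.add_argument("-s", "--subdomains", action="store_true",
                        help="include subdomains in the search")
    parser.add_argument("-o", "--output", default="results",
                        help="output directory for the JSON files")
    parser.add_argument("-t", "--threads", type=int, default=5,
                        help="number of concurrent threads")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-s", "-t", "10", "example.com", "example.org"])
print(args.hosts, args.subdomains, args.output, args.threads)
```

Because `-o` is omitted in the demonstration call, `args.output` falls back to the documented default, `results`.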
- Retrieve URLs for a single host:
  python waybackurls.py example.com
- Retrieve URLs for multiple hosts:
  python waybackurls.py example.com example.org
- Include subdomains in the search:
  python waybackurls.py -s example.com
- Specify the output directory:
  python waybackurls.py -o results/example example.com
- Adjust the number of concurrent threads:
  python waybackurls.py -t 10 example.com
The script saves the retrieved URLs as JSON files in the specified output directory. Each JSON file is named after the corresponding host, e.g., example.com-waybackurls.json.
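The naming scheme described above can be sketched as follows. This is an illustration rather than the script's actual code: `save_urls` is a hypothetical helper, and storing the results as a flat JSON list of URL strings is an assumption about the file layout.

```python
import json
import os


def save_urls(host, urls, output_dir="results"):
    """Write the URLs for one host to <output_dir>/<host>-waybackurls.json."""
    # Create the output directory if it does not exist yet.
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, f"{host}-waybackurls.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(urls, handle, indent=2)
    return path
```

A saved file can be read back with `json.load` for further processing, for example to filter the URLs by extension or path.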
Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request.
This project is licensed under the MIT License.