Phishing Detection Tool
A comprehensive Python-based tool for detecting phishing attempts in URLs, emails, and domains. This tool uses multiple detection techniques to identify potential phishing attacks and provide actionable recommendations.
- Suspicious Pattern Detection: Identifies keywords commonly used in phishing URLs
- Homograph Attack Detection: Detects lookalike characters (e.g., 'paypaI' vs 'paypal')
- Domain Age Analysis: Flags recently registered domains
- SSL Certificate Validation: Checks for HTTPS usage
- URL Shortener Detection: Identifies and expands shortened URLs
- IP Address Detection: Flags direct IP usage instead of domain names
- Path Complexity Analysis: Detects unnecessarily complex URL structures
- Header Authentication: Validates SPF, DKIM, and DMARC records
- Sender Verification: Compares From and Reply-To addresses
- URL Extraction: Automatically scans email body for malicious links
- Subject Line Analysis: Detects urgent/threatening language patterns
- Similarity Detection: Compares domains against commonly spoofed websites
- Levenshtein Distance: Calculates mathematical similarity between domains
- Visual Similarity: Identifies domains that look similar to legitimate ones
- Process up to 50 URLs simultaneously
- Generate summary reports with risk distribution
- Export results for further analysis
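The Levenshtein distance check listed above can be illustrated with a small standalone sketch. This is not the tool's actual implementation: the `KNOWN_DOMAINS` list, the 0.8 threshold, and the function names are assumptions chosen for illustration.

```python
# Hypothetical reference list; the tool's real list of spoofed brands is not shown here.
KNOWN_DOMAINS = ["amazon.com", "paypal.com", "google.com"]


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the distance to a 0..1 score, where 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def closest_matches(domain, known=KNOWN_DOMAINS, threshold=0.8):
    """Return (known_domain, score) pairs that are close to, but not equal to, domain."""
    domain = domain.lower()
    scored = [(k, similarity(domain, k)) for k in known]
    hits = [(k, round(s, 2)) for k, s in scored if threshold <= s < 1.0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Exact matches are excluded on purpose, since a domain identical to a known one is not a lookalike. The threshold trades false positives against missed lookalikes and would need tuning on real data.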
- Clone the repository

  ```bash
  git clone https://github.com/yourusername/phishing-detection-tool.git
  cd phishing-detection-tool
  ```
- Create a virtual environment

  ```bash
  python -m venv venv
  # Windows
  venv\Scripts\activate
  # macOS/Linux
  source venv/bin/activate
  ```
- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```
- Create required directories

  ```bash
  mkdir templates
  ```
- Move the HTML file (`index.html`) into the `templates` directory
```
phishing-detection-tool/
│
├── phishing_detector.py    # Core detection engine
├── app.py                  # Flask web application
├── requirements.txt        # Python dependencies
├── templates/
│   └── index.html          # Web interface
└── README.md               # This file
```
- Start the Flask application

  ```bash
  python app.py
  ```
- Open your browser and navigate to http://localhost:5000
- Choose a detection method
  - URL Scanner: Analyze individual URLs
  - Email Analyzer: Check email headers and content
  - Domain Checker: Verify domain legitimacy
  - Batch Analysis: Process multiple URLs
```python
from phishing_detector import PhishingDetector

detector = PhishingDetector()

# Analyze a single URL
result = detector.analyze_url("http://amaz0n.com/verify-account")
print(f"Risk Level: {result['risk_level']}")
print(f"Risk Score: {result['risk_score']}")
print(f"Risk Factors: {result['risk_factors']}")

# Compare a domain against commonly spoofed sites
similar = detector.check_domain_similarity("amaz0n.com")
for match in similar:
    print(f"Similar to {match['legitimate_domain']}: {match['similarity_score']}")

# Analyze email headers
headers = {
    "From": "[email protected]",
    "Reply-To": "[email protected]",
    "Subject": "Urgent: Verify your account",
}
email_result = detector.analyze_email_headers(headers)
print(f"Email Risk: {email_result['risk_level']}")
```
Analyzes a single URL for phishing indicators.

Request:

```json
{
  "url": "http://example.com"
}
```

Response:

```json
{
  "url": "http://example.com",
  "risk_level": "LOW",
  "risk_score": 0.15,
  "risk_factors": ["Not using HTTPS"],
  "recommendations": ["Always use HTTPS for sensitive data"]
}
```
Analyzes email headers and body content.
Request:

```json
{
  "headers": {
    "From": "[email protected]",
    "Subject": "Test Email"
  },
  "body": "Email content with https://link.com"
}
```
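As a rough illustration of how links might be pulled out of the `body` field before being scanned, here is a minimal sketch. The regular expression and the `extract_urls` name are assumptions for this example; the tool's own extraction logic may differ.

```python
import re
from urllib.parse import urlparse

# Deliberately simple pattern: http(s) scheme followed by anything that is
# not whitespace, an angle bracket, or a double quote.
URL_PATTERN = re.compile(r'https?://[^\s<>"]+', re.IGNORECASE)


def extract_urls(body: str) -> list[str]:
    """Return the distinct URLs found in an email body, in order of appearance."""
    found = []
    for match in URL_PATTERN.findall(body):
        url = match.rstrip(".,;:!?)']")  # drop trailing sentence punctuation
        if urlparse(url).netloc and url not in found:
            found.append(url)
    return found


print(extract_urls("Email content with https://link.com"))
```

Each extracted URL could then be passed to the URL analysis step. HTML emails would need real HTML parsing, since the visible link text and the `href` target can differ.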
Checks domain similarity to known legitimate sites.
Request:

```json
{
  "domain": "amaz0n.com"
}
```
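One simple way to catch a domain such as `amaz0n.com` is to undo common lookalike substitutions and see whether the result is a known brand. The sketch below assumes a small hand-written substitution table and a hypothetical `KNOWN` set; neither comes from the tool itself.

```python
# Assumed lookalike substitutions, applied in order. Real homograph detection
# would also cover Unicode confusables, which this sketch ignores.
LOOKALIKES = [("0", "o"), ("1", "l"), ("3", "e"), ("5", "s"), ("rn", "m"), ("vv", "w")]

# Hypothetical set of protected domains.
KNOWN = {"amazon.com", "paypal.com", "google.com"}


def normalise(domain: str) -> str:
    """Map lookalike characters back to the letters they imitate."""
    # A capital "I" imitates a lowercase "l", so handle it before lowercasing.
    text = domain.replace("I", "l").lower()
    for fake, real in LOOKALIKES:
        text = text.replace(fake, real)
    return text


def impersonated_domain(domain: str, known=KNOWN):
    """Return the known domain being imitated, or None if there is no match."""
    if domain.lower() in known:
        return None  # the genuine domain is not a lookalike of itself
    candidate = normalise(domain)
    return candidate if candidate in known else None
```

Because a domain is only flagged when the normalised form lands exactly on a known domain, legitimate names that happen to contain `rn` or digits are left alone.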
Analyzes multiple URLs in a single request.
Request:

```json
{
  "urls": [
    "https://example1.com",
    "https://example2.com"
  ]
}
```
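Since batch processing is limited to 50 URLs at a time, a client with a longer list has to split it. The following sketch builds request bodies in the shape shown above; the `build_batch_payloads` helper is hypothetical and only prepares JSON, it does not call the API.

```python
import json

MAX_BATCH = 50  # batch limit stated in the feature list


def build_batch_payloads(urls, batch_size=MAX_BATCH):
    """Split URLs into JSON request bodies of at most batch_size entries each."""
    # Strip whitespace, skip blanks, and drop duplicates while keeping order.
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    return [
        json.dumps({"urls": unique[i:i + batch_size]})
        for i in range(0, len(unique), batch_size)
    ]


payloads = build_batch_payloads(f"https://example{i}.com" for i in range(120))
print(len(payloads))
```

Each payload can then be sent as the body of a separate batch request.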
The tool uses a weighted scoring system to calculate risk:
| Factor | Weight | Description |
|---|---|---|
| URL Length | 0.10 | Unusually long URLs (>75 chars) |
| Suspicious Keywords | 0.15 | Contains phishing-related words |
| Subdomain Count | 0.10 | Excessive subdomains (>2) |
| HTTPS Missing | 0.20 | Not using secure protocol |
| IP Address | 0.25 | Uses IP instead of domain |
| URL Shortener | 0.15 | Uses URL shortening service |
| Homograph Attack | 0.20 | Contains lookalike characters |
| Recent Domain | 0.15 | Registered within 30 days |
| Suspicious TLD | 0.10 | Uses high-risk TLDs |
| Path Complexity | 0.05 | Complex URL structure |
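To make the table concrete, here is a minimal sketch of one way the weights could be combined. The weights are copied from the table, but the combination rule (sum the weights of the triggered factors and cap the total at 1.0) is an assumption; the README does not say how the tool aggregates them. The factor keys are also made up for this example.

```python
# Weights copied from the table above; the dictionary keys are hypothetical names.
WEIGHTS = {
    "url_length": 0.10,
    "suspicious_keywords": 0.15,
    "subdomain_count": 0.10,
    "https_missing": 0.20,
    "ip_address": 0.25,
    "url_shortener": 0.15,
    "homograph_attack": 0.20,
    "recent_domain": 0.15,
    "suspicious_tld": 0.10,
    "path_complexity": 0.05,
}


def risk_score(triggered) -> float:
    """Sum the weights of the triggered factors, capped at 1.0 (assumed rule)."""
    factors = set(triggered)  # count each factor once
    unknown = factors - WEIGHTS.keys()
    if unknown:
        raise ValueError(f"Unknown factors: {sorted(unknown)}")
    total = sum(WEIGHTS[name] for name in factors)
    return min(round(total, 2), 1.0)


print(risk_score(["https_missing", "ip_address"]))
```

The cap matters because the ten weights add up to 1.45, so a URL that triggers most factors would otherwise exceed 100%.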
- LOW (0-20%): Minimal risk indicators
- MEDIUM (20-50%): Some suspicious characteristics
- HIGH (50-80%): Multiple risk factors present
- CRITICAL (80-100%): Strong phishing indicators
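The bands above share their boundary values (20%, 50%, 80%), so any implementation has to pick a side. The sketch below assumes the score is a fraction between 0 and 1, as in the sample response, and that a boundary value belongs to the higher band; both points are assumptions.

```python
def risk_level(score: float) -> str:
    """Map a 0..1 risk score to a level, treating lower bounds as inclusive."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < 0.20:
        return "LOW"
    if score < 0.50:
        return "MEDIUM"
    if score < 0.80:
        return "HIGH"
    return "CRITICAL"


print(risk_level(0.15))
```

With this choice, a score of 0.15 maps to LOW, which matches the sample API response.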
- Privacy: This tool does not store or log analyzed URLs or emails
- Network Requests: URL expansion and domain lookups require internet access
- Rate Limiting: Implement rate limiting in production to prevent abuse
- HTTPS: Always use HTTPS when deploying the web interface
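For the rate limiting point, a sliding-window limiter is one simple option. The class below is a sketch, not part of the tool: the name `SlidingWindowLimiter`, its parameters, and the in-memory storage are all assumptions, and a multi-process deployment would need shared storage instead.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id: str) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=30, window_seconds=60)
print(limiter.allow("203.0.113.7"))
```

In a Flask app, `allow()` could be called at the start of each request handler with the client address as the key, returning HTTP 429 when it reports `False`.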
- Misspelled domain names (e.g., "arnazon" instead of "amazon")
- Excessive subdomains (e.g., "amazon.security.account-verify.com")
- IP addresses instead of domain names
- Suspicious TLDs (.tk, .ml, .ga)
- URL shorteners hiding the real destination
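Several of the URL warning signs above can be checked with the standard library alone. The sketch below covers IP-address hosts, subdomain count, and suspicious TLDs. It assumes the registered domain is always the last two labels, which is wrong for suffixes such as `.co.uk`; a production check would consult the Public Suffix List. The function names are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse

# TLDs named in the list above; a real deployment would likely use a longer list.
SUSPICIOUS_TLDS = {"tk", "ml", "ga"}
MAX_SUBDOMAINS = 2  # threshold borrowed from the scoring table


def is_ip_host(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_warning_signs(url: str) -> list[str]:
    """List the structural warning signs found in a URL."""
    host = (urlparse(url).hostname or "").lower()
    if is_ip_host(host):
        return ["IP address instead of a domain name"]

    signs = []
    labels = host.split(".")
    # Naive assumption: the registered domain is the last two labels.
    if len(labels) - 2 > MAX_SUBDOMAINS:
        signs.append("Excessive subdomains")
    if labels[-1] in SUSPICIOUS_TLDS:
        signs.append("Suspicious TLD")
    return signs


print(url_warning_signs("http://a.b.c.example.tk/login"))
```

Under this naive count a host needs at least three labels in front of the registered domain to exceed the threshold, so a stricter limit may suit hosts that bury a brand name in only one or two subdomains.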
- Urgent language ("Act now!", "Account will be suspended")
- Generic greetings ("Dear customer" instead of your name)
- Mismatched sender addresses
- Poor grammar or spelling
- Requests for sensitive information
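Two of the email warning signs above, urgent language and mismatched sender addresses, lend themselves to a short check on the headers. The phrase list and the `email_warning_signs` function are assumptions made for this sketch; only `email.utils.parseaddr` is a real standard-library call.

```python
from email.utils import parseaddr

# Assumed phrase list; a real detector would need a broader, tuned vocabulary.
URGENT_PHRASES = ("urgent", "act now", "verify your account", "suspended", "immediately")


def domain_of(header_value: str) -> str:
    """Extract the lowercase domain from a header such as 'Name <user@host>'."""
    address = parseaddr(header_value)[1]
    return address.rpartition("@")[2].lower()


def email_warning_signs(headers: dict) -> list[str]:
    """Return the warning signs found in the Subject, From and Reply-To headers."""
    signs = []
    subject = headers.get("Subject", "").lower()
    if any(phrase in subject for phrase in URGENT_PHRASES):
        signs.append("Urgent language in subject")

    from_domain = domain_of(headers.get("From", ""))
    reply_domain = domain_of(headers.get("Reply-To", ""))
    # Only compare when both headers carry a parseable address.
    if from_domain and reply_domain and from_domain != reply_domain:
        signs.append("Reply-To domain differs from From domain")
    return signs


print(email_warning_signs({
    "From": "Support <support@example.com>",
    "Reply-To": "helpdesk@example.net",
    "Subject": "Urgent: Verify your account",
}))
```

Comparing domains instead of full addresses avoids flagging legitimate setups where replies go to a different mailbox at the same organisation.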
Contributions are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.
- Machine learning model for improved detection
- Browser extension for real-time protection
- Integration with threat intelligence feeds
- Mobile app development
- API rate limiting and authentication
- Database for storing analysis history
This project is licensed under the MIT License. Use responsibly and ethically.
This tool is for educational and defensive purposes only. Always verify suspicious communications through official channels. The tool provides risk assessments but cannot guarantee 100% accuracy in detecting all phishing attempts.
This tool should only be used for:
- Personal protection against phishing
- Educational purposes
- Security awareness training
- Authorized security assessments
Never use this tool to:
- Create phishing campaigns
- Bypass security measures
- Conduct unauthorized testing
For questions or support, please open an issue on GitHub or contact the maintainers.