A comprehensive collection of Python scripts focusing on Web Automation, Web Scraping, and Cryptography implementations. This repository serves as a resource for developers interested in automation, data extraction, and security.
Advanced web automation scripts using Selenium WebDriver for various platforms and applications.
- Email Automation: Automated email tasks and inbox management
- Social Media Automation:
  - Facebook interactions
  - WhatsApp messaging
  - Telegram bot functionality
  - YouTube automation
- Task Management:
  - Trello board automation
  - Project workflow automation
- Locator Strategies (see the sketch after this list):
  - XPath implementations
  - CSS Selector examples
  - Practical automation templates
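A taste of the two locator strategies above, as a minimal sketch; the page is a placeholder and the driver is assumed to be configured as in the setup steps below:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes ChromeDriver is available (see setup below)
driver.get("https://example.com")

# XPath strategy: match the page heading by tag
heading = driver.find_element(By.XPATH, "//h1")

# CSS selector strategy: the link inside a paragraph
link = driver.find_element(By.CSS_SELECTOR, "p > a")

print(heading.text, link.get_attribute("href"))
driver.quit()
```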
A collection of Python web scraping scripts for automated data extraction from various websites.
- Specialized Scrapers:
  - Books information extractor
  - Carrefour website product scraper
  - IMDB movie/TV show data collector
  - Indeed job listings extractor
  - General HTML scraper
  - API data extractor
- Data Extraction Capabilities (see the sketch after this list):
  - Product information
  - Pricing details
  - Description extraction
  - Rating systems
  - User reviews
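A minimal sketch of what these extraction steps look like with BeautifulSoup. The selector strings (`.product-title`, `.price`, and so on) are hypothetical; inspect the real page to find its markup:

```python
import requests
from bs4 import BeautifulSoup

def scrape_product(url):
    soup = BeautifulSoup(requests.get(url).text, "html.parser")

    def first_text(selector):
        # Guard against missing elements instead of crashing on None
        node = soup.select_one(selector)
        return node.get_text(strip=True) if node else None

    # Selector strings below are hypothetical placeholders
    return {
        "title": first_text(".product-title"),
        "price": first_text(".price"),
        "description": first_text(".description"),
        "rating": first_text(".rating"),
        "reviews": [r.get_text(strip=True) for r in soup.select(".review-text")],
    }
```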
Implementation of various cryptographic algorithms and security concepts.
- Types of Ciphers
- Public Key Cryptography
- Hash Functions
- Message Authentication Codes (MAC, CMAC, HMAC; see the HMAC sketch after this list)
- Digital Certificates
- Authentication Systems
- TCP/IP Security
- Internet Security Protocols
- Firewall Implementation
- Intrusion Detection System
- Network Security Concepts
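As a taste of the MAC constructions above, a minimal HMAC example using only the standard library; the key and message are illustrative:

```python
import hmac
import hashlib

key = b"shared-secret-key"          # in practice, use a randomly generated key
message = b"transfer 100 to alice"

# Sender computes the tag; receiver recomputes it and compares
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# Constant-time comparison guards against timing attacks
assert hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).hexdigest())
```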
- Python 3.x
- Git
```bash
# Web automation
pip install selenium webdriver_manager

# Web scraping
pip install requests beautifulsoup4

# Cryptography
pip install pycryptodome
```
- Clone the repository:
  ```bash
  git clone https://github.com/mimi-netizen/Python-Scripts.git
  ```
- Navigate to the project directory:
  ```bash
  cd Python-Scripts
  ```
- Install the required dependencies:
  ```bash
  pip install -r requirements.txt
  ```
```python
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager

# Set up Chrome WebDriver; webdriver_manager downloads a matching driver
chrome_options = Options()
driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=chrome_options,
)
```
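A quick smoke test once the driver is up (the URL is a placeholder):

```python
driver.get("https://example.com")  # placeholder URL
print(driver.title)                # prints "Example Domain"
driver.quit()
```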
```python
import requests
from bs4 import BeautifulSoup

# Basic scraping template
response = requests.get('https://example.com')
response.raise_for_status()  # fail fast on HTTP errors
soup = BeautifulSoup(response.text, 'html.parser')
```
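From here, pulling data out is a matter of walking the parsed tree; for example:

```python
# Page title and every hyperlink on the page
print(soup.title.string if soup.title else "no <title> found")
for link in soup.find_all("a", href=True):
    print(link["href"], "->", link.get_text(strip=True))
```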
```python
# Example cipher implementation: Caesar (shift) cipher
def caesar_cipher(text, shift, encrypt=True):
    result = ""
    for char in text:
        if char.isalpha():
            # Preserve case by shifting relative to 'A' or 'a'
            ascii_offset = ord('A') if char.isupper() else ord('a')
            shift_value = shift if encrypt else -shift
            result += chr((ord(char) - ascii_offset + shift_value) % 26 + ascii_offset)
        else:
            result += char  # leave digits, spaces, and punctuation untouched
    return result
```
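A quick round trip with the cipher above:

```python
ciphertext = caesar_cipher("Attack at dawn!", shift=3)         # 'Dwwdfn dw gdzq!'
plaintext = caesar_cipher(ciphertext, shift=3, encrypt=False)  # 'Attack at dawn!'
assert plaintext == "Attack at dawn!"
```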
All Selenium projects write logs to `logs/selenium/`:

```bash
tail -f logs/selenium/automation.log
```

Monitor scraping progress:

```bash
tail -f logs/scrapers/scraping_progress.log
```

Rate-limiting and error logs:

```bash
cat logs/scrapers/rate_limits.log
cat logs/scrapers/errors.log
```

Operation logs and performance metrics:

```bash
cat logs/crypto/operations.log
cat logs/crypto/performance_metrics.log
```
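To keep these files from growing unbounded, a rotating handler can cap their size. A minimal sketch; the 5 MB limit and the `scrapers` logger name are assumptions, not project defaults:

```python
import logging
import os
from logging.handlers import RotatingFileHandler

os.makedirs("logs/scrapers", exist_ok=True)
handler = RotatingFileHandler(
    "logs/scrapers/errors.log",  # path from the layout above
    maxBytes=5 * 1024 * 1024,    # rotate at 5 MB
    backupCount=3,               # keep errors.log.1 through errors.log.3
)
logging.getLogger("scrapers").addHandler(handler)
```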
- Enable verbose logging:
  ```python
  import logging
  logging.basicConfig(level=logging.DEBUG)
  ```
- Use Chrome DevTools:
  ```python
  chrome_options.add_argument('--auto-open-devtools-for-tabs')
  ```
- Test selectors in the browser console
- Use `scrapy shell` for interactive debugging:
  ```bash
  scrapy shell "http://example.com"
  ```
- Enable verbose mode:
  ```bash
  python crypto_script.py --verbose
  ```
- Use the test vectors in the `tests/vectors/` directory
```python
import logging

# Configure detailed logging to both a file and the console
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('debug.log'),
        logging.StreamHandler()
    ]
)

# Create a custom logger
logger = logging.getLogger('selenium_automation')
```
```python
import os
from datetime import datetime
from selenium.webdriver.common.by import By

def capture_error_screenshot(driver, error_name):
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    screenshot_path = f"debug/screenshots/{error_name}_{timestamp}.png"
    os.makedirs(os.path.dirname(screenshot_path), exist_ok=True)
    driver.save_screenshot(screenshot_path)
    logger.debug(f"Screenshot saved: {screenshot_path}")

# Usage in a try-except block
try:
    element = driver.find_element(By.ID, "submit-button")
    element.click()
except Exception as e:
    capture_error_screenshot(driver, "click_error")
    logger.error(f"Click failed: {str(e)}")
```
```python
from selenium.webdriver.common.by import By

def debug_element(driver, locator, by=By.CSS_SELECTOR):
    try:
        element = driver.find_element(by, locator)
        debug_info = {
            "is_displayed": element.is_displayed(),
            "is_enabled": element.is_enabled(),
            "is_selected": element.is_selected(),
            "location": element.location,
            "size": element.size,
            "class": element.get_attribute("class"),
            "computed_style": driver.execute_script(
                "return window.getComputedStyle(arguments[0]);",
                element
            )
        }
        logger.debug(f"Element debug info: {debug_info}")
        return debug_info
    except Exception as e:
        logger.error(f"Element debugging failed: {str(e)}")
```
```python
import json
from selenium.webdriver.chrome.options import Options

def setup_network_debugging():
    chrome_options = Options()
    # Ask Chrome to expose performance (network) and browser logs
    chrome_options.set_capability(
        "goog:loggingPrefs",
        {"performance": "ALL", "browser": "ALL"}
    )
    return chrome_options

def analyze_network_logs(driver):
    logs = driver.get_log("performance")
    for log in logs:
        network_log = json.loads(log["message"])["message"]
        if "Network.responseReceived" in network_log["method"]:
            logger.debug(f"Network response: {network_log}")
```
```python
import requests
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

def debug_request(url, headers=None):
    session = requests.Session()
    # Retry transient server errors with exponential backoff
    retries = Retry(
        total=3,
        backoff_factor=0.5,
        status_forcelist=[500, 502, 503, 504]
    )
    session.mount('http://', HTTPAdapter(max_retries=retries))
    session.mount('https://', HTTPAdapter(max_retries=retries))

    try:
        response = session.get(url, headers=headers)
        debug_info = {
            "status_code": response.status_code,
            "headers": dict(response.headers),
            "cookies": dict(response.cookies),
            "encoding": response.encoding,
            "redirect_history": [
                {"url": r.url, "status_code": r.status_code}
                for r in response.history
            ]
        }
        logger.debug(f"Request debug info: {debug_info}")
        return debug_info
    except Exception as e:
        logger.error(f"Request failed: {str(e)}")
```
```python
from bs4 import BeautifulSoup
import lxml.html

def debug_parsing(html_content):
    # Parse with BeautifulSoup
    soup = BeautifulSoup(html_content, 'html.parser')
    # Compare with lxml parsing
    lxml_doc = lxml.html.fromstring(html_content)

    debug_info = {
        "soup_title": soup.title.string if soup.title else None,
        "lxml_title": lxml_doc.find(".//title").text if lxml_doc.find(".//title") is not None else None,
        "soup_encoding": soup.original_encoding,
        "tags_count": len(soup.find_all()),
        "broken_tags": soup.find_all(lambda tag: tag.name is None)
    }
    logger.debug(f"Parsing debug info: {debug_info}")
    return debug_info
```
```python
import time
from cryptography.hazmat.primitives.asymmetric import rsa

def debug_key_generation(key_size=2048):
    try:
        start_time = time.time()
        private_key = rsa.generate_private_key(
            public_exponent=65537,
            key_size=key_size
        )
        generation_time = time.time() - start_time

        # Analyze key properties (never log these values outside of debugging!)
        debug_info = {
            "key_size": key_size,
            "generation_time": generation_time,
            "public_numbers": {
                "n": private_key.public_key().public_numbers().n,
                "e": private_key.public_key().public_numbers().e
            },
            "private_numbers": {
                "p": private_key.private_numbers().p,
                "q": private_key.private_numbers().q,
                "d": private_key.private_numbers().d
            }
        }
        logger.debug(f"Key generation debug info: {debug_info}")
        return debug_info
    except Exception as e:
        logger.error(f"Key generation failed: {str(e)}")
```
```python
import time
import chardet  # used to guess the input's encoding

def debug_encryption_process(data, key):
    # `pad` and `encrypt` are the project's own helpers, not stdlib functions
    try:
        debug_info = {
            "input_data": {
                "length": len(data),
                "encoding": chardet.detect(data)
            },
            "key_info": {
                "algorithm": key.algorithm.name,
                "key_size": key.key_size
            }
        }

        # Track the padding step
        padded_data = pad(data)
        debug_info["padding"] = {
            "original_length": len(data),
            "padded_length": len(padded_data)
        }

        # Monitor encryption performance
        start_time = time.time()
        encrypted_data = encrypt(padded_data, key)
        debug_info["encryption_time"] = time.time() - start_time

        logger.debug(f"Encryption debug info: {debug_info}")
        return encrypted_data, debug_info
    except Exception as e:
        logger.error(f"Encryption failed: {str(e)}")
```
```python
import os
import psutil

def monitor_memory_usage():
    process = psutil.Process(os.getpid())
    memory_info = {
        "rss": process.memory_info().rss / 1024 / 1024,  # resident set size, MB
        "vms": process.memory_info().vms / 1024 / 1024,  # virtual memory size, MB
        "percent": process.memory_percent(),
        "cpu_percent": process.cpu_percent()
    }
    logger.debug(f"Memory usage: {memory_info}")
    return memory_info

# Usage example (`process_chunk` is the project's own handler)
def process_large_file(file_path):
    initial_memory = monitor_memory_usage()
    try:
        with open(file_path, 'rb') as f:
            # Stream the file in 4 KB chunks instead of reading it whole
            for chunk in iter(lambda: f.read(4096), b''):
                process_chunk(chunk)
                current_memory = monitor_memory_usage()
                if current_memory["percent"] > 90:
                    logger.warning("High memory usage detected!")
    except Exception as e:
        logger.error(f"File processing failed: {str(e)}")

    final_memory = monitor_memory_usage()
    logger.debug(f"Memory change: {final_memory['rss'] - initial_memory['rss']} MB")
```
- ChromeDriver version mismatch: Update the driver via `webdriver_manager`
- Element not found: Implement explicit waits (see the sketch below)
- Stale elements: Refresh the page or relocate the element
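For the element-not-found and stale-element cases, an explicit wait that re-locates the element on every poll is usually the fix. A sketch, reusing the `submit-button` locator from the debugging examples above:

```python
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Poll for up to 15 seconds until the element is present and clickable
button = WebDriverWait(driver, 15).until(
    EC.element_to_be_clickable((By.ID, "submit-button"))
)
button.click()
```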
- Rate limiting: Implement delays between requests (see the sketch below)
- IP blocks: Rotate proxies using `proxy_rotator.py`
- Dynamic content: Use Selenium for JavaScript-heavy sites
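A simple way to implement those delays is a randomized pause between requests, so the traffic pattern looks less mechanical. A sketch; the URLs and 1-3 second bounds are illustrative:

```python
import random
import time
import requests

urls = ["https://example.com/page1", "https://example.com/page2"]  # placeholders

for url in urls:
    response = requests.get(url)
    # Randomized 1-3 s pause; tune the bounds to the site's tolerance
    time.sleep(random.uniform(1.0, 3.0))
```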
- Key generation errors: Check available system entropy
- Performance issues: Use appropriate key sizes
- Memory errors: Implement streaming for large files (see the sketch below)
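For the memory errors, processing a file in fixed-size chunks keeps usage flat regardless of file size. A minimal sketch using stdlib `hashlib`, mirroring the chunked-read pattern from the memory-monitoring example above:

```python
import hashlib

def sha256_of_file(path, chunk_size=64 * 1024):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read 64 KB at a time instead of loading the whole file
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```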
Each project contains its own detailed documentation.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Celyne Kydd - [email protected]
Project Link: https://github.com/mimi-netizen/Python-Scripts
- Selenium Documentation
- BeautifulSoup Documentation
- Cryptography Libraries and Resources
- Python Community