🛡️ Threat Intelligence Pipeline (Work In Progress)

An evolving Kafka-based pipeline for ingesting and enriching cyber threat indicators, orchestrated with Airflow and run on containerized infrastructure. This project is being developed to explore scalable threat intel workflows and showcase hands-on data engineering capabilities.

⚠️ Work In Progress

This repository is currently under active development. Some components are outlined but not yet functional, and full setup instructions will be added as implementation proceeds. The repo serves as both a technical playground and a conceptual showcase of modern pipeline architecture.

🧱 Planned Architecture Overview

This project is structured around a modular flow designed to simulate real-world threat ingestion and analysis.

1. Ingestion

✅ ThreatFox: API-based feed of fresh IPs, domains, and hashes linked to active malware.
🔜 AbuseIPDB: Recently reported malicious IPs.
🔜 PhishTank (via OTX Pulse): Confirmed phishing URLs across industry targets.
⏱️ Scheduled Python scripts and Airflow DAGs (running every 10–30 minutes) to simulate a near-real-time feed.
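
As a rough illustration of the ingestion step, the sketch below maps one raw feed record onto a common shape and serializes it as a Kafka key/value pair. It is not the repository's implementation: the record fields (`ioc`, `ioc_type`, `malware`, `first_seen`), the `IOC_TYPE_MAP` values, and the function names are all assumptions made for this example, and the actual producer call is left out so the snippet runs with the standard library alone.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Tuple

# Assumed mapping from feed-specific type labels to the pipeline's own types.
IOC_TYPE_MAP = {
    "ip:port": "ip",
    "domain": "domain",
    "url": "url",
    "md5_hash": "hash",
    "sha256_hash": "hash",
}


def normalize_ioc(record: dict, source: str) -> dict:
    """Map one raw feed record onto a common, source-agnostic shape."""
    ioc_type = IOC_TYPE_MAP.get(record["ioc_type"], "unknown")
    value = record["ioc"].strip()
    if ioc_type == "ip":
        # "ip:port" indicators look like "203.0.113.7:8080"; keep only the address.
        value = value.rsplit(":", 1)[0]
    return {
        "source": source,
        "ioc_type": ioc_type,
        "value": value.lower(),
        "malware": record.get("malware"),
        "first_seen": record.get("first_seen"),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }


def to_kafka_message(event: dict) -> Tuple[bytes, bytes]:
    """Return (key, value) bytes for a producer to send.

    The key depends only on type and value, so repeated sightings of the
    same indicator would hash to the same partition.
    """
    digest = hashlib.sha256(f"{event['ioc_type']}|{event['value']}".encode()).hexdigest()
    return digest.encode(), json.dumps(event, sort_keys=True).encode()
```

A scheduled script or Airflow task could call `normalize_ioc` on each record returned by a feed and hand the result of `to_kafka_message` to whichever Kafka client the project ends up using.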

2. Enrichment

🔜 IPinfo.io: Geo-tagging and ASN data for IP indicators.
⏳ (Optional) VirusTotal / URLScan.io: Enrichment metadata and detection scores (within API limits).
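
One way the enrichment step might look is sketched below: IP indicators get country and ASN fields from a lookup function, and results are cached so repeated IPs do not consume API quota. The `lookup` callable is a hypothetical stand-in for a real HTTP client, and the response fields used here (`country`, and `org` in the form `"AS64500 Example Net"`) are assumptions about the provider's payload rather than a documented contract.

```python
from typing import Callable, Dict, Optional


def parse_org(org: Optional[str]) -> Dict[str, Optional[str]]:
    """Split an 'AS64500 Example Net'-style string into ASN and name."""
    if not org:
        return {"asn": None, "as_name": None}
    first, _, rest = org.partition(" ")
    if first.upper().startswith("AS") and first[2:].isdigit():
        return {"asn": first.upper(), "as_name": rest or None}
    # No recognizable ASN prefix: keep the whole string as the name.
    return {"asn": None, "as_name": org}


class IpEnricher:
    """Adds geo and ASN fields to IP indicators, caching lookups per address."""

    def __init__(self, lookup: Callable[[str], dict]):
        self._lookup = lookup  # hypothetical: wraps the real API client
        self._cache: Dict[str, dict] = {}

    def enrich(self, event: dict) -> dict:
        if event.get("ioc_type") != "ip":
            return event  # non-IP indicators pass through unchanged
        ip = event["value"]
        if ip not in self._cache:
            self._cache[ip] = self._lookup(ip)
        info = self._cache[ip]
        return {**event, "country": info.get("country"), **parse_org(info.get("org"))}
```

Injecting the lookup keeps the enrichment logic testable offline and makes it straightforward to swap in a rate-limited client later.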

3. Transformation (dbt)

🔜 Normalize into structured models:
- stg_raw_iocs — base extraction
- dim_ips, dim_domains, dim_hashes — dimension tables
- fct_threat_events — enriched and timestamped threat data
📊 Build insights such as:
- IOC distribution by country, type, source
- Recurring IPs and campaign freshness timelines
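
The planned models are dbt (SQL) models, which do not exist yet. Purely to illustrate the intended shape, the Python sketch below splits staged rows into an IP dimension and a fact list, then derives two of the insights listed above. The row fields (`value`, `ioc_type`, `source`, `seen_at`, `country`) and the function names are assumptions for this example, not the project's schema.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_models(staged: List[dict]) -> Tuple[Dict[str, dict], List[dict]]:
    """Split staged rows into an IP dimension and a list of threat-event facts."""
    dim_ips: Dict[str, dict] = {}
    fct_threat_events: List[dict] = []
    for row in staged:
        if row["ioc_type"] == "ip":
            # Dimension: one row per address; the first sighting's attributes win.
            dim_ips.setdefault(
                row["value"], {"ip": row["value"], "country": row.get("country")}
            )
        # Fact: one row per observation, keyed back to the indicator value.
        fct_threat_events.append(
            {
                "ioc_value": row["value"],
                "ioc_type": row["ioc_type"],
                "source": row["source"],
                "seen_at": row["seen_at"],
            }
        )
    return dim_ips, fct_threat_events


def iocs_by_type_and_source(facts: List[dict]) -> Counter:
    """Count observations per (ioc_type, source) pair."""
    return Counter((f["ioc_type"], f["source"]) for f in facts)


def recurring_ips(facts: List[dict], min_sightings: int = 2) -> List[str]:
    """Return IPs observed at least min_sightings times, sorted for stable output."""
    counts = Counter(f["ioc_value"] for f in facts if f["ioc_type"] == "ip")
    return sorted(ip for ip, n in counts.items() if n >= min_sightings)
```

In dbt, the same split would be expressed as separate SQL models selecting from stg_raw_iocs, with the aggregates written as GROUP BY queries over fct_threat_events.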

4. Output & Analysis (Optional)

🔜 Jupyter / Streamlit App:
- Visual threat timelines
- IP heatmaps
- Top malicious infrastructures by ASN or registrar

pduebel/threat-intel-pipeline

Folders and files

Latest commit

History

Repository files navigation

🛡️ Threat Intelligence Pipeline (Work In Progress)

⚠️ Work In Progress

🧱 Planned Architecture Overview

1. Ingestion

2. Enrichment

3. Transformation (dbt)

4. Output & Analysis (Optional)

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages