Lightweight, open-source, locally hosted Modern Data Stack
- Extract and Load: Polars and dlt
- Data Quality: Pandera
- Storage: DuckDB
- Transformation: dbt
- Orchestration: Prefect
- Visualization: Dash
Prerequisites: install Git and uv.
Clone repository and change directory:
git clone https://github.com/esadek/mini-mds.git
cd mini-mds
Extract, validate, load and transform data:
uv run prefect/elt.py
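The four steps named above (extract, validate, load, transform) run in a fixed order. The sketch below illustrates that ordering only and is not the contents of prefect/elt.py. It deliberately uses just the Python standard library: `csv` stands in for Polars, plain type checks for Pandera, and `sqlite3` for dlt, DuckDB and dbt. The sample data, the table names (`raw_orders`, `order_summary`) and all function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the project's real CSV source is not shown here.
RAW_CSV = "id,amount\n1,10.5\n2,4.0\n3,7.25\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows (Polars stand-in)."""
    return list(csv.DictReader(io.StringIO(text)))


def validate(rows):
    """Validate: coerce types and reject bad rows (Pandera stand-in)."""
    clean = []
    for row in rows:
        record = {"id": int(row["id"]), "amount": float(row["amount"])}
        if record["amount"] < 0:
            raise ValueError(f"negative amount in row {record['id']}")
        clean.append(record)
    return clean


def load(conn, rows):
    """Load: write validated rows to a raw table (dlt + DuckDB stand-in)."""
    conn.execute("CREATE TABLE raw_orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO raw_orders VALUES (:id, :amount)", rows)


def transform(conn):
    """Transform: derive a summary table with SQL (dbt stand-in)."""
    conn.execute(
        "CREATE TABLE order_summary AS "
        "SELECT COUNT(*) AS n_orders, SUM(amount) AS total FROM raw_orders"
    )


def run_elt(text=RAW_CSV):
    """Run the four steps in order and return the summary row."""
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, validate(extract(text)))
        transform(conn)
        return conn.execute(
            "SELECT n_orders, total FROM order_summary"
        ).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_elt())
```

Validation sits before the load so that a bad row stops the run before anything is written to the warehouse. In the actual stack each of these functions would presumably be a Prefect task calling the corresponding library.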
Visualize data:
uv run dash/app.py
flowchart LR
    A(CSV) --> B[Polars]
    subgraph Prefect
        B --> C[Pandera]
        C --> D[dlt]
        E[dbt Core]
    end
    D --> F[(DuckDB)]
    E <--> F
    F --> G[Dash]
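One way to read the diagram is as a dependency graph. The sketch below encodes its arrows with the standard-library `graphlib` module and derives a valid run order. It is an illustration, not part of the project. One assumption is made: the two-way arrow between dbt Core and DuckDB is reduced to "dbt Core depends on DuckDB", since a dependency graph cannot contain a cycle.

```python
from graphlib import TopologicalSorter

# Each key depends on the nodes in its set, following the diagram's arrows.
# Assumption: the bidirectional dbt Core <--> DuckDB edge is modeled as
# dbt Core depending on DuckDB (dbt runs once the raw data is loaded).
dependencies = {
    "Polars": {"CSV"},
    "Pandera": {"Polars"},
    "dlt": {"Pandera"},
    "DuckDB": {"dlt"},
    "dbt Core": {"DuckDB"},
    "Dash": {"DuckDB"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())

if __name__ == "__main__":
    print(" -> ".join(order))
```

The relative order of dbt Core and Dash is not fixed by the diagram, so either may come last; everything up to DuckDB is a single chain.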
mini-mds
├── .github/ # GitHub workflows
├── dash/ # Dash application
├── dbt/ # dbt project
├── duckdb/ # DuckDB warehouse
├── prefect/ # Prefect workflows
├── .editorconfig # Editor configuration
├── .gitignore # Untracked files to ignore
├── .python-version # Default Python version
├── LICENSE # MIT license
├── pyproject.toml # Project metadata
├── README.md # Documentation
└── uv.lock # Dependency lockfile