🧭 Kairon

A practical vector database from first principles. Built for clarity, correctness, and performance.

Kairon — derived from Kairos (καιρός).

Kairon is a production-quality vector database designed from first principles, implementing multiple indexing strategies (HNSW, KD-Tree, IVF) with deterministic testing, reproducibility, and practical features like metadata filtering and hybrid search.

Quick Start

from kairon import HNSWIndex
import numpy as np

# Generate some vectors
points = np.random.randn(1000, 128).astype(np.float32)

# Build index
index = HNSWIndex.build(points, M=16, ef_construct=200)

# Save index
index.save('models/hnsw.idx')

# Search
query = np.random.randn(128).astype(np.float32)
results = index.search(query, k=10, ef_search=100)
print(results)  # List of (id, distance) tuples

Features

Multiple Index Types: HNSW, KD-Tree, and IVF (Inverted File Index)
Metadata Filtering: Filter results by metadata predicates
Hybrid Search: Combine vector similarity with metadata scores
Persistence: Save and load indices with versioned format
Incremental Updates: Add vectors to existing indices (deletion via metadata tombstoning is planned)
Deterministic: All operations use seeded RNG for reproducibility
Benchmarks: Comprehensive benchmark harness with recall/QPS metrics

Repository Structure

/kairon/
  src/
    kairon/            # Main package
  tests/               # Test suite
  examples/            # Usage examples
  bench/               # Benchmark scripts and datasets
  docs/                # Design documentation

Installation

pip install -e .

Index Types

HNSW (Hierarchical Navigable Small World)

Fast approximate nearest neighbor search using multi-layer graphs. Recommended for high-dimensional data.

index = HNSWIndex.build(points, M=16, ef_construct=200)
results = index.search(query, k=10, ef_search=100)

KD-Tree

Balanced binary tree with dimension-based splits. Good for low-dimensional data (< 10 dimensions).

from kairon import KDIndex

index = KDIndex.build(points, leaf_size=10)
results = index.search(query, k=10)
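
To illustrate what "dimension-based splits" means, here is a minimal, self-contained KD-tree sketch in NumPy. It is not Kairon's implementation: the function names build_kd and search_kd, the dict-based node layout, and the use of squared Euclidean distance are all assumptions made for this example. Only the leaf_size parameter name is taken from the snippet above.

```python
import heapq
import numpy as np

def build_kd(points, ids, depth=0, leaf_size=10):
    """Recursively split ids at the median of one axis, cycling through axes by depth."""
    if len(ids) <= leaf_size:
        return {"leaf": True, "ids": ids}
    axis = depth % points.shape[1]
    order = ids[np.argsort(points[ids, axis])]
    mid = len(order) // 2
    return {
        "leaf": False,
        "axis": axis,
        "split": float(points[order[mid], axis]),
        "left": build_kd(points, order[:mid], depth + 1, leaf_size),
        "right": build_kd(points, order[mid:], depth + 1, leaf_size),
    }

def search_kd(node, points, query, k, heap=None):
    """Exact k-NN. heap stores (-squared_distance, id), so heap[0] is the worst kept candidate."""
    if heap is None:
        heap = []
    if node["leaf"]:
        for i in node["ids"]:
            d = float(((points[i] - query) ** 2).sum())
            if len(heap) < k:
                heapq.heappush(heap, (-d, int(i)))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, int(i)))
        return heap
    diff = float(query[node["axis"]]) - node["split"]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    search_kd(near, points, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the current k-th best.
    if len(heap) < k or diff * diff < -heap[0][0]:
        search_kd(far, points, query, k, heap)
    return heap

# Hypothetical demo data, seeded for reproducibility.
rng = np.random.default_rng(0)
points = rng.standard_normal((300, 3)).astype(np.float32)
query = rng.standard_normal(3).astype(np.float32)

root = build_kd(points, np.arange(len(points)), leaf_size=10)
heap = search_kd(root, points, query, k=5)
results = sorted((-neg_d, i) for neg_d, i in heap)  # (squared distance, id), nearest first
```

The pruning test in search_kd is the reason KD-trees suit low-dimensional data: as the number of dimensions grows, the splitting plane is almost always closer than the current k-th best distance, so few subtrees can be skipped.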

IVF (Inverted File Index)

Coarse quantization with inverted lists. Efficient for very large datasets.

from kairon import IVFIndex

index = IVFIndex.build(points, nlist=100, nprobe=10)
results = index.search(query, k=10, nprobe=10)
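
The idea behind coarse quantization with inverted lists can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not Kairon's code: build_ivf, search_ivf, and sq_dists are hypothetical names, the clustering is a plain Lloyd-style k-means with a fixed iteration count, and distances are squared Euclidean. The nlist and nprobe parameter names mirror the snippet above.

```python
import numpy as np

def sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

def build_ivf(points, nlist, n_iter=10, seed=0):
    """Cluster points into nlist cells and record which point ids fall in each cell."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct, randomly chosen data points.
    centroids = points[rng.choice(len(points), size=nlist, replace=False)]
    for _ in range(n_iter):
        assign = sq_dists(points, centroids).argmin(axis=1)
        for c in range(nlist):
            members = points[assign == c]
            if len(members) > 0:  # leave empty cells where they are
                centroids[c] = members.mean(axis=0)
    assign = sq_dists(points, centroids).argmin(axis=1)
    lists = [np.flatnonzero(assign == c) for c in range(nlist)]
    return centroids, lists

def search_ivf(points, centroids, lists, query, k, nprobe):
    """Scan only the nprobe cells whose centroids are closest to the query."""
    probe = np.argsort(((centroids - query) ** 2).sum(axis=1))[:nprobe]
    candidates = np.concatenate([lists[c] for c in probe])
    d = ((points[candidates] - query) ** 2).sum(axis=1)
    order = np.argsort(d)[:k]
    return [(int(candidates[i]), float(d[i])) for i in order]

# Hypothetical demo data, seeded for reproducibility.
rng = np.random.default_rng(0)
points = rng.standard_normal((500, 16)).astype(np.float32)
query = rng.standard_normal(16).astype(np.float32)

centroids, lists = build_ivf(points, nlist=10)
results = search_ivf(points, centroids, lists, query, k=5, nprobe=3)
```

In this sketch, raising nprobe trades speed for recall: with nprobe equal to nlist every cell is scanned and the result matches an exhaustive search.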

Metadata and Hybrid Search

# Add metadata to vectors
metadata = [
    {"severity": "P1", "service": "auth", "timestamp": 1680000000},
    {"severity": "P2", "service": "api", "timestamp": 1680001000},
    # ...
]
index.add_metadata(metadata)

# Filter by metadata
filters = {"severity": {"eq": "P1"}, "service": {"eq": "auth"}}
results = index.search(query, k=10, filters=filters)

# Hybrid search: combine vector similarity + metadata score
results = index.search(query, k=10, hybrid_weight=0.7)
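
For readers wondering how such filters and weights might behave, the following sketch evaluates the {"field": {"eq": value}} filter shape shown above against plain dicts and blends two scores with a weight. Everything here is an assumption made for illustration: matches and hybrid_score are hypothetical helpers, only the "eq" operator is handled, and the choice to apply the weight to the vector-similarity term (and to map distance to similarity as 1 / (1 + distance)) is a guess, not Kairon's documented behavior.

```python
def matches(record, filters):
    """Return True if record satisfies every {"field": {"eq": value}} clause."""
    for field, clause in filters.items():
        for op, expected in clause.items():
            if op != "eq":
                raise ValueError(f"unsupported operator: {op}")
            if record.get(field) != expected:
                return False
    return True

def hybrid_score(distance, metadata_score, weight):
    """Blend vector similarity and a metadata score; weight applies to the vector term."""
    similarity = 1.0 / (1.0 + distance)  # maps a non-negative distance into (0, 1]
    return weight * similarity + (1.0 - weight) * metadata_score

metadata = [
    {"severity": "P1", "service": "auth", "timestamp": 1680000000},
    {"severity": "P2", "service": "api", "timestamp": 1680001000},
]
filters = {"severity": {"eq": "P1"}, "service": {"eq": "auth"}}

# Ids of records that pass every clause of the filter.
kept = [i for i, record in enumerate(metadata) if matches(record, filters)]
```

All clauses are combined with a logical AND in this sketch, which matches the most natural reading of the filter dict above.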

Benchmarks

Run benchmarks to evaluate performance:

cd bench
python run_bench.py --dataset synthetic --n 10000 --dim 128

This produces bench_results.json and visualization PNGs in bench/output/.
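
The recall and QPS metrics mentioned under Features can be computed roughly as follows. This is a generic sketch, not the contents of run_bench.py: brute_force_knn, recall_at_k, and measure_qps are hypothetical helper names, and the ground truth is an exhaustive squared-Euclidean scan.

```python
import time
import numpy as np

def brute_force_knn(points, query, k):
    """Exact nearest neighbours by exhaustive scan, used as ground truth."""
    d = ((points - query) ** 2).sum(axis=1)
    return np.argsort(d)[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true neighbours that appear in the approximate result."""
    exact = {int(i) for i in exact_ids}
    hits = sum(1 for i in approx_ids if int(i) in exact)
    return hits / len(exact)

def measure_qps(search_fn, queries):
    """Queries per second for a callable that takes a single query vector."""
    start = time.perf_counter()
    for q in queries:
        search_fn(q)
    elapsed = time.perf_counter() - start
    return len(queries) / max(elapsed, 1e-12)

# Hypothetical demo data, seeded for reproducibility.
rng = np.random.default_rng(0)
points = rng.standard_normal((200, 8)).astype(np.float32)
queries = rng.standard_normal((5, 8)).astype(np.float32)

exact = brute_force_knn(points, queries[0], k=10)
recall = recall_at_k(exact, exact)  # brute force compared against itself
qps = measure_qps(lambda q: brute_force_knn(points, q, 10), queries)
```

To evaluate an approximate index, pass the ids it returns as approx_ids and wrap its search call in the lambda given to measure_qps.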

Testing

pytest tests/

All tests use deterministic seeds for reproducibility.
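
A seeded test in this style might look like the sketch below. It assumes NumPy's Generator API for seeding; make_dataset and the test function are hypothetical names, and how Kairon's own tests pass seeds around is not shown here.

```python
import numpy as np

def make_dataset(seed, n=100, dim=8):
    """Generate a float32 dataset that depends only on the seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n, dim)).astype(np.float32)

def test_dataset_is_reproducible():
    # The same seed must yield bit-identical data; a different seed must not.
    assert np.array_equal(make_dataset(seed=42), make_dataset(seed=42))
    assert not np.array_equal(make_dataset(seed=42), make_dataset(seed=43))

a = make_dataset(seed=42)
b = make_dataset(seed=42)
```

Creating a local generator inside each function, rather than calling np.random.seed, keeps tests independent of execution order.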

Documentation

DESIGN.md - Architecture and design decisions
TUNING.md - Tuning guide for production workloads

License

MIT

References

Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Jégou, H., Douze, M., & Schmid, C. (2010). Product quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence.
