A lightning-fast FastAPI service for advanced agentic RAG (Retrieval-Augmented Generation), powered by Ollama and Qdrant.
- Hybrid Search: Combines dense (semantic) and sparse (keyword) search
- Advanced Reranking: Multiple reranking strategies for improved results
- Project Management: Organize documents into projects with sharing capabilities
- Multiple Query Types: Raw queries, LLM-powered queries, streaming, and custom prompts
- Document Processing: Support for PDF, TXT, MD, DOCX, and URL ingestion
- Production Ready: Comprehensive testing, migrations, and safety features
# Clone the repository
git clone <repository>
cd zero-query-ai
# Start development environment with hot reload (default)
./start.sh
# Or explicitly specify development mode
./start.sh dev
# Access the application
# - API: http://localhost:8000/docs
# - Frontend: http://localhost:3100
# Clone the repository
git clone <repository>
cd zero-query-ai
# Start production environment
./start.sh prod
# Access the application
# - API: http://localhost:8000/docs
# - Frontend: http://localhost:3100
- Clone and set up:
git clone <repository>
cd zero-query-ai
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Configure environment:
cp env.example .env
# Edit .env with your database and Ollama settings
- Start services:
# Start PostgreSQL and Qdrant
docker-compose up -d
# Start the API
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
- Test the API:
# Check health
curl http://localhost:8000/health
# Upload a document
curl -X POST "http://localhost:8000/ingest" \
-F "file=@your_document.pdf"
# Query documents
curl -X POST "http://localhost:8000/query" \
-H "Content-Type: application/json" \
-d '{"query": "What is machine learning?"}'
- Development Setup: See DEVELOPMENT.md for detailed development instructions
- Hot Reload: Automatic restart on code changes
- Volume Mounting: Live editing with instant updates
- Interactive Docs: Visit http://localhost:8000/docs for Swagger UI
- ReDoc: Visit http://localhost:8000/redoc for alternative documentation
- cURL Testing: See tests/TESTING.md for comprehensive API testing examples
- Test Suite: Run `python tests/run_all.py` for automated testing
- Seed Data: Use `python seed/seed_runner.py` to populate with sample data
├── app/ # Main application code
│ ├── main.py # FastAPI application and endpoints
│ ├── models.py # Pydantic models and schemas
│ ├── database.py # Database models and connection
│ ├── document_processor.py # Document processing logic
│ ├── embedding_service.py # Vector search and embeddings
│ ├── llm_service.py # LLM integration with Ollama
│ ├── reranker_service.py # Document reranking
│ └── config.py # Configuration settings
├── tests/ # Test suite and utilities
│ ├── test_suite.py # Comprehensive test suite
│ ├── run_all.py # Test runner with safety checks
│ └── TESTING.md # Manual API testing guide
├── seed/ # Seed data and utilities
│ ├── seed_runner.py # Seed data runner
│ └── files/ # Sample documents
└── uploads/ # Document upload directory
- `GET /` - API information
- `GET /health` - System health check
- `POST /ingest` - Upload and process documents
- `POST /ingest/url` - Ingest documents from URLs
- `GET /documents` - List all documents
- `DELETE /documents/{id}` - Delete documents
- `POST /query` - Advanced RAG queries with LLM
- `POST /query/raw` - Raw search without LLM
- `POST /query/stream` - Streaming responses
- `POST /search/hybrid` - Hybrid search endpoint
- `POST /rerank` - Document reranking
- `POST /projects` - Create projects
- `GET /projects` - List projects
- `DELETE /projects/{id}` - Delete projects
- `GET /projects/{id}/documents` - List project documents
- `POST /projects/{id}/search` - Project-specific search (example below)
- `POST /projects/{id}/documents/{doc_id}/share` - Share documents
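As an illustration, a project-scoped search might be called like this. The request body mirrors the top-level `/query` example above; the exact payload fields and response shape are assumptions, not a documented contract.

```python
# Hypothetical call to the project-specific search endpoint.
import requests

project_id = 1  # placeholder project ID
response = requests.post(
    f"http://localhost:8000/projects/{project_id}/search",
    json={"query": "What is machine learning?"},  # assumed payload shape
)
response.raise_for_status()
# "results" is an assumed response field; inspect the actual schema at /docs.
for hit in response.json().get("results", []):
    print(hit)
```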
- Dense Search: Semantic search using embeddings for understanding meaning and context.
- Sparse Search: Keyword-based search using TF-IDF for exact term matching.
- Hybrid Search: Combines dense and sparse search with configurable weights (alpha parameter); see the fusion sketch below.
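To make the alpha parameter concrete, here is a minimal sketch of weighted score fusion. The function and its min-max normalization are illustrative assumptions, not the actual logic in `embedding_service.py`.

```python
# Illustrative sketch of alpha-weighted hybrid score fusion.
def hybrid_scores(dense: dict[str, float], sparse: dict[str, float],
                  alpha: float = 0.5) -> dict[str, float]:
    """Blend dense and sparse scores: alpha=1.0 is pure semantic search,
    alpha=0.0 is pure keyword search."""
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        # Min-max normalize so the two score scales are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    dense_n, sparse_n = normalize(dense), normalize(sparse)
    return {doc: alpha * dense_n.get(doc, 0.0) + (1 - alpha) * sparse_n.get(doc, 0.0)
            for doc in dense_n.keys() | sparse_n.keys()}
```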
- Raw Query: Direct search results without LLM processing, useful for custom pipelines.
- LLM Query: Full RAG with LLM-generated responses based on retrieved context.
- Streaming Query: Real-time responses with progressive LLM output generation (see the client sketch after this list).
- Custom Prompts: User-defined prompts for specialized use cases.
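A streaming response could be consumed like this; the sketch assumes the endpoint streams plain-text chunks of the answer, which may not match the actual wire format.

```python
# Hypothetical client for POST /query/stream.
import requests

with requests.post(
    "http://localhost:8000/query/stream",
    json={"query": "What is machine learning?"},
    stream=True,
) as response:
    response.raise_for_status()
    # Print each chunk of the LLM answer as it arrives.
    for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
        if chunk:
            print(chunk, end="", flush=True)
```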
- None: No reranking applied; returns original search order.
- Simple: Basic relevance scoring based on query-document similarity (sketched below).
- Advanced: Sophisticated semantic reranking with context understanding.
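To make the simple strategy concrete, here is a minimal similarity-based reordering sketch; the embeddings are stand-ins, and the real `reranker_service.py` logic may differ.

```python
# Illustrative sketch of simple similarity-based reranking.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rerank(query_vec: list[float],
           docs: list[tuple[str, list[float]]]) -> list[str]:
    """Reorder (doc_id, embedding) pairs by similarity to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked]
```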
# Database
DATABASE_URL=postgresql://user:password@localhost:5432/rag_db
# Qdrant
QDRANT_URL=http://localhost:6333
QDRANT_COLLECTION=documents
# Ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama2
# File Upload
MAX_FILE_SIZE=10485760 # 10MB
UPLOAD_DIR=uploads
# Security
SECRET_KEY=your-secret-key
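For reference, `app/config.py` might load these variables along the following lines; this is a sketch assuming pydantic-settings, not necessarily how the module is actually written.

```python
# Hypothetical sketch of app/config.py using pydantic-settings.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str = "postgresql://user:password@localhost:5432/rag_db"
    qdrant_url: str = "http://localhost:6333"
    qdrant_collection: str = "documents"
    ollama_base_url: str = "http://localhost:11434"
    ollama_model: str = "llama2"
    max_file_size: int = 10_485_760  # 10 MB
    upload_dir: str = "uploads"
    secret_key: str = "your-secret-key"

settings = Settings()
```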
# Run all tests with safety checks
python tests/run_all.py
# Run specific test suite
python -m pytest tests/test_suite.py -v
# Migrations run automatically on startup
# To run them manually:
python -c "from app.migrations import run_migrations; run_migrations()"
# Populate with sample data
python seed/seed_runner.py
# Build and run with Docker Compose
docker-compose up -d
# Or build custom image
docker build -t rag-api .
docker run -p 8000:8000 rag-api
The API includes safety features to prevent accidental production data deletion:
- Environment checks in test runners (see the sketch after this list)
- Confirmation prompts for destructive operations
- Automatic cleanup of test data
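A guard like the following sketch could implement the environment check; the `ENVIRONMENT` variable name and prompt wording are assumptions, not the actual code in `tests/run_all.py`.

```python
# Hypothetical environment guard for destructive test runs.
import os
import sys

def ensure_safe_environment() -> None:
    """Abort if we appear to be pointed at production data."""
    env = os.getenv("ENVIRONMENT", "development").lower()  # assumed variable
    if env == "production":
        sys.exit("Refusing to run destructive tests against production.")
    # Require explicit confirmation even outside production.
    answer = input("Tests may delete data. Continue? [y/N] ")
    if answer.strip().lower() != "y":
        sys.exit("Aborted by user.")
```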
We welcome contributions! Please see CONTRIBUTING.md for detailed guidelines.
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Zero Query AI is an open source project. We believe in the power of community-driven development and welcome contributions from developers worldwide. This project serves as both a learning resource and a foundation for building advanced RAG applications.
- Community-driven innovation: Harness the collective intelligence of the developer community
- Transparency: Open development process builds trust and credibility
- Learning resource: Help others learn about advanced RAG techniques
- Ecosystem growth: Contribute to the broader AI/ML open source ecosystem