Production-ready intelligent document search with Neo4j graph database + Azure AI Foundry
Transform your documents into an intelligent knowledge base combining Neo4j's graph database with retrieval-augmented generation. Currently deployed with 12 technical books (30,006 chunks, 25.9 GB indexed) on Neo4j Aura, ready for Azure AI Foundry integration.
- Stores documents in Neo4j Aura with vector embeddings (384-dim) and graph relationships
- Searches intelligently using hybrid vector + keyword search with 100% embedding coverage
- Generates answers through Azure AI Foundry Assistant (gpt-4o-mini) with custom RAG functions
- Scales flexibly from local development to enterprise Azure deployment
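The hybrid vector + keyword search mentioned above can be sketched in a few lines. This is a toy illustration of the scoring idea only; the function names, the 0.7 weighting, and the 2-dimensional vectors are illustrative assumptions, not the project's actual implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    """Fraction of query terms that appear in the chunk text."""
    terms = query.lower().split()
    hits = sum(1 for t in terms if t in text.lower())
    return hits / len(terms) if terms else 0.0

def hybrid_search(query, query_vec, chunks, alpha=0.7):
    """Rank chunks by a weighted blend of vector and keyword scores.

    chunks: list of dicts with 'text' and 'embedding' keys.
    alpha: weight given to the vector score (hypothetical default).
    """
    scored = [
        (alpha * cosine(query_vec, c["embedding"])
         + (1 - alpha) * keyword_score(query, c["text"]), c)
        for c in chunks
    ]
    return [c for s, c in sorted(scored, key=lambda p: p[0], reverse=True)]

chunks = [
    {"text": "Neo4j is a graph database", "embedding": [1.0, 0.0]},
    {"text": "Vector search finds similar items", "embedding": [0.0, 1.0]},
]
top = hybrid_search("graph database", [0.9, 0.1], chunks)
print(top[0]["text"])  # the Neo4j chunk ranks first
```

In the deployed system the vector side runs against Neo4j's 384-dim vector index and the keyword side against its full-text index, with the blending done in the RAG service.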
Production Aura Instance: 6b870b04 (westeurope)
- 12 PDF books: Neo4j, Graph Theory, RAG Systems, ML/GNN, Knowledge Graphs
- 30,006 text chunks: 100% embedded with SentenceTransformers (all-MiniLM-L6-v2)
- 25.9 GB indexed: Technical content from O'Reilly, Manning, arXiv, Neo4j Official
- Categories: Neo4j (59%), Graph Theory/ML (32%), RAG (5%), Knowledge Graphs (3%), Vector DBs (1%)
See: AURA_DATABASE_ANALYSIS_REPORT.md for complete analysis
Flexible deployment architecture supporting both local development and enterprise Azure production. The system uses the same codebase for both environments, enabling seamless transition from development to production.
Core Components:
- Neo4j Database: Graph storage with vector + keyword search (local Docker or Aura managed)
- RAG Service: FastAPI + Docling + SentenceTransformers for intelligent retrieval
- LLM: BitNet.cpp (local) or Azure AI Foundry Assistant (production)
- Streamlit UI: Interactive chat interface (local development only)
Complete containerized stack for local development with zero external dependencies. All components run on your machine with Docker for easy setup and teardown.
Use Cases: Development, testing, demos, sensitive data, offline environments, learning
Benefits: Zero cloud costs, complete data sovereignty, no external dependencies, full control
Performance: 417x faster search, 87% memory reduction vs traditional systems
graph LR
USER[👤 User<br/>Web Browser]
subgraph Docker["🐳 Docker Containers"]
UI[💬 Streamlit UI<br/>Document Upload<br/>Q&A Interface]
RAG[⚡ RAG Service<br/>FastAPI + Docling<br/>SentenceTransformers]
DB[(🗄️ Neo4j Database<br/>Vector + Keyword Search<br/>417x Faster)]
BitNet[🤖 BitNet.cpp<br/>1.58-bit LLM<br/>87% Memory Reduction]
end
USER -->|Interact| UI
UI -->|Query| RAG
RAG -->|Search| DB
DB -->|Context| RAG
RAG -->|Generate| BitNet
BitNet -->|Answer| UI
UI -->|Display| USER
classDef userClass fill:#e1f5ff,stroke:#01579b,stroke-width:2px
classDef containerClass fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
class USER userClass
class UI,RAG,DB,BitNet containerClass
Production deployment with Azure AI Foundry and Neo4j Aura. Managed services with automatic scaling, enterprise-grade AI, and zero infrastructure management overhead.
Current Production State:
- ✅ Neo4j Aura instance 6b870b04 (westeurope) with 12 books, 30K chunks
- ✅ Azure AI Foundry Assistant configured with custom RAG functions
- ✅ Credentials secured in Azure Key Vault kv-neo4j-rag-7048
Use Cases: Production apps, customer-facing services, enterprise knowledge bases, M365/Teams integration
Benefits: Auto-scaling, managed infrastructure, enterprise AI (gpt-4o-mini), high availability
Scalability: 0-10 replicas, 100+ concurrent users, serverless Container Apps
graph LR
subgraph Internet["🌐 Internet"]
USER[👤 End Users<br/>API Clients]
end
subgraph Azure["☁️ Azure Cloud"]
subgraph AIFoundry["🤖 Azure AI Foundry"]
ASSISTANT["Azure AI Assistant<br/>──────────────<br/>ID: asst_LHQBX...<br/>Model: gpt-4o-mini<br/>Deployment: 2025-08-07<br/>──────────────<br/>Functions:<br/>• search_knowledge_base<br/>• add_document<br/>• get_statistics"]
end
subgraph CAE["Container Apps"]
RAG["⚡ RAG Service<br/>CPU: 2, RAM: 4GB<br/>Replicas: 0-10<br/>──────────────<br/>FastAPI + Docling<br/>SentenceTransformers<br/>Connection Pool<br/>Query Cache"]
end
subgraph Data["🗄️ Data Layer"]
AURA[(Neo4j Aura<br/>Managed Database<br/>──────────────<br/>Vector Index 384-dim<br/>Full-text Index<br/>Graph Relationships)]
end
end
%% Left to Right Flow
USER -->|HTTPS API| ASSISTANT
ASSISTANT -->|Function Calls| RAG
RAG <-->|Bolt 7687| AURA
classDef userClass fill:#e1f5ff,stroke:#01579b,stroke-width:2px
classDef assistantClass fill:#ffcccc,stroke:#cc0000,stroke-width:2px
classDef containerClass fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
classDef dbClass fill:#fce4ec,stroke:#c2185b,stroke-width:2px
class USER userClass
class ASSISTANT assistantClass
class RAG containerClass
class AURA dbClass
Your Aura database contains a comprehensive collection of technical books:
Neo4j & Graph Databases (5 books, 17,656 chunks):
- O'Reilly Graph Databases 2nd Edition
- Beginning Neo4j (Apress)
- Learning Neo4j eBook
- Graph Databases for Beginners
- Graph Databases 2nd Edition
Graph Theory & ML (4 books, 9,555 chunks):
- Deep Learning on Graphs (Yao Ma & Jiliang Tang)
- Graph Representation Learning (William Hamilton, McGill)
- RAG for LLMs: A Survey (arXiv 2312.10997)
- 5 Graph Data Science Basics
Specialized Topics (3 books, 2,795 chunks):
- O'Reilly: RAG in Production with Haystack
- Knowledge Graphs: Data in Context (Neo4j)
- Vector Database Management Systems (arXiv 2309.11322)
Explore with Cypher Queries:
- 45 Copy-Paste Queries - Complete Cypher guide with graph DB benefits
- Results Explained - Non-technical explanation of your data
- Browser Queries - Basic statistics and search
- Content Analysis - Advanced topic analysis
- Enhanced Queries - Visualization and utilities
Manage Knowledge Base:
# View statistics
python scripts/rag_statistics.py
# Upload more PDFs
python scripts/upload_pdfs_to_neo4j.py --target aura
# Run Cypher analysis
python scripts/run_cypher_analysis.py

Prerequisites:
- Docker Desktop installed and running
- Python 3.11+
- 4GB+ RAM available
- x86_64 or ARM64 architecture
The Docker Compose configuration automatically sets up all four services (Neo4j, RAG, BitNet, Streamlit) with optimized memory settings, connection pooling, and intelligent caching. Everything runs locally on your machine with no external API calls or dependencies. Simply start the containers and access the Streamlit UI to begin chatting with your knowledge base immediately.
- Clone: git clone https://github.com/ma3u/neo4j-agentframework.git
- Start: docker-compose -f scripts/docker-compose.optimized.yml up -d
- Wait 1-2 minutes for all services to initialize
- Access Neo4j Browser: http://localhost:7474 (neo4j/password)
- Open Chat: http://localhost:8501
Interactive Mockup - Streamlit Chat UI with document upload and monitoring dashboard
What's Included:
- 🗄️ Neo4j Database (ports 7474, 7687)
- ⚡ RAG Service (port 8000)
- 🤖 BitNet LLM (port 8001) optional - local development only
- 💬 Streamlit Chat UI (port 8501) local development only
Neo4j Database + RAG Service + BitNet LLM running in Docker Desktop
For testing against the production knowledge base with 12 books and 30K chunks.
# Configure credentials (already done if you followed setup)
# Credentials in: neo4j-rag-demo/.env
# Instance: 6b870b04 (westeurope)
# Test connection
cd neo4j-rag-demo
source venv_local/bin/activate
python scripts/rag_statistics.py
# Test search
python scripts/rag_search_examples.py

For developers modifying RAG service code. Runs Neo4j in Docker, Python locally for debugging.
# Start local Neo4j
docker run -d --name neo4j-rag \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
neo4j:5.15-community
# Setup Python environment
cd neo4j-rag-demo
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Load sample data
python scripts/load_sample_data.py

# Health check
curl http://localhost:8000/health
# Test RAG query
curl -X POST "http://localhost:8000/query" \
-H "Content-Type: application/json" \
-d '{"question": "What is BitNet?", "max_results": 3}'
# Get system statistics
curl http://localhost:8000/stats

After starting the services, multiple web interfaces become available for different purposes: Streamlit for end-user chat interactions, the RAG API for programmatic access, Neo4j Browser for database inspection and Cypher queries, and Grafana for performance monitoring. Each interface serves a specific role in the development, testing, and operation of the knowledge base system.
- 💬 Streamlit Chat UI: http://localhost:8501 (in development) [NEW!]
- RAG API Documentation: http://localhost:8000/docs
- Neo4j Browser: http://localhost:7474 (neo4j/password)
- Monitoring Dashboard: http://localhost:3000 (admin/optimized-rag-2024)
📱 Streamlit Chat UI: Full-featured chat interface with document upload, monitoring dashboard, and real-time RAG responses. See Streamlit App Documentation for details.
Neo4j Browser with sample data and Cypher query interface
Core Operations: POST /query, POST /add-documents, GET /health, GET /stats
Documentation: See http://localhost:8000/docs for interactive API documentation
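The same /query endpoint can also be called from Python. A minimal standard-library sketch, mirroring the curl example in this guide (the question/max_results payload shape follows that example; nothing is assumed about the response fields):

```python
import json
import urllib.request

def build_query(question, max_results=3):
    """Build the JSON payload for POST /query (same shape as the curl example)."""
    return json.dumps({"question": question, "max_results": max_results}).encode()

def ask(question, base_url="http://localhost:8000"):
    """Send a question to the local RAG service and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/query",
        data=build_query(question),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# Example (requires the local stack to be running on port 8000):
# print(ask("What is BitNet?"))
```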
| Component | Traditional | This System | Improvement |
|---|---|---|---|
| Vector Search | Pinecone/Weaviate | Neo4j | 417x faster retrieval |
| Embeddings | OpenAI API ($50/mo) | SentenceTransformers | $50/month savings |
| LLM Memory | 8-16GB RAM | BitNet 1.5GB | 87% memory reduction |
| Deployment | Cloud-only | Local + Cloud | Complete flexibility |
| Image | Size | Description | Registry |
|---|---|---|---|
| bitnet-minimal | 334MB | Ultra-efficient, external model | ghcr.io/ma3u/ms-agentf-neo4j/bitnet-minimal |
| bitnet-optimized | 2.5GB | Balanced, embedded model | ghcr.io/ma3u/ms-agentf-neo4j/bitnet-optimized |
| bitnet-final | 3.2GB | Complete, all features | ghcr.io/ma3u/ms-agentf-neo4j/bitnet-final |
| rag-service | 2.76GB | High-performance RAG pipeline | ghcr.io/ma3u/ms-agentf-neo4j/rag-service |
Flexible configuration through environment variables for Neo4j connection, embedding models, and BitNet optimization settings. Docker Compose profiles enable different deployment modes: basic system for development, monitoring profile with Grafana/Prometheus for performance analysis, and testing profile for load testing. All settings are documented with sensible defaults that work out of the box.
Environment: NEO4J_URI, NEO4J_PASSWORD, EMBEDDING_MODEL, BITNET_MODE | Guide: Configuration
Docker Profiles: Basic: docker-compose up -d | Monitoring: --profile monitoring | Testing: --profile testing
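In code, these variables are typically read from the environment with local-development fallbacks. A minimal sketch; the defaults shown match the local Docker setup described above and should be treated as assumptions, not the service's exact defaults:

```python
import os

def load_config():
    """Read connection settings from the environment, falling back to local defaults.

    The default values below are the local Docker ones (assumed),
    overridden in production via Key Vault / Container App settings.
    """
    return {
        "neo4j_uri": os.getenv("NEO4J_URI", "bolt://localhost:7687"),
        "neo4j_password": os.getenv("NEO4J_PASSWORD", "password"),
        "embedding_model": os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        "bitnet_mode": os.getenv("BITNET_MODE", "local"),
    }

config = load_config()
print(config["embedding_model"])
```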
Your knowledge base is deployed and operational on Neo4j Aura with a comprehensive collection of 12 technical books. The database contains 30,006 embedded chunks covering Neo4j, Graph Databases, RAG systems, Machine Learning on Graphs, Knowledge Graphs, and Vector Databases. Azure AI Foundry Assistant is configured with custom functions to search this knowledge base and provide intelligent, grounded responses.
Aura Instance: 6b870b04 (westeurope) | Books: 12 PDFs | Chunks: 30,006 | Coverage: 100%
See AURA_DATABASE_ANALYSIS_REPORT.md for detailed analysis.
The automated deployment script creates all necessary Azure resources including Container Apps for the RAG service, Key Vault for secrets, Application Insights for monitoring, and configures networking between components. The entire process takes about 30 minutes and sets up a production-ready environment with auto-scaling and managed identity authentication to your existing Aura knowledge base.
Run ./scripts/azure-deploy-enterprise.sh to deploy RAG Container App
After deploying the RAG service, your Azure AI Foundry agent is already configured with custom tools for searching the knowledge base with 30K chunks, uploading new documents, and retrieving statistics. The AI agent leverages Neo4j's high-performance vector search for instant, grounded responses from your comprehensive technical library.
Your Assistant is ready: python scripts/configure-azure-assistant.py (if updates needed)
Azure AI Agent configured with Neo4j RAG functions
Your Assistant:
- ID: asst_LHQBXYvRhnbFo7KQ7IRbVXRR
- Model: gpt-4o-mini
- Tools: 4 Neo4j RAG functions
- Knowledge Base: 12 books, 30,006 chunks
Test in playground: Ask "What is Neo4j?" and verify it searches the knowledge base.
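The Assistant's custom tools are declared as JSON function definitions. Here is a sketch of what the search_knowledge_base tool might look like; the parameter names, descriptions, and default are illustrative assumptions, not the exact deployed schema (see scripts/configure-azure-assistant.py for the real one):

```python
# Hypothetical tool definition for the search_knowledge_base function.
# The deployed schema may differ; consult scripts/configure-azure-assistant.py.
search_tool = {
    "type": "function",
    "function": {
        "name": "search_knowledge_base",
        "description": "Hybrid vector + keyword search over the Neo4j Aura knowledge base.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Natural-language question to search for.",
                },
                "max_results": {
                    "type": "integer",
                    "description": "Number of chunks to return.",
                    "default": 5,
                },
            },
            "required": ["query"],
        },
    },
}

print(search_tool["function"]["name"])
```

When the Assistant decides to call this tool, it emits the arguments as JSON, the RAG Container App executes the search against Aura, and the result is fed back into the model for a grounded answer.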
📚 Integration Guides:
| Guide | Description | Status |
|---|---|---|
| Python SDK Integration | Complete guide using Azure AI Projects SDK with code examples | ✅ Ready |
| OpenAPI Configuration | Upload OpenAPI spec for function calling | ✅ Ready |
| Test Results | 20 comprehensive tests, 90% pass rate | ✅ Validated |
Quick Test (Python SDK):
# Install SDK
pip install azure-ai-projects azure-identity
# Test your assistant
export AZURE_AI_PROJECT_ENDPOINT="https://YOUR_PROJECT.api.azureml.ms"
python scripts/test_azure_ai_foundry.py

Test Validation: ✅ 18/20 tests passed (90% success rate), 310x cache speedup measured
See Azure Architecture for deployment architecture
Essential guides to get you up and running quickly, from complete developer journey to specific testing procedures. Each guide is self-contained with prerequisites, step-by-step instructions, and troubleshooting sections. Start with the Quick Start Guide for the fastest path to a working system.
| Document | Description |
|---|---|
| Quick Start Guide | Complete developer journey (local → Azure) |
| Streamlit Chat UI | Interactive chat interface documentation [under development] |
| Local Testing Guide | Comprehensive testing procedures |
| RAG Testing Guide | RAG-specific testing procedures |
| User Guide | End-user documentation |
Detailed guides for deploying to Azure Container Apps with comprehensive coverage of architecture decisions, security configuration, and operational procedures. Includes both automated deployment scripts and manual step-by-step instructions. Each guide explains cost considerations, scaling strategies, and monitoring setup for production environments.
| Document | Description |
|---|---|
| Azure Deployment Guide | Detailed Azure deployment steps |
| Azure Architecture | Azure architecture documentation |
| Basic Deployment | Quick deployment reference |
| BitNet Deployment | BitNet-specific deployment |
Complete guides for integrating Neo4j RAG with Azure AI Foundry, including Python SDK usage, OpenAPI configuration, and comprehensive test validation with 90% pass rate.
| Document | Description | Status |
|---|---|---|
| Python SDK Integration Guide | Complete Azure AI Projects SDK integration with working code examples | ✅ Ready |
| OpenAPI Setup Instructions | Step-by-step guide to upload OpenAPI spec and configure functions | ✅ Ready |
| Configuration Guide | Detailed configuration with troubleshooting and demo scripts | ✅ Ready |
| Test Results & Validation | 20 comprehensive tests, 90% pass rate, 310x cache speedup proven | ✅ Validated |
| Complete Summary | Full Issue #4 implementation with all deliverables | ✅ Complete |
Quick Start: pip install azure-ai-projects azure-identity then python scripts/test_azure_ai_foundry.py
Deep technical documentation covering system architecture, performance optimization strategies, and component integration details. Includes 17 Mermaid diagrams visualizing system flows, embedding model comparisons, and the complete BitNet build journey with lessons learned. Essential reading for understanding implementation decisions and optimization techniques.
| Document | Description |
|---|---|
| System Architecture | Complete architecture with 17 Mermaid diagrams |
| Embeddings Guide | Embedding models (all-MiniLM-L6-v2 vs Azure OpenAI) |
| BitNet Success Story | BitNet build journey & lessons learned |
| LLM Setup Guide | LLM configuration and setup |
| Performance Analysis | Detailed benchmarks & metrics |
| Document | Description |
|---|---|
| Neo4j Browser Guide | Neo4j Browser setup and usage |
| Knowledge Base Setup | Knowledge base download and configuration |
| Browser Setup Guides | Detailed browser configuration |
| Document | Description |
|---|---|
| Implementation Status | Current features & progress |
| Next Steps & Roadmap | Future improvements |
| Document | Description |
|---|---|
| Contributing Guide | How to contribute |
| Security Policy | Security guidelines & reporting |
| Claude Code Guide | AI assistant guidance |
| Document | Description |
|---|---|
| Archive Documentation | Historical references & summaries |
| Cost Optimization | Azure cost optimization strategies |
- API Documentation - Interactive API docs (when running locally)
- GitHub Repository - Source code & issues
- Release Notes - Version history
Contributions welcome through pull requests following the standard GitHub workflow. Fork the repository, create a feature branch, make changes with tests, and submit a PR for review.
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- Documentation: Wiki
- Discussions: GitHub Discussions