Welcome to a comprehensive guide to implementing RAG systems! This repository provides a structured approach to building and optimizing Retrieval-Augmented Generation (RAG) systems, from basic implementations to advanced techniques.
- RAG from Scratch
- Complete implementation guide from ground up
- RAG in 10 lines of code
- Understanding embeddings and similarity
- Basic requirements setup
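
To make the "embeddings and similarity" idea concrete, here is a minimal sketch in plain Python. It is not the repository's implementation: `embed` is a toy word-count stand-in for a real embedding model, and the function names are illustrative assumptions.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (assumed stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine_similarity(q, embed(d)), reverse=True)
    return ranked[:top_k]
```

In a real pipeline the retrieved passages would be placed into the LLM prompt as context; swapping `embed` for a dense embedding model leaves the ranking logic unchanged.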
-
- Basic server implementation
- Jupyter notebook tutorials
- Performance evaluation notebooks
- Environment setup guide
-
- BM25 algorithm implementation
- Application setup
- Interactive notebook examples
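
For readers new to BM25, the scoring function can be sketched in a few lines. This is a simplified illustration rather than the code shipped here: it assumes whitespace tokenization, the common defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant that stays non-negative.

```python
import math
from collections import Counter


def bm25_scores(query: str, documents: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    if not documents:
        return []
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that never mention a query term score exactly zero, and repeated mentions help with diminishing returns, which is the behavior that distinguishes BM25 from raw term counts.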
-
- Data chunking strategies
- Embedding generation
- Batch processing examples
- Data parsing techniques
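
One of the simplest chunking strategies is a fixed-size window with overlap. The sketch below assumes word-based sizes and a hypothetical helper name `chunk_text`; the actual notebooks may chunk by tokens, sentences, or document structure instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of chunk_size words, sharing `overlap` words between neighbors."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of some duplicated storage.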
-
- RAGAS metrics implementation
- DeepEval integration
- TruLens evaluation
- Test dataset examples
-
- System monitoring setup
- Performance tracking
- Debug tools integration
-
- Result re-ranking implementation
- Evaluation metrics
- Performance optimization
-
- Qdrant hybrid search implementation
- Multiple retrieval method integration
-
- Context window optimization
- Sentence-level retrieval
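
The idea behind sentence-level retrieval with a context window is to match on individual sentences but return the neighbors around the best match. A minimal sketch, assuming a naive regex sentence splitter and keyword overlap in place of embedding similarity (both are simplifications, not this project's method):

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_window_retrieve(text: str, query: str, window: int = 1) -> str:
    """Find the best-matching sentence and return it with `window` neighbors on each side."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    query_terms = set(re.findall(r"\w+", query.lower()))

    def overlap(sentence: str) -> int:
        return len(query_terms & set(re.findall(r"\w+", sentence.lower())))

    best = max(range(len(sentences)), key=lambda i: overlap(sentences[i]))
    lo = max(0, best - window)
    hi = min(len(sentences), best + window + 1)
    return " ".join(sentences[lo:hi])
```

Matching on small units keeps retrieval precise, while the surrounding window gives the LLM enough context to answer.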
-
- Automatic content merging
- Redundancy elimination
-
- HyDE (Hypothetical Document Embeddings)
- Query transformation techniques
- Query optimization strategies
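
HyDE embeds a hypothetical answer generated by an LLM instead of the raw query, then searches with that embedding. The sketch below shows only the control flow: `generate` and `embed` are caller-supplied callables (assumptions standing in for an LLM and an embedding model), not real library APIs.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_retrieve(
    query: str,
    documents: list[str],
    generate: Callable[[str], str],   # assumed: LLM call that drafts a hypothetical answer
    embed: Callable[[str], Vector],   # assumed: embedding model
    top_k: int = 3,
) -> list[str]:
    """Rank documents by similarity to the embedding of a hypothetical answer."""
    hypothetical_doc = generate(query)
    hyde_vec = embed(hypothetical_doc)
    ranked = sorted(documents, key=lambda d: cosine(hyde_vec, embed(d)), reverse=True)
    return ranked[:top_k]
```

The hypothetical answer does not need to be factually correct; it only needs to share vocabulary and phrasing with the documents that contain the real answer.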
-
- Self-querying mechanisms
- Query refinement techniques
-
- Multiple RAG model integration
- Result fusion strategies
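
A common way to fuse results from several retrievers or query variants is Reciprocal Rank Fusion (RRF). Whether this repository uses RRF specifically is an assumption; the sketch uses the conventional constant `k=60` and treats each input as a best-first list of document IDs.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF uses only ranks, it can combine retrievers whose raw scores are on incompatible scales, such as BM25 and cosine similarity.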
-
- Advanced reasoning implementation
- Performance optimization
-
- ColBERT model integration
- RAGatouille retriever implementation
-
- Graph-based retrieval
- Knowledge graph integration
-
- Multi-document agent system
- Domain-specific implementations
-
- GPT-4V integration
- Multi-modal retrieval implementation
Located in the `data/` directory:
- Markdown Documents (`md/`): Processed markdown versions of papers
- PDF Documents (`pdf/`): Original research papers and documentation
- Sample Database (`sample-lancedb/`): Example database implementation
- Simple RAG with vector store integration
- Context enrichment algorithms
- Multi-faceted filtering systems
- Fusion retrieval mechanisms
- Intelligent reranking
- Query transformation
- Hierarchical indexing
- HyDE implementation
- Dynamic chunk sizing
- Semantic chunking
- Context compression
- Explainable retrieval
- Graph RAG implementation
- RAPTOR integration
- Retrieval with feedback loops
- Adaptive retrieval systems
- Iterative retrieval mechanisms
- Ensemble retrieval implementation
- Multi-modal integration
- Self RAG optimization
- Corrective RAG systems
- 🦙 RAG Orchestration: LlamaIndex
- 🔍 Vector Database: Qdrant
- 👁️ Observability: Arize Phoenix
- 📊 Evaluation: RAGAS & DeepEval
We welcome contributions! Please see our contributing guidelines for more information.
This project builds upon research and implementations from various sources. See our acknowledgments section for detailed credits.
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ for the RAG community