Curator AI is a Retrieval-Augmented Generation (RAG) platform built on Endee, a fast open-source vector database. It turns static documents into a searchable, private knowledge base with a polished, theme-aware UI.
- Live Demo: Live
Ask complex questions about your private documents. The system retrieves the most relevant context from Endee to provide accurate, fact-grounded answers.
- Visual OCR Support: Integrated Gemini Vision reads handwritten notes or scanned PDFs.
- Stateful Memory: Maintains context for fluid, multi-turn technical conversations.
- Sleek Glassmorphism UI: A modern, theme-aware interface that adapts to Light and Dark modes.
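The stateful memory feature above can be pictured as a bounded buffer of recent turns. The following is a minimal sketch, not the project's code: `ConversationMemory` and its methods are hypothetical names introduced here for illustration.

```python
from collections import deque


class ConversationMemory:
    """Hypothetical helper: keeps only the most recent chat turns."""

    def __init__(self, max_turns: int = 5):
        # Each turn is a (question, answer) pair; the oldest turn is
        # dropped automatically once max_turns is exceeded.
        self.turns = deque(maxlen=max_turns)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def as_text(self) -> str:
        # Render the history as a plain transcript for the LLM prompt.
        lines = []
        for question, answer in self.turns:
            lines.append(f"User: {question}")
            lines.append(f"Assistant: {answer}")
        return "\n".join(lines)


memory = ConversationMemory(max_turns=2)
memory.add("What is Endee?", "A vector database.")
memory.add("How many dimensions?", "384.")
memory.add("Which LLM?", "Gemini.")
print(memory.as_text())
```

Capping the history keeps the prompt size predictable. A real app would likely hold the buffer in Streamlit's session state so that it survives reruns.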
Gain deep insights into your knowledge base performance and user curiosity.
- Real-time Velocity Tracking: Monitor query volume and retrieval speeds via dynamic area charts.
- Topic Clustering: Automatically surface the most-queried topics (for example, "Market Analysis").
- Document Health: Track ingestion statistics and growth metrics (for example, +12% vs. last week).
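As a rough illustration of how metrics like these could be persisted and summarized, here is a minimal sketch. The JSON schema (`queries`, `topic`, `latency_ms`) and the function names are assumptions made for this example; the actual layout of `stats.json` is not documented here.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def record_query(stats_path: Path, topic: str, latency_ms: float) -> dict:
    """Append one query event to a JSON stats file (hypothetical schema)."""
    if stats_path.exists():
        stats = json.loads(stats_path.read_text())
    else:
        stats = {"queries": []}
    stats["queries"].append({"topic": topic, "latency_ms": latency_ms})
    stats_path.write_text(json.dumps(stats, indent=2))
    return stats


def summarize(stats: dict) -> dict:
    """Compute query volume, mean latency, and the most-queried topics."""
    queries = stats["queries"]
    topics = Counter(q["topic"] for q in queries)
    total_latency = sum(q["latency_ms"] for q in queries)
    return {
        "total_queries": len(queries),
        "mean_latency_ms": total_latency / len(queries) if queries else 0.0,
        "top_topics": topics.most_common(3),
    }


# Demo against a throwaway file rather than the real stats.json.
demo_path = Path(tempfile.mkdtemp()) / "stats_demo.json"
events = [("Market Analysis", 40.0), ("Pricing", 60.0), ("Market Analysis", 50.0)]
for topic, latency in events:
    stats = record_query(demo_path, topic, latency)
summary = summarize(stats)
print(summary)
```

Counting topic labels is only a stand-in for real topic clustering, which would group queries by embedding similarity rather than by an explicit label.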
| Step | Functionality | Powered By |
|---|---|---|
| 1. Text Extraction | Parses PDF, MD, and Text (including Vision OCR) | PyMuPDF + Gemini-Flash |
| 2. Vectorization | Converts text into 384-dim semantic embeddings | Sentence-Transformers |
| 3. Vector Storage | Blazing-fast indexing and similarity search | Endee Vector Database |
| 4. Retrieval | Finds the top context chunks for any query | Endee.query() |
| 5. Generation | Generates professional, grounded answers | Google Gemini 2.0-Flash |
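The chunking, vectorization, and retrieval steps in the table can be sketched end to end with the standard library alone. This is a toy illustration under stated assumptions: `embed` uses word counts as a stand-in for a real 384-dim embedding model, and `top_k` does an in-memory cosine search instead of calling Endee. None of these function names come from the project.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by cosine similarity to the query and return the best k."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


document = (
    "Endee stores vectors for similarity search. "
    "Gemini generates the final answer from retrieved context. "
    "Streamlit renders the chat interface and analytics charts."
)
chunks = chunk_text(document)
best = top_k("Which component stores vectors?", chunks, k=1)
print(best)
```

Overlapping windows reduce the chance that a sentence relevant to the query is split across two chunks and missed by both.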
graph TD
A["Documents (PDF/MD/TXT)"] --> B["Python Extraction / Vision OCR"]
B --> C["Chunking & Embedding"]
C -->|384-dim vectors| E[("⚡ Endee Vector Store")]
F["User Question"] --> G["Semantic Embedding"]
G -->|Query vector| E
E -->|"Top Context Matches"| H["LLM Prompt Assembly"]
F --> H
H --> I["Google Gemini LLM"]
I --> J["Fact-Grounded Answer"]
style E fill:#1a1a2e,stroke:#e94560,stroke-width:2px,color:#fff
style J fill:#0f3460,stroke:#16213e,stroke-width:2px,color:#fff
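The "LLM Prompt Assembly" node in the diagram combines the user question with the retrieved chunks. A minimal sketch of that step follows. `build_prompt` and the instruction wording are illustrative assumptions, not the prompt the project actually sends to Gemini.

```python
def build_prompt(question: str, context_chunks: list[str], history: str = "") -> str:
    """Assemble a grounded prompt from retrieved context (illustrative wording)."""
    # Number the chunks so an answer can refer back to its sources.
    numbered = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    parts = [
        "Answer using only the context below. If the context is insufficient, say so.",
        f"Context:\n{numbered}",
    ]
    if history:
        # Optional multi-turn history, included only when present.
        parts.append(f"Conversation so far:\n{history}")
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)


prompt = build_prompt(
    "What dimension are the embeddings?",
    ["Text is converted into 384-dim semantic embeddings.", "Endee stores the vectors."],
)
print(prompt)
```

Placing the question last keeps it adjacent to where the model begins generating. Telling the model to admit insufficient context is one common way to discourage ungrounded answers.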
git clone https://github.com/Vikash9546/endee.git
cd endee/assignment
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

You can run Endee via Docker or by building the source:
# Via Docker
docker run -p 8080:8080 endeeio/endee-server:latest

Ensure your .env file contains your GEMINI_API_KEY, then start the app:
streamlit run app.py

- assignment/app.py: Main entry point (State management & Analytics logic).
- assignment/ui.py: Premium Styling Engine (Glassmorphism & Theme detection).
- assignment/logic.py: Core RAG Pipeline (Endee Indexing, Chunking, LLM).
- assignment/assets/: Branding and profile media.
- assignment/stats.json: Persistent analytics storage.
Curator AI is optimized for Streamlit Cloud and Railway.
- Database (Railway): Deploy endeeio/endee-server and set PORT to 8080.
- Frontend (Streamlit Cloud): Connect your repo and set GEMINI_API_KEY and NDD_URL as secrets.
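The two settings named above can be validated once at startup so that a missing secret fails early with a clear message. This is a minimal sketch: `load_settings` is a hypothetical helper, and the fallback to `http://localhost:8080` is an assumption based on the local Docker port mapping, not documented project behavior.

```python
import os


def load_settings(env=os.environ) -> dict:
    """Read deployment settings and fail early if the API key is missing."""
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; add it to .env or your platform secrets."
        )
    # Assumption: fall back to the local Docker port mapping when NDD_URL is unset.
    endee_url = env.get("NDD_URL", "http://localhost:8080").rstrip("/")
    return {"gemini_api_key": api_key, "endee_url": endee_url}


# Demo with an explicit mapping instead of the real process environment.
settings = load_settings(
    {"GEMINI_API_KEY": "demo-key", "NDD_URL": "http://localhost:8080/"}
)
print(settings["endee_url"])
```

Passing the environment as a parameter makes the function easy to test without touching real secrets.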