An intelligent documentation retrieval system that uses large language models (LLMs) and vector search to provide accurate answers to questions about any web content or documentation.
- URL Content Processing: Extracts and processes content from any web URL
- Intelligent Text Chunking: Splits content into meaningful chunks while preserving context
- Vector Search: Uses Chroma DB for efficient similarity search
- LLM Integration: Powered by Ollama for natural language understanding
- Web Interface: Simple and intuitive UI for asking questions
- Session Management: Maintains separate contexts for different users
- Python 3.10 or higher
- Ollama server running locally or remotely
- Required Python packages (see Installation)
- Clone the repository:
git clone <repository-url>
cd ChatwithURL
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
- Install the required packages:
pip install -r requirements.txt
.
├── main.py            # FastAPI application entry point
├── utils.py           # Core processing functions
├── static/
│   └── index.html     # Web interface
├── requirements.txt   # Project dependencies
└── README.md
- Set up your Ollama server URL or add your LLM in `utils.py`:
base_url = 'https://your-ollama-server.com'
- Configure the vector store directory in `utils.py`:
VECTORSTORE_DIR = "path/to/vectorstore"
- Start the FastAPI server:
python main.py
- Open your web browser and navigate to:
http://localhost:8000
- In the web interface:
- Enter a URL to process
- Wait for content processing to complete
- Ask questions about the content
- Receive AI-powered answers based on the content
- `POST /set_url/`: Process a new URL

  { "url": "https://example.com", "session_id": "unique-session-id" }

- `POST /ask_question/`: Ask a question about the processed content

  { "question": "What is this article about?", "session_id": "unique-session-id" }
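For example, both endpoints can be exercised from Python with `requests` (a sketch; the exact response shape is assumed, not documented here):

```python
import requests

BASE_URL = "http://localhost:8000"
session = {"session_id": "unique-session-id"}

# 1. Register a URL so its content gets fetched and indexed.
resp = requests.post(f"{BASE_URL}/set_url/", json={"url": "https://example.com", **session})
resp.raise_for_status()

# 2. Ask a question once processing has completed.
resp = requests.post(
    f"{BASE_URL}/ask_question/",
    json={"question": "What is this article about?", **session},
)
print(resp.json())
```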
The web interface provides:
- URL input field
- Question input field
- Response display area
- Session management
- Error handling and status messages
The system handles various error cases:
- Invalid URLs
- Unreachable content
- Processing failures
- LLM server issues
- Session management errors
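As an illustration, the first two cases might be surfaced as HTTP errors in the FastAPI layer; `fetch_url` below is a hypothetical helper, not necessarily what `utils.py` defines:

```python
import requests
from fastapi import HTTPException

def fetch_url(url: str) -> str:
    """Hypothetical helper: fetch a page or raise a descriptive HTTP error."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except (requests.exceptions.MissingSchema, requests.exceptions.InvalidURL):
        # The URL is malformed (e.g. missing http://).
        raise HTTPException(status_code=400, detail="Invalid URL")
    except requests.exceptions.RequestException as exc:
        # The URL is valid but the content could not be retrieved.
        raise HTTPException(status_code=502, detail=f"Could not fetch content: {exc}")
    return response.text
```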
- Fetches the URL and extracts the content
- Splits the text using RecursiveCharacterTextSplitter
- Preserves document structure
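A minimal sketch of this fetch-and-split step, assuming `requests` and BeautifulSoup for extraction (the actual logic lives in `utils.py` and may differ):

```python
import requests
from bs4 import BeautifulSoup
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Fetch the page and strip the HTML down to visible text.
html = requests.get("https://example.com", timeout=10).text
text = BeautifulSoup(html, "html.parser").get_text(separator="\n")

# Overlapping chunks help preserve context across chunk boundaries.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(text)
```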
- Uses Chroma DB for vector storage
- HuggingFace embeddings (all-MiniLM-L6-v2)
- Efficient similarity search
- Persistent storage
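The vector store could be assembled roughly like this (a sketch, not the exact code in `utils.py`):

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# `chunks` is the list of strings produced by the splitting step above.
vectorstore = Chroma.from_texts(
    texts=chunks,
    embedding=embeddings,
    persist_directory="path/to/vectorstore",  # corresponds to VECTORSTORE_DIR
)

# Similarity search returns the chunks most relevant to the question.
docs = vectorstore.similarity_search("What is this article about?", k=4)
```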
- Uses ChatBedrock for natural language processing
- Context-aware responses
- Source attribution
- Confidence scoring
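A sketch of how the retrieval chain might be wired with LangChain's RetrievalQA; an Ollama LLM is shown here, but a ChatBedrock instance would slot in the same way, and all names are illustrative:

```python
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA

llm = Ollama(base_url="https://your-ollama-server.com", model="llama3")  # model name is a guess

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(),  # the Chroma store from the previous sketch
    return_source_documents=True,          # enables source attribution in responses
)

result = qa_chain.invoke({"query": "What is this article about?"})
print(result["result"])
```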
- Anthropic models for LLM capabilities
- LangChain for the chain infrastructure
- ChromaDB for vector storage
- HuggingFace for embeddings