This project is a PDF chatbot application built with Chainlit, LangChain, and Chroma. Users upload PDF and plain text documents, the app splits them into chunks, and the documents can then be queried conversationally. Chunks are embedded with SentenceTransformers and retrieved from Chroma, and ChatGroq generates the answers while citing the relevant source documents.
- File Upload Support: Users can upload PDF and plain text files to interact with their content.
- Chunk Processing: Large documents are split into manageable chunks for efficient search and retrieval.
- Vector Search with Chroma: Documents are embedded using SentenceTransformers and stored in Chroma, a fast and scalable vector store.
- Conversational QA: Powered by ChatGroq, users can ask questions in natural language and receive detailed responses with cited sources.
- Real-Time Document Updates: Uploaded files are dynamically processed and added to the document corpus.
- Streamed Responses: Answers are streamed for an engaging user experience.
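To make the chunk processing feature concrete, here is a minimal sketch of overlapping chunking in plain Python. It is a simplified stand-in, not the project's code: the app uses LangChain's RecursiveCharacterTextSplitter, which also tries to break on separators such as paragraphs and sentences. The function name `chunk_text` and the default sizes are assumptions made for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; the real app delegates
    chunking to RecursiveCharacterTextSplitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means that a sentence cut at a chunk boundary still appears in full in at least one neighbouring chunk, which tends to help retrieval.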
- Chainlit: For building and deploying the chatbot interface.
- LangChain: To manage document loaders, text splitting, and retrieval chains.
- SentenceTransformers: For generating embeddings of document text and user queries.
- Chroma: A high-performance vector database for storing and searching document embeddings.
- ChatGroq: LangChain's chat model integration for Groq-hosted LLMs, used to generate responses to user queries.
- RecursiveCharacterTextSplitter: For chunking large documents.
- PyPDFLoader: For extracting content from PDF files.
- TextLoader: For loading plain text files.
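The way embeddings and a vector store fit together can be illustrated with a toy example. The sketch below is not SentenceTransformers or Chroma: it swaps in word counts as the "embedding" and a plain list as the "store", purely to show the idea of ranking chunks by cosine similarity to the query. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    chunks = [
        "the cat sat on the mat",
        "stock prices fell sharply",
        "a cat chased the mouse",
    ]
    for score, chunk in top_k("cat mat", chunks):
        print(f"{score:.2f}  {chunk}")
```

In the real app the retrieved chunks would be passed to the language model as context, along with their source metadata so the answer can cite them.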
- Python 3.8+
- pip: Package manager for Python.
- Clone the repository:
  git clone https://github.com/oss-bit/PDF-Chat.git
  cd PDF-Chat
- Install the required packages:
  pip install -r requirements.txt
- Set up API keys for ChatGroq:
  - Replace the placeholder qroq_api_key in the code with your actual API key.
- Start the chatbot:
  chainlit run main.py -w
- Open the browser interface (usually at http://localhost:8000).
- Upload a document (PDF or text) using the clip icon.
- Ask questions about your uploaded documents, and get real-time answers with cited sources.
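For the API key step, one alternative to editing the placeholder in the source is to read the key from an environment variable, which keeps the secret out of version control. This is a sketch of that approach, not how the repository currently works. The helper name `load_groq_api_key` is hypothetical, and the variable name `GROQ_API_KEY` is an assumption.

```python
import os


def load_groq_api_key(env_var: str = "GROQ_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper; the variable name is an assumption, not something
    the project defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the app"
        )
    return key
```

With this in place, the key would be exported in the shell before launching the app, and the value returned by the helper would be used wherever the placeholder currently appears.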
- Loads files and splits them into chunks for easier embedding and retrieval.
- Processes the documents and stores them in a Chroma vector store.
- Initializes the chatbot with a welcome message.
- Handles user interactions, processes uploaded documents, and answers queries.
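The file-loading step has to decide which loader handles an uploaded file. A minimal sketch of that decision, assuming dispatch on the file extension, might look like this. The mapping and the function name `loader_name_for` are hypothetical and return loader names as strings, so the example runs without LangChain installed.

```python
from pathlib import Path

# Assumed mapping from file extension to the LangChain loader named in this README.
SUPPORTED_LOADERS = {
    ".pdf": "PyPDFLoader",
    ".txt": "TextLoader",
}


def loader_name_for(path: str) -> str:
    """Pick a loader name from the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_LOADERS:
        supported = ", ".join(sorted(SUPPORTED_LOADERS))
        raise ValueError(f"Unsupported file type {suffix!r}; expected one of: {supported}")
    return SUPPORTED_LOADERS[suffix]
```

Rejecting unknown extensions early gives the user a clear error message instead of a failure deeper in the chunking or embedding steps.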
- Upload a File: Drag and drop your PDF or text file into the chatbot interface.
- Ask Questions: Type in queries like:
- "What is the main topic of the document?"
- "Provide details from section X of the file."
- Receive Answers: The bot responds with detailed answers and references to the source document.
Feel free to open issues or submit pull requests. Contributions are welcome!
- Fork the repository.
- Create a feature branch.
- Commit your changes and submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
Enjoy seamless interaction with your documents! 🚀