Retrieval-Augmented Generation Chatbot using Ollama, Langchain and Gradio
This notebook is a proof of concept showing how to build a retrieval-augmented generation (RAG) chatbot using Ollama, Langchain and Gradio. The chatbot is built from the following components:
- Ollama is used as the backend to host large language models and provides an API to interact with them.
- Langchain is used to split the provided markdown files into chunks and to embed them using Ollama. The embeddings are stored in a Chroma database.
- Gradio provides a simple chat interface to interact with the RAG chatbot.
- Ollama to host the language models.
  - Installation instructions can be found in the Ollama GitHub Repository.
- Miniconda or another conda distribution (optional but recommended).
  - Installation instructions can be found in the Conda Documentation.
- Poetry to install the required Python packages (optional but recommended).
  - Installation instructions can be found in the Poetry Documentation.
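The chunking step described above can be illustrated with a minimal pure-Python sketch. The function name, chunk size, and overlap below are illustrative assumptions; the notebook itself relies on Langchain's text splitters for this:

```python
def chunk_markdown(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks, roughly as a character-based
    text splitter would. Illustrative only; the notebook uses Langchain."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


doc = "# Title\n" + "Some markdown content. " * 100
pieces = chunk_markdown(doc)
print(f"{len(pieces)} chunks, first chunk {len(pieces[0])} chars")
```

Each chunk is later embedded individually, so the overlap helps keep sentences that straddle a chunk boundary retrievable from either side.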
- Pull the desired model for Ollama and start the Ollama backend using the following commands:

```shell
# change model to the desired model name -> see https://ollama.com/library for other models
ollama pull llama2:chat
ollama serve
```
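Once the backend is running, you can check that it is reachable over its REST API; by default Ollama listens on `http://localhost:11434`, and `GET /api/tags` lists the locally pulled models. The helper below is an illustrative sketch (function names are my own, not part of the notebook):

```python
import json
import urllib.request


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response,
    which has the shape {"models": [{"name": "...", ...}, ...]}."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Query a running Ollama backend for its locally available models."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return parse_model_names(json.load(resp))
```

For example, confirming `"llama2:chat" in list_local_models()` before running the notebook avoids a confusing failure later in the embedding step.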
- Create and activate a virtual environment using conda:

```shell
# create env
conda create -n open_rag_chat python=3.11
# activate env
conda activate open_rag_chat
```
- Install the required packages using poetry:

```shell
poetry install
```
- On the first run, set `initial_db = True`. This will create new embeddings for the provided markdown files and create a new Chroma DB in the given path (`DATA_PATH = "data/"`).
- Drop your own markdown files in the `data/` folder.
- Run the notebook.
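The `initial_db` flag effectively chooses between building a fresh vector store and loading a persisted one. A small stdlib-only sketch of that decision is shown below; the helper name and return values are illustrative, and the actual building/loading is done with Langchain's Chroma integration in the notebook:

```python
from pathlib import Path


def resolve_db_action(initial_db: bool, chroma_path: str) -> str:
    """Return "build" when embeddings must be (re)created, "load" otherwise.

    "build" corresponds to embedding the chunked markdown files into a new
    Chroma DB; "load" corresponds to opening the already-persisted DB.
    """
    if initial_db or not Path(chroma_path).exists():
        return "build"
    return "load"
```

Note that a rebuild is also forced when the DB path does not exist yet, so forgetting to flip the flag back to `False` only costs time, while a missing DB never causes a load failure.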
- Ollama: Ollama GitHub Repository
- Langchain: Langchain Documentation
- Gradio: Gradio Chatinterface Documentation
- Similar Project: Langchain RAG Tutorial by pixegami