Yet Another Retrieval Script
The main goal of this specific branch of YARS is to share a lightweight but powerful open-source base that lets people talk to an SQL database and find information directly by asking questions in natural language. The core components of this chatbot are the Ollama and LangChain libraries, both free to use! The main advantage of this implementation is that you can choose from the list of LLMs provided by Ollama. You are in control of which models you want to run locally!
YARS will continue to grow with more functionality, with the target audience being the scientific community. I'll write more about my personal short- and long-term goals of this chatbot soon :)
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
-
YARS is developed with Python, so we start there. It is assumed that you have a working version of Python on your system. If not, I recommend following the instructions from the Python website. There are also thousands of tutorials on the web; one I recommend is The Hitchhiker's Guide to Python. Go for Python 3.8 or newer!
Verify that Python is running with:
python --version
The output should show the version of Python installed on your system. Also verify that pip, the package installer for Python, is installed.
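The same checks can also be done from inside Python. This is only a small sketch: the helper names are made up for illustration, and the minimum version is the 3.8 recommended above.

```python
import importlib.util
import sys

MIN_VERSION = (3, 8)  # minimum version recommended in this README


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def pip_is_available():
    """Return True if this interpreter can find the pip module."""
    return importlib.util.find_spec("pip") is not None


print("Python version:", sys.version.split()[0])
print("Recent enough:", python_is_recent_enough())
print("pip available:", pip_is_available())
```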
-
You'll need to have the Ollama server running on your machine:
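A quick way to check that the server is up is to send it a plain HTTP request. The sketch below assumes Ollama listens on its default address (http://127.0.0.1:11434); if you changed the host or port, adapt the URL. The function name is hypothetical and not part of YARS.

```python
import http.client
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default address and port.
OLLAMA_URL = "http://127.0.0.1:11434"


def ollama_is_running(base_url=OLLAMA_URL, timeout=2.0):
    """Return True if an HTTP server answers at base_url with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed URL or non-2xx answer.
        return False


if ollama_is_running():
    print("Ollama server is reachable.")
else:
    print("Ollama server is not reachable; start it before running YARS.")
```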
-
We will be working with PostgreSQL as the main database server for this example. Using other database flavours will require you to check and adapt the creation of the SQLDatabase instance with the right URI syntax for your database. You will also need to edit the SQL queries inside the utils/sql_examples.json file so that they follow the syntax of your SQL dialect.
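To illustrate what "the right URI syntax" means, here is a minimal sketch that builds SQLAlchemy-style connection URIs. The credentials, the helper name and the chosen drivers (psycopg2, pymysql) are assumptions for illustration; YARS itself may build the URI differently.

```python
from urllib.parse import quote_plus


def build_db_uri(user, password, host, port, database, driver="postgresql+psycopg2"):
    """Build a connection URI, escaping special characters in the credentials."""
    return f"{driver}://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


# Hypothetical credentials, for illustration only.
postgres_uri = build_db_uri("yars", "p@ss word", "localhost", 5432, "chinook")
mysql_uri = build_db_uri("yars", "secret", "localhost", 3306, "chinook", driver="mysql+pymysql")
print(postgres_uri)
print(mysql_uri)

# With LangChain installed, the URI would then be handed over, e.g.:
#   from langchain_community.utilities import SQLDatabase
#   db = SQLDatabase.from_uri(postgres_uri)
```

Only the scheme part (before `://`) and usually the port change between database flavours; the rest of the URI keeps the same shape.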
In this example we'll be using the Chinook database. The Chinook database can be recreated locally by downloading the respective SQL script from here. In case you want to work with a different database you will need to carefully curate the examples of question + SQL query inside the utils/sql_examples.json file for better accuracy in the interaction with the LLM.
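When curating your own examples, a small sanity check can catch incomplete entries early. The sketch below assumes the file is a JSON list of objects with "question" and "query" keys; the real layout of utils/sql_examples.json may differ, so treat the key names and the sample queries as hypothetical.

```python
import json

REQUIRED_KEYS = ("question", "query")  # hypothetical key names


def find_invalid_examples(examples):
    """Return the indices of entries lacking a non-empty question or query."""
    invalid = []
    for index, entry in enumerate(examples):
        if not isinstance(entry, dict):
            invalid.append(index)
            continue
        if any(not str(entry.get(key) or "").strip() for key in REQUIRED_KEYS):
            invalid.append(index)
    return invalid


# Stand-in for the contents of the examples file.
sample_text = json.dumps([
    {"question": "How many artists are there?", "query": "SELECT COUNT(*) FROM artist;"},
    {"question": "Which albums did AC/DC release?", "query": ""},
])
examples = json.loads(sample_text)
print(find_invalid_examples(examples))  # the second entry has an empty query
```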
-
First, download the repository as a ZIP file or (assuming you have already installed Git) just open a terminal and
git clone
it. Go inside the YARS folder. I recommend working in a virtual environment; create one and activate it before installing the requirements:
python -m venv .venv
-
Activate the environment:
Windows:
.\.venv\Scripts\activate
Linux:
source ./.venv/bin/activate
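If you are not sure whether the activation worked, you can ask the interpreter itself. A minimal sketch, relying on the fact that inside a venv `sys.prefix` differs from `sys.base_prefix`:

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
print("Interpreter prefix:", sys.prefix)
```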
-
All the required dependencies are listed inside the requirements.txt file. To install them just run:
python -m pip install -r requirements.txt
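After the installation you can confirm that the core libraries are really there. This sketch assumes that "ollama" and "langchain" are the distribution names used in requirements.txt; adjust the list to whatever the file actually contains.

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Assumption: these are the distribution names of the core dependencies.
core = ["ollama", "langchain"]
print("Missing core packages:", missing_packages(core))
```

An empty list means everything in `core` was found in the active environment.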
-
For the time being, the generation of images works with ReCraft, which is one of the best, fastest and cheapest text-to-image generators at the moment. Once you create your profile, you'll have to create an API key to communicate with them. In case you prefer to use a different image generator, you'll have to modify the client when initializing the Assistant class and the way you generate images inside the respond method in assistant.py.
Calling an external API implies that you won't be completely offline, so my TODO list will include a way of generating images from a prompt locally :) This option of course requires a powerful machine to be fast and reliable.
-
Finally, the file utils/.env_example contains the variables necessary to connect to your database. You should either edit that file and rename it to utils/.env, or create a new file named utils/.env with the correct values.
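For reference, a .env file is just a list of KEY=VALUE lines. The sketch below parses such content with the standard library only. The variable names are hypothetical; use the ones that actually appear in utils/.env_example. A dedicated package such as python-dotenv handles more edge cases than this simple parser.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


# Hypothetical variable names, standing in for the contents of utils/.env.
sample = """
# database connection
DB_USER=yars
DB_PASSWORD="secret"
DB_HOST=localhost
DB_PORT=5432
DB_NAME=chinook
"""
settings = parse_env(sample)
print(settings["DB_HOST"], settings["DB_PORT"])
```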
- To start interacting with your databases, just run the app.py script inside a terminal:
python app.py [-h]
- The script will serve the app on your localhost at port 7860 (http://127.0.0.1:7860), assuming you work with Gradio's default values. Just open your favorite browser and enter that address in the address bar. Follow the instructions and enjoy!