This is a simple RAG (retrieval-augmented generation) service that runs everything locally, using Vespa or OpenSearch as the vector store and an Ollama model for generation.
NOTE: The OpenSearch implementation is still a work in progress and is not yet ready to use.
The default setup reads epub books from the books directory for the RAG. Copy your favorite epub books in there, with filenames ending in .epub.
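A quick way to stage books is a sketch like the one below; SRC is a placeholder for wherever your epub files actually live, not a path this repo defines:

```shell
# Sketch: stage .epub files into the books/ directory the ingester reads.
# SRC is a placeholder: point it at wherever your epub files actually live.
SRC="${SRC:-$HOME/Downloads}"
mkdir -p books
# Only filenames ending in .epub are picked up.
find "$SRC" -maxdepth 1 -name '*.epub' -exec cp {} books/ \; 2>/dev/null || true
ls books
```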
Install ollama
The default configuration uses the mistral:7b model; pull it with:
ollama pull mistral:7b
To list your local ollama models:
ollama list
# For more details on the models do:
curl -s localhost:11434/api/tags | jq .
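To pull just the model names out of the /api/tags response, a jq filter like this works; a canned response is used here so the snippet runs without a live server, so replace the echo with `curl -s localhost:11434/api/tags`:

```shell
# Extract the model names from an /api/tags-style response.
# The echo stands in for: curl -s localhost:11434/api/tags
echo '{"models":[{"name":"mistral:7b"},{"name":"llama3:8b"}]}' \
  | jq -r '.models[].name'
```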
You need to start a Vespa version 8 cluster:
docker run --detach \
--name vespa \
--hostname vespa-tutorial \
--publish 8080:8080 \
--publish 19071:19071 \
--publish 19092:19092 \
--publish 19050:19050 \
vespaengine/vespa:8
Note: publishing port 19050 is not strictly necessary, but it serves a useful status page for the Vespa cluster once your Vespa doc types are in place.
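Before deploying, you can probe the config server's health endpoint on port 19071 (published above); this is a sketch that prints "up" or "down" rather than blocking, so it is safe to run even while the container is still starting:

```shell
# Sketch: check whether the Vespa config server (deploy port 19071) is ready.
vespa_config_up() {
  curl -sf --max-time 2 http://localhost:19071/state/v1/health >/dev/null 2>&1 \
    && echo up || echo down
}
vespa_config_up
```

Deploy only once it reports up; `vespa deploy --wait` will also wait on its own.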
Install the vespa-cli if needed:
brew install vespa-cli
Run from the root of this repo:
vespa deploy --wait 300 vespa
If you used the above docker command to expose the 19050 port then you can monitor the Cluster status on this page: http://127.0.0.1:19050/clustercontroller-status/v1/llm
To kill (and delete all data from) the Vespa cluster:
docker rm -f vespa
# Delete all books
curl -X DELETE \
"http://localhost:8080/document/v1/embeddings/books/docid?selection=true&cluster=llm"
# Delete all news
curl -X DELETE \
"http://localhost:8080/document/v1/embeddings/news/docid?selection=true&cluster=llm"
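The two delete-all calls above differ only in the document type, so a small helper can build the URL; the "embeddings" namespace and "llm" cluster name are taken from the commands above:

```shell
# Sketch: build the document/v1 delete-all URL for a given doc type
# in the "embeddings" namespace and "llm" content cluster.
delete_all_url() {
  echo "http://localhost:8080/document/v1/embeddings/$1/docid?selection=true&cluster=llm"
}
delete_all_url books
# To actually delete: curl -X DELETE "$(delete_all_url books)"
```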
Follow the instructions to set up a single-node OpenSearch server with Docker.
Using docker-compose:
cd opensearch
wget https://raw.githubusercontent.com/opensearch-project/documentation-website/2.12/assets/examples/docker-compose.yml
# Setup your admin password
echo "OPENSEARCH_INITIAL_ADMIN_PASSWORD=$OPENSEARCH_INITIAL_ADMIN_PASSWORD" > .env
# Start the containers as detached daemons:
docker-compose up -d
Check that OpenSearch is up and running:
curl -ku "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" https://localhost:9200
# If the docker containers do not start then check the server logs:
docker logs opensearch-node1
Things that might go wrong above:
- The admin password is not strong enough: set the OPENSEARCH_INITIAL_ADMIN_PASSWORD environment variable to a strong password, as OpenSearch will not start otherwise.
- The host's sysctl limits are not set.
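On the sysctl side, OpenSearch in Docker needs vm.max_map_count of at least 262144 on the host; this sketch checks the current value (raise it with `sudo sysctl -w vm.max_map_count=262144`):

```shell
# Sketch: verify the host's vm.max_map_count meets OpenSearch's minimum.
current="$(sysctl -n vm.max_map_count 2>/dev/null || echo 0)"
if [ "$current" -ge 262144 ]; then
  echo "vm.max_map_count ok: $current"
else
  echo "vm.max_map_count too low: $current"
fi
```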
Open http://localhost:5601 and log in as admin with the OPENSEARCH_INITIAL_ADMIN_PASSWORD you set above.
Make sure the configuration is set to the vector store and model you want to use, then build:
mvn clean compile package
# Populate the Vector store
./target/langchain4j-local-rag-sample-0.0.1-assembly/bin/rag-sample-create-embeddings.sh
# Chat
./target/langchain4j-local-rag-sample-0.0.1-assembly/bin/rag-sample-cli.sh
# Start GRPC server
./target/langchain4j-local-rag-sample-0.0.1-assembly/bin/rag-sample-grpc-service.sh
# Call the service
grpcurl --plaintext -d '{"question": "What is the Foundation?"}' 127.0.0.1:4242 ragsample.RagSample.Ask
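With the server running, grpcurl can also introspect the service; this sketch assumes gRPC server reflection is enabled (otherwise point grpcurl at the .proto file with -proto) and degrades to a hint when the server is down:

```shell
# Sketch: list the services the gRPC server exposes, if it is reachable.
if msg="$(grpcurl --plaintext 127.0.0.1:4242 list 2>/dev/null)"; then
  echo "$msg"
else
  msg="server not running; start rag-sample-grpc-service.sh first"
  echo "$msg"
fi
```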
Some alternative prompt templates for the configuration:
prompt.template = """You are a helpful assistant, conversing with a user about the subjects contained in a set of documents.
Use the information from the DOCUMENTS section to provide accurate answers. If unsure or if the answer
isn't found in the DOCUMENTS section, simply state that you don't know the answer.
QUESTION:
{{userMessage}}
DOCUMENTS:
{{contents}}
"""
prompt.template = """Context information is below.
---------------------
{{contents}}
---------------------
Given the context information above and no prior knowledge, provide answers based on the below query.
{{userMessage}}
"""