ai/RAGs/vector_unstructured at main · ryanmeinzer/ai

Unstructured Data in Vector Database for Question-Answer Retrieval Augmented Generation

This repo loads unstructured data from the web, splits it into chunks, indexes the chunks in a vector database, and then queries the database for embeddings semantically similar to the question in order to generate an answer.
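The split, index, retrieve, and prompt steps can be illustrated with a minimal, dependency-free sketch. This is not the code in run.py: the bag-of-words `embed` function is a toy stand-in for OpenAI embeddings, the in-memory list is a stand-in for Neo4jVector, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text):
    """Split text into sentence-sized chunks (a stand-in for a real text splitter)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def embed(text):
    """Toy embedding: bag-of-words counts (assumption: replaces a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Store each chunk next to its embedding (stand-in for a vector database)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, question, k=1):
    """Return the k chunks whose embeddings are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the final prompt that would be sent to the LLM."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    text = (
        "Neo4j is a graph database. "
        "Paris is the capital of France. "
        "LangChain is a framework for LLM apps."
    )
    question = "What is the capital of France?"
    index = build_index(split_text(text))
    print(build_prompt(question, retrieve(index, question)))
```

In a real pipeline, the retrieved chunks and the question would be passed to an LLM to produce the answer; the sketch stops at the final prompt so that it runs without API keys.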

Technologies

Neo4jVector - Vector DB
OpenAI - LLM for QA, Vector Embeddings and RAG
LangChain - Framework to build apps with LLMs

QA RAG Chain Logs (output when running the repo)

[Question]
[Initial Prompt]
[Question into Embedding]
[Retrieved Similar Embeddings]
[Search Result]
[Final Prompt]
[Answer]
[Tokens]
[Time]

Prerequisites

Sign Up for Neo4j Aura DB
Sign Up for OpenAI
Sign Up for LangSmith

Run

In the root of this repo, create a .env file with the keys below, replacing each [your-value] with your own value:

OPENAI_API_KEY=[your-value]
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY=[your-value]
NEO4J_URI=[your-value]
NEO4J_USERNAME=[your-value]
NEO4J_PASSWORD=[your-value]
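Before running, it can help to confirm that every key is present and no placeholder was left in place. The following is a minimal sketch using only the standard library; the helper names `parse_env` and `missing_keys` are hypothetical, and how run.py itself loads the .env file is not shown here.

```python
from pathlib import Path

# Keys listed in this README's .env example.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "LANGCHAIN_TRACING_V2",
    "LANGCHAIN_ENDPOINT",
    "LANGCHAIN_API_KEY",
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return required keys that are absent, empty, or still a placeholder."""
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or values[key] == "[your-value]"
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(values)
    if missing:
        print("Missing or unset keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```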

In run.py, adjust the wikipedia_query and user_query variables to your preference.

From the root of this repo, run in the CLI:

python run.py
