Unstructured Data in Graph Database for Question-Answer Retrieval Augmented Generation

This repo loads unstructured data from the web, converts it into graph form and indexes it in a graph database, then queries the database with a generated Cypher statement to produce an answer.

Technologies

Diffbot NLP API - Graph Construction
Neo4jGraph - Graph DB
OpenAI - LLM for QA, Cypher and RAG
LangChain - Framework to build apps with LLMs
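
As a rough illustration of how these four pieces can fit together, here is a minimal sketch. It is not taken from run.py: the function name `build_and_query`, its parameters, and the choice of `WikipediaLoader` are assumptions, and the LangChain import paths and arguments shown vary between releases, so check them against your installed version.

```python
import os


def build_and_query(wikipedia_query: str, user_query: str, model: str, max_docs: int = 5) -> str:
    """Hypothetical helper: load pages, build a graph, then answer one question."""
    # Imports live inside the function so this module can be imported
    # without the optional packages installed.
    from langchain.chains import GraphCypherQAChain
    from langchain_community.document_loaders import WikipediaLoader
    from langchain_community.graphs import Neo4jGraph
    from langchain_experimental.graph_transformers.diffbot import DiffbotGraphTransformer
    from langchain_openai import ChatOpenAI

    # 1. Load unstructured text from the web (Wikipedia is assumed here).
    raw_documents = WikipediaLoader(query=wikipedia_query, load_max_docs=max_docs).load()

    # 2. Ask the Diffbot NLP API to extract entities and relationships.
    transformer = DiffbotGraphTransformer(diffbot_api_key=os.environ["DIFFBOT_KEY"])
    graph_documents = transformer.convert_to_graph_documents(raw_documents)

    # 3. Store the extracted graph in Neo4j and refresh the cached schema.
    graph = Neo4jGraph(
        url=os.environ["NEO4J_URI"],
        username=os.environ["NEO4J_USERNAME"],
        password=os.environ["NEO4J_PASSWORD"],
    )
    graph.add_graph_documents(graph_documents)
    graph.refresh_schema()

    # 4. Let the LLM write a Cypher query, run it, and phrase the answer.
    #    ChatOpenAI reads OPENAI_API_KEY from the environment.
    llm = ChatOpenAI(model=model, temperature=0)
    chain = GraphCypherQAChain.from_llm(
        cypher_llm=llm,
        qa_llm=llm,
        graph=graph,
        verbose=True,
        # Newer LangChain releases require this opt-in; older ones may reject it.
        allow_dangerous_requests=True,
    )
    return chain.invoke({"query": user_query})["result"]
```

Using one model for both Cypher generation and answering keeps the sketch short. A real setup might pass a stronger model as `cypher_llm` and a cheaper one as `qa_llm`.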

QA RAG Chain Logs (output when running the repo)

[Question]
[Initial Prompt]
[Cypher Graph Search]
[Search Result]
[Final Prompt]
[Answer]
[Tokens]
[Time]

Prerequisites

Run

In the root of this repo, create a .env file with the keys below, replacing each [your-value] with your own value:

OPENAI_API_KEY=[your-value]
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY=[your-value]
NEO4J_URI=[your-value]
NEO4J_USERNAME=[your-value]
NEO4J_PASSWORD=[your-value]
DIFFBOT_KEY=[your-value]
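
Before running, it can help to confirm that every placeholder has been filled in. The following is a minimal, standard-library-only sketch. The helper names `parse_env` and `missing_keys` are hypothetical and not part of this repo, which may well load the file another way (for example with python-dotenv).

```python
# Keys from the .env template above that need a user-supplied value.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "LANGCHAIN_API_KEY",
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "DIFFBOT_KEY",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; skip blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return required keys that are absent, empty, or still a placeholder."""
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or values[key] == "[your-value]"
    ]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list once every required key has a real value.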

In run.py, adjust the wikipedia_query and user_query variables to your preference.

From the root of this repo, run the following in your terminal:

python run.py