My thesis project at Seavus is a question-answering system built on Neo4j that processes large volumes of unstructured text and serves results for natural language processing (NLP) tasks ranging from keyword extraction to sentiment analysis. These tasks are made possible in Neo4j through the GraphAware NLP framework.
For this framework to function properly, you first need to add a few JAR plugins from both GraphAware and Stanford CoreNLP. You will also want to download the APOC procedure library, which is freely available for Neo4j and required for some of the NLP tasks as well.
Requirements:
- Neo4j 3.5.14 (or earlier)
- graphaware-server-all
- nlp
- nlp-stanfordnlp
- stanford-english-corenlp
- apoc
Once the above plugins are placed in NEO4J_HOME/plugins/, these lines are required in the neo4j.conf file in NEO4J_HOME/conf/:
dbms.unmanaged_extension_classes=com.graphaware.server=/graphaware
com.graphaware.runtime.enabled=true
dbms.security.procedures.whitelist=ga.nlp.*,apoc.*
dbms.security.procedures.unrestricted=ga.nlp.*,apoc.*
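After restarting Neo4j with these settings, you can sanity-check that the GraphAware NLP and APOC procedures are registered. A minimal sketch from Python, assuming the official neo4j driver and the example credentials shown further below:

from neo4j import GraphDatabase

# List registered procedures and keep the GraphAware NLP and APOC ones.
# The URI and credentials are assumptions; adjust to your installation.
driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'gdb'))
with driver.session() as session:
    names = session.run(
        "CALL dbms.procedures() YIELD name "
        "WHERE name STARTS WITH 'ga.nlp' OR name STARTS WITH 'apoc' "
        "RETURN name"
    )
    for record in names:
        print(record['name'])
driver.close()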
You will also need to allocate an appropriate heap size and page cache for Neo4j:
dbms.memory.heap.initial_size=3000m
dbms.memory.heap.max_size=5000m
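The page cache mentioned above is configured with dbms.memory.pagecache.size; the value below is only an assumption, so size it to your data store:

dbms.memory.pagecache.size=4g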
To connect Python to Neo4j, change the credentials in text_processor.py and query_pipeline.py to match your setup.
Example:
uri = 'bolt://localhost:7687'
username = 'neo4j'
password = 'gdb'
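These values are handed to the official neo4j Python driver. A minimal sketch of the connection step, which may differ from the exact wiring inside text_processor.py:

from neo4j import GraphDatabase

uri = 'bolt://localhost:7687'
username = 'neo4j'
password = 'gdb'

# The credentials above map directly onto the driver constructor;
# how text_processor.py wires this internally may differ.
driver = GraphDatabase.driver(uri, auth=(username, password))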
The BBC dataset used in these experiments was taken from the examples here. The articles come as archives and can be processed with text_processor.py, which feeds the news articles into the graph database and defines the schema of the knowledge graph. There are additional methods to call for enrichment, keyword extraction, and text summarization.
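Under the hood, the annotation step in GraphAware NLP is the ga.nlp.annotate procedure, which builds the annotated subgraph for each article. A hypothetical sketch of that call from Python, assuming a text-processing pipeline has already been registered; the actual method names and node labels in text_processor.py may differ:

from neo4j import GraphDatabase

driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'gdb'))

def annotate_article(tx, article_id, text):
    # ga.nlp.annotate yields an AnnotatedText node that is linked back
    # to the source node; the Article label here is an assumption.
    tx.run(
        "MATCH (a:Article) WHERE id(a) = $id "
        "CALL ga.nlp.annotate({text: $text, id: id(a)}) "
        "YIELD result "
        "MERGE (a)-[:HAS_ANNOTATED_TEXT]->(result)",
        id=article_id, text=text)

with driver.session() as session:
    session.write_transaction(annotate_article, 0, 'Example article text.')
driver.close()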
After the text is processed in Neo4j, simply test out the demo_pipeline with the query_pipeline in the same folder: