(open)Cypher implementation of the LDBC SNB BI benchmark.
The data set must be generated and preprocessed before it is loaded into the database. To generate it, use the CSVComposite
serializer classes of the DATAGEN project:
ldbc.snb.datagen.serializer.personSerializer:ldbc.snb.datagen.serializer.snb.interactive.CSVCompositePersonSerializer
ldbc.snb.datagen.serializer.invariantSerializer:ldbc.snb.datagen.serializer.snb.interactive.CSVCompositeInvariantSerializer
ldbc.snb.datagen.serializer.personActivitySerializer:ldbc.snb.datagen.serializer.snb.interactive.CSVCompositePersonActivitySerializer
Go to the load-scripts/ directory.
Set the following environment variables appropriately:
export NEO4J_HOME=/path/to/the/neo4j/dir
export NEO4J_DB_DIR=$NEO4J_HOME/data/databases/graph.db
export NEO4J_DATA_DIR=/path/to/the/csv/files
export POSTFIX=_0_0.csv
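Before running the scripts, it can help to confirm that all four variables are actually set. The following is a minimal sketch; `check_env` is a hypothetical helper and is not part of the repository.

```shell
# Hypothetical helper (not part of the repository): report which of the
# required variables are unset or empty.
check_env() {
  missing=0
  for var in NEO4J_HOME NEO4J_DB_DIR NEO4J_DATA_DIR POSTFIX; do
    # Look up the variable whose name is stored in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_env && echo "environment looks fine"` prints the confirmation only when every variable has a value.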
The CSV files require a bit of preprocessing:
- replace headers with Neo4j-compatible ones
- replace labels (e.g. change city to City)
- convert date and datetime formats
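To illustrate the first two preprocessing steps, here is a rough sketch for a single pipe-separated file. The column layout (`id|name|url|type`), the header text, and the function name `convert_place_csv` are assumptions made for this example; they are not taken from convert-csvs.sh, which may work differently. Date conversion is left out.

```shell
# Sketch only: the column layout (id|name|url|type) and the header text
# are assumptions, not taken from convert-csvs.sh.
convert_place_csv() {
  awk -F'|' 'BEGIN { OFS = "|" }
    # Replace the first line with a Neo4j-style import header.
    NR == 1 { print "id:ID(Place)|name|url|:LABEL"; next }
    # Capitalize the first letter of the label column (city becomes City).
    { $4 = toupper(substr($4, 1, 1)) substr($4, 2); print }
  ' "$1"
}
```

The function writes the converted file to standard output, so the original CSV stays untouched until the result is redirected somewhere.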
The following script takes care of those steps:
./convert-csvs.sh
Be careful: the following commands delete all data in your database, import the SNB data set, and restart the database.
./delete-neo4j-database.sh
./import-to-neo4j.sh
./restart-neo4j.sh
If you know what you're doing, you can run all scripts with a single command:
./load-in-one-step.sh
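A one-step loader of this kind typically chains the individual scripts and stops at the first failure. The sketch below shows that pattern under stated assumptions: `run_steps` is a hypothetical function, and the actual contents of load-in-one-step.sh may differ.

```shell
# Hypothetical sketch; the real load-in-one-step.sh may differ.
# Runs each given command in order and stops at the first one that fails.
run_steps() {
  for step in "$@"; do
    echo "running $step"
    "$step" || { echo "failed: $step" >&2; return 1; }
  done
}

# Possible usage from the load-scripts/ directory:
# run_steps ./convert-csvs.sh ./delete-neo4j-database.sh ./import-to-neo4j.sh ./restart-neo4j.sh
```

Stopping early matters here because importing after a failed conversion would load incomplete data into a freshly deleted database.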