diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc
index 35f2876cb..05f0db466 100644
--- a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc
+++ b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc
@@ -27,10 +27,12 @@ This will help distribute the data across multiple servers in a Neo4j cluster.
 If you are creating the property shards on a self-managed server, the server that executes the `neo4j-admin database import` command must have sufficient storage space available for all of the property shards that will be created.
 ====

+==== Import using S3
+
 The following example shows how to import a set of CSV files, back them up to S3 using the `--target-location` and `--target-format` options, and then create a database using those seeds in S3.

 . Using the `neo4j-admin database import` command, import data into the `foo-sharded` database, creating one graph shard and three property shards.
-If the process is running on the same server as another Neo4j DBMS process, the latter must be stopped.
+If the `neo4j-admin` process is running on the same server as a Neo4j DBMS process, the Neo4j DBMS process must be stopped.
 The `--target-location` and `--target-format` options take the outputs of the import, turn them into uncompressed backups, and upload them to a location ready to be seeded from.
 +
 [source, shell]
@@ -50,6 +52,37 @@ OPTIONS {
 };
 ----

+[role=label--new-2025.12]
+==== Import using local file system
+You can import data into a Neo4j cluster that has no access to any cloud.
+
+. Using the `neo4j-admin database import` command, import data into the `foo-sharded` database, creating one graph shard and three property shards.
+If the `neo4j-admin` process is running on the same server as a Neo4j DBMS process, the Neo4j DBMS process must be stopped.
++
+[source, shell]
+----
+neo4j-admin database import full foo-sharded --nodes=nodes.csv --nodes=movies.csv --relationships=relationships.csv --input-type=csv --property-shard-count=3 --schema=schema.cypher
+----
+
+. Using allowed and denied databases, allocate a single shard to each server in the cluster.
+See xref:clustering/databases.adoc#cluster-allow-deny-db[Controlling locations with allowed/denied databases].
+. Move the produced backups from the local file system of the machine used for the import to the servers hosting each of the shards.
+Each server should have one backup, and the backups must reside in the same path on each server.
+. On each server, update the _neo4j.conf_ file to include the correct settings for file seeding as outlined in xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI].
+. Create the database `foo-sharded` as a sharded property database by seeding it from your backups in the servers' file systems:
++
+[source, cypher]
+----
+CREATE DATABASE `foo-sharded`
+DEFAULT LANGUAGE CYPHER 25
+PROPERTY SHARDS { COUNT 3 }
+OPTIONS {
+  seedUri: 'file:/backups/', seedOptions: 'NO_CHECK'
+};
+----
+
+In this context, `NO_CHECK` prevents the seeding process from verifying that all backups are present on all servers.
+
 The cluster automatically distributes the data across its servers.
 For more information on seed providers, see xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI].
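
Review note on the "Move the produced backups" step: the page does not show how the backups reach the servers, so a short example might help readers. Below is a minimal dry-run sketch, assuming `scp` is available and that each server receives exactly one backup in the same directory. The function name `plan_copy`, the host names, the destination directory, and the backup file names are all hypothetical placeholders, not names produced by `neo4j-admin`.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not run) one scp command per server.
# Host names, paths, and backup file names are hypothetical placeholders.

# plan_copy DEST_DIR SERVERS_CSV BACKUP...
# Pairs the i-th backup with the i-th server, so every server gets exactly
# one backup, placed in the same DEST_DIR on each server.
plan_copy() {
  local dest_dir="$1" servers_csv="$2"
  shift 2
  local -a servers
  IFS=',' read -r -a servers <<< "$servers_csv"

  # The procedure expects one backup per server; refuse anything else.
  if [ "$#" -ne "${#servers[@]}" ]; then
    echo "expected exactly one backup per server" >&2
    return 1
  fi

  local i=0 backup
  for backup in "$@"; do
    echo "scp $backup ${servers[$i]}:$dest_dir/"
    i=$((i + 1))
  done
}
```

For example, `plan_copy /backups server1,server2 shard-a.backup shard-b.backup` prints two `scp` commands, one per server, which can be inspected before being run by hand or piped to a shell.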