Commit 47a1046

Implement cluster support (#70)
## What is the goal of this PR?

We add CLI options that allow TypeDB Loader to connect to TypeDB Cluster.

## What are the changes implemented in this PR?

* Reformat README
* Add Cluster support
1 parent eb82705 commit 47a1046

14 files changed, with 169 additions and 91 deletions

README.md

Lines changed: 98 additions & 63 deletions
@@ -1,77 +1,94 @@
1-
2-
31
![TypeDBLoader_icon](https://github.com/bayer-science-for-a-better-life/grami/blob/master/typedbloader.png?raw=true)
42
---
53
---
6-
###
4+
5+
###
6+
77
[![TypeDB Loader Test](https://github.com/bayer-science-for-a-better-life/grami/actions/workflows/testandbuild.yaml/badge.svg)](https://github.com/bayer-science-for-a-better-life/grami/actions/workflows/testandbuild.yaml)
88
[![TypeDB Loader Build](https://github.com/bayer-science-for-a-better-life/grami/actions/workflows/release.yaml/badge.svg)](https://github.com/bayer-science-for-a-better-life/grami/actions/workflows/release.yaml)
9+
910
###
1011

1112
---
1213

1314
If your [TypeDB](https://github.com/vaticle/typedb) project
14-
- has a lot of data
15-
- and you want/need to focus on schema design, inference, and querying
1615

17-
Use TypeDB Loader to take care of your data migration for you. TypeDB Loader streams data from files and migrates them into TypeDB **at scale**!
18-
16+
- has a lot of data
17+
- and you want/need to focus on schema design, inference, and querying
18+
19+
Use TypeDB Loader to take care of your data migration for you. TypeDB Loader streams data from files and migrates them
20+
into TypeDB **at scale**!
21+
1922
## Features:
20-
- Data Input:
23+
24+
- Data Input:
2125
- data is streamed to reduce memory requirements
2226
- supports any tabular data file with your separator of choice (i.e.: csv, tsv, whatever-sv...)
2327
- supports gzipped files
2428
- ignores unnecessary columns
25-
- [Attribute](https://github.com/typedb-osi/typedb-loader/wiki/02-Loading-Attributes), [Entity](https://github.com/typedb-osi/typedb-loader/wiki/03-Loading-Entities), [Relation](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations) Loading:
29+
- [Attribute](https://github.com/typedb-osi/typedb-loader/wiki/02-Loading-Attributes), [Entity](https://github.com/typedb-osi/typedb-loader/wiki/03-Loading-Entities), [Relation](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations)
30+
Loading:
2631
- load required/optional attributes of any TypeDB type (string, boolean, long, double, datetime)
2732
- load required/optional role players (attribute / entity / relation)
28-
- load list-like attribute columns as n attributes (recommended procedure until attribute lists are fully supported by TypeDB)
33+
- load list-like attribute columns as n attributes (recommended procedure until attribute lists are fully supported
34+
by TypeDB)
2935
- load list-like player columns as n players for a relation
3036
- load entity if not present - if present, either do not write or append attributes
31-
- [Appending Attributes](https://github.com/typedb-osi/typedb-loader/wiki/05-Appending-Attributes) to existing things
32-
- [Append-Attribute-Or-Insert-Entity](https://github.com/typedb-osi/typedb-loader/wiki/06-Append-Or-Insert) for entities
33-
- Data Validation:
34-
- validate input data rows and log issues for easy diagnosis input data-related issues (i.e. missing attributes/players, invalid characters...)
35-
- Configuration Validation:
36-
- write your configuration with confidence: warnings will display useful information for fine tuning, errors will let you know what you forgot. All BEFORE the database is touched.
37-
- Performance:
38-
- parallelized asynchronous writes to TypeDB to make the most of your hardware configuration, optimized with engineers @vaticle
39-
- Stop/Restart (in re-implementation, currently NOT available):
37+
- [Appending Attributes](https://github.com/typedb-osi/typedb-loader/wiki/05-Appending-Attributes) to existing things
38+
- [Append-Attribute-Or-Insert-Entity](https://github.com/typedb-osi/typedb-loader/wiki/06-Append-Or-Insert) for entities
39+
- Data Validation:
40+
- validate input data rows and log issues for easy diagnosis input data-related issues (i.e. missing
41+
attributes/players, invalid characters...)
42+
- Configuration Validation:
43+
- write your configuration with confidence: warnings will display useful information for fine tuning, errors will
44+
let you know what you forgot. All BEFORE the database is touched.
45+
- Performance:
46+
- parallelized asynchronous writes to TypeDB to make the most of your hardware configuration, optimized with
47+
engineers @vaticle
48+
- Stop/Restart (in re-implementation, currently NOT available):
4049
- tracking of your migration status to stop/restart, or restart after failure
4150

42-
- [Basic Column Preprocessing using RegEx's](https://github.com/typedb-osi/typedb-loader/wiki/08-Preprocessing)
51+
- [Basic Column Preprocessing using RegEx's](https://github.com/typedb-osi/typedb-loader/wiki/08-Preprocessing)
4352

44-
Create a Loading Configuration ([example](https://github.com/typedb-osi/typedb-loader/blob/master/src/test/resources/phoneCalls/config.json)) and use TypeDB Loader
45-
- as an [executable CLI](https://github.com/typedb-osi/typedb-loader/wiki/10-TypeDB-Loader-as-Executable-CLI) - no coding
46-
- in [your own Java project](https://github.com/typedb-osi/typedb-loader/wiki/09-TypeDB-Loader-as-Dependency) - easy API
53+
Create a Loading
54+
Configuration ([example](https://github.com/typedb-osi/typedb-loader/blob/master/src/test/resources/phoneCalls/config.json))
55+
and use TypeDB Loader
56+
57+
- as an [executable CLI](https://github.com/typedb-osi/typedb-loader/wiki/10-TypeDB-Loader-as-Executable-CLI) - no
58+
coding
59+
- in [your own Java project](https://github.com/typedb-osi/typedb-loader/wiki/09-TypeDB-Loader-as-Dependency) - easy API
4760

4861
## How it works:
4962

50-
To illustrate how to use TypeDB Loader, we will use a slightly extended version of the "phone-calls" example [dataset](https://github.com/typedb-osi/typedb-loader/tree/master/src/test/resources/phoneCalls) and [schema](https://github.com/typedb-osi/typedb-loader/blob/master/src/test/resources/phoneCalls/schema.gql) from the TypeDB developer documentation:
63+
To illustrate how to use TypeDB Loader, we will use a slightly extended version of the "phone-calls"
64+
example [dataset](https://github.com/typedb-osi/typedb-loader/tree/master/src/test/resources/phoneCalls)
65+
and [schema](https://github.com/typedb-osi/typedb-loader/blob/master/src/test/resources/phoneCalls/schema.gql) from the
66+
TypeDB developer documentation:
5167

5268
### Configuration
5369

5470
The configuration file tells TypeDB Loader what things you want to insert for each of your data files and how to do it.
5571

5672
Here are some example:
5773

58-
- [Attribute Examples](https://github.com/typedb-osi/typedb-loader/wiki/02-Loading-Attributes)
59-
- [Entity Examples](https://github.com/typedb-osi/typedb-loader/wiki/03-Loading-Entities)
60-
- [Relation Examples](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations)
61-
- [Nested Relation - Match by Attribute(s) Example](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations#loading-relations-with-entityrelation-players-matched-on-attribute-ownerships-incl-nested-relations)
62-
- [Nested Relation - Match by Player(s) Example](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations#loading-relations-relation-players-matching-on-players-in-playing-relation-incl-nested-relations)
63-
- [Attribute-Player Relation Example](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations#loading-relations-with-attribute-players)
64-
- [Custom Migration Order Example](https://github.com/typedb-osi/typedb-loader/wiki/07-Custom-Load-Order)
74+
- [Attribute Examples](https://github.com/typedb-osi/typedb-loader/wiki/02-Loading-Attributes)
75+
- [Entity Examples](https://github.com/typedb-osi/typedb-loader/wiki/03-Loading-Entities)
76+
- [Relation Examples](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations)
77+
- [Nested Relation - Match by Attribute(s) Example](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations#loading-relations-with-entityrelation-players-matched-on-attribute-ownerships-incl-nested-relations)
78+
- [Nested Relation - Match by Player(s) Example](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations#loading-relations-relation-players-matching-on-players-in-playing-relation-incl-nested-relations)
79+
- [Attribute-Player Relation Example](https://github.com/typedb-osi/typedb-loader/wiki/04-Loading-Relations#loading-relations-with-attribute-players)
80+
- [Custom Migration Order Example](https://github.com/typedb-osi/typedb-loader/wiki/07-Custom-Load-Order)
6581

6682
For detailed documentation, please refer to the [WIKI](https://github.com/bayer-science-for-a-better-life/grami/wiki).
6783

68-
The [config](https://github.com/typedb-osi/typedb-loader/tree/master/src/test/resources/phoneCalls/config.json) in the phone-calls test is a good starting example of a configuration.
84+
The [config](https://github.com/typedb-osi/typedb-loader/tree/master/src/test/resources/phoneCalls/config.json) in the
85+
phone-calls test is a good starting example of a configuration.
6986

7087
### Migrate Data
7188

7289
Once your configuration files are complete, you can use TypeDB Loader in one of two ways:
7390

74-
1. As an executable command line interface - no coding required:
91+
1. As an executable command line interface - no coding required:
7592

7693
```Shell
7794
./bin/typedbloader load \
@@ -83,65 +100,83 @@ Once your configuration files are complete, you can use TypeDB Loader in one of
83100

84101
[See details here](https://github.com/typedb-osi/typedb-loader/wiki/10-TypeDB-Loader-as-Executable-CLI)
85102

86-
2. As a dependency in your own Java code:
103+
2. As a dependency in your own Java code:
87104

88105
```Java
89106
public class LoadingData {
90107

91-
public void loadData() {
92-
String uri = "localhost:1729";
93-
String config = "path/to/your/config.json";
94-
String database = "databaseName";
95-
96-
String[] args = {
97-
"load",
98-
"-tdb", uri,
99-
"-c", config,
100-
"-db", database,
101-
"-cm"
102-
};
103-
104-
LoadOptions options = LoadOptions.parse(args);
105-
TypeDBLoader loader = new TypeDBLoader(options);
106-
loader.load();
107-
}
108+
public void loadData() {
109+
String uri = "localhost:1729";
110+
String config = "path/to/your/config.json";
111+
String database = "databaseName";
112+
113+
String[] args = {
114+
"load",
115+
"-tdb", uri,
116+
"-c", config,
117+
"-db", database,
118+
"-cm"
119+
};
120+
121+
LoadOptions options = LoadOptions.parse(args);
122+
TypeDBLoader loader = new TypeDBLoader(options);
123+
loader.load();
124+
}
108125
}
109126
```
110127

111128
[See details here](https://github.com/typedb-osi/typedb-loader/wiki/09-TypeDB-Loader-as-Dependency)
112129

113-
114130
## Step-by-Step Tutorial
115131

116132
A complete tutorial for TypeDB version >= 2.5.0 is in work and will be published asap.
117133

118-
An example of configuration and usage of TypeDB Loader on real data can be found [in the TypeDB Examples](https://github.com/vaticle/typedb-examples/tree/master/biology/catalogue_of_life).
134+
An example of configuration and usage of TypeDB Loader on real data can be
135+
found [in the TypeDB Examples](https://github.com/vaticle/typedb-examples/tree/master/biology/catalogue_of_life).
119136

120-
A complete tutorial for TypeDB (Grakn) version < 2.0 can be found [on Medium](https://medium.com/@hkuich/introducing-grami-a-data-migration-tool-for-grakn-d4051582f867).
137+
A complete tutorial for TypeDB (Grakn) version < 2.0 can be
138+
found [on Medium](https://medium.com/@hkuich/introducing-grami-a-data-migration-tool-for-grakn-d4051582f867).
121139

122140
There is an [example repository](https://github.com/bayer-science-for-a-better-life/grami-example) for your convenience.
123141

142+
## Connecting to TypeDB Cluster
143+
144+
To connect to TypeDB Cluster, a set of options is provided:
145+
```
146+
--typedb-cluster=<address:port>
147+
--username=<username>
148+
--password // can be asked for interactively
149+
--tls-enabled
150+
--tls-root-ca=<path/to/CA/cert>
151+
```
152+
124153
## Compatibility Table
125154

126-
| TypeDB Loader | TypeDB Client (internal) | TypeDB | TypeDB Cluster |
127-
| :------------: | :---------------------: | :-------------: | :------------: |
128-
| 1.1.0 to 1.2.0 | 2.8.0 | 2.8.x | N/A |
129-
| 1.0.0 | 2.5.0 to 2.7.1 | 2.5.x to 2.7.x | N/A |
130-
| 0.1.1 | 2.0.0 to 2.5.0 | 2.0.x to 2.4.x | N/A |
131-
| <0.1 | 1.8.0 | 1.8.x | N/A |
155+
Ranges are [inclusive, exclusive).
156+
157+
| TypeDB Loader | TypeDB Client (internal) | TypeDB | TypeDB Cluster |
158+
|:--------------:|:------------------------:|:---------------:|:--------------:|
159+
| 1.6.0 | 2.14.0 | 2.14.x | 2.14.x |
160+
| 1.2.0 to 1.6.0 | 2.8.0 - 2.14.0 | 2.8.0 to 2.14.0 | N/A |
161+
| 1.1.0 to 1.2.0 | 2.8.0 | 2.8.x | N/A |
162+
| 1.0.0 | 2.5.0 to 2.7.1 | 2.5.x to 2.7.x | N/A |
163+
| 0.1.1 | 2.0.0 to 2.5.0 | 2.0.x to 2.4.x | N/A |
164+
| <0.1 | 1.8.0 | 1.8.x | N/A |
132165

133166
* [Type DB](https://github.com/vaticle/typedb)
134167

135168
Find the Readme for GraMi for grakn < 2.0 [here](https://github.com/bayer-science-for-a-better-life/grami/blob/b3d6d272c409d6c40254354027b49f90b255e1c3/README.md)
136169

137170
## Contributions
138171

139-
TypeDB Loader was built @[Bayer AG](https://www.bayer.com/) in the Semantic and Knowledge Graph Technology Group with the support of the engineers @[Grakn Labs](https://github.com/orgs/vaticle/people).
172+
TypeDB Loader was built @[Bayer AG](https://www.bayer.com/) in the Semantic and Knowledge Graph Technology Group with
173+
the support of the engineers @[Vaticle](https://github.com/vaticle).
140174

141175
## Licensing
142176

143-
This repository includes software developed at [Bayer AG](https://www.bayer.com/). It is released under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
144-
177+
This repository includes software developed at [Bayer AG](https://www.bayer.com/). It is released under
178+
the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
179+
145180
## Credits
146181

147182
Icon in banner by [Freepik](https://www.freepik.com") from [Flaticon](https://www.flaticon.com/)
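
The Cluster options documented in the README above can presumably be passed programmatically in the same way as the README's `LoadingData` example passes `-tdb`. The following is a minimal sketch that only assembles the argument array; `ClusterArgsExample` and `clusterArgs` are hypothetical names, not part of TypeDB Loader, and the option names are the ones declared in `LoadOptions` in this commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClusterArgsExample {

    // Hypothetical helper: assembles CLI arguments for a TypeDB Cluster connection.
    // tlsRootCaPath may be null; the option is nullable in LoadOptions as well.
    static String[] clusterArgs(String address, String username, String password,
                                String config, String database,
                                boolean tlsEnabled, String tlsRootCaPath) {
        List<String> args = new ArrayList<>(Arrays.asList(
                "load",
                "--typedb-cluster", address,
                "--username", username,
                // Passing the value inline; on a terminal, "--password" without a value
                // is declared interactive, so the user would be prompted instead.
                "--password", password,
                "-c", config,
                "-db", database
        ));
        if (tlsEnabled) {
            args.add("--tls-enabled");
            if (tlsRootCaPath != null) {
                args.add("--tls-root-ca");
                args.add(tlsRootCaPath);
            }
        }
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        String[] args = clusterArgs("localhost:1729", "admin", "password",
                "path/to/your/config.json", "databaseName", false, null);
        // In a real project this array would be handed to LoadOptions.parse(args),
        // as in the README's Java example.
        System.out.println(String.join(" ", args));
    }
}
```

Keeping the TLS flags conditional mirrors the fact that `--tls-enabled` is a boolean switch and `--tls-root-ca` is optional.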

build.gradle

Lines changed: 4 additions & 4 deletions
@@ -5,7 +5,7 @@ plugins {
55
}
66

77
group 'com.vaticle.typedb-osi'
8-
version '1.5.1'
8+
version '1.6.0'
99

1010
repositories {
1111
mavenCentral()
@@ -15,13 +15,13 @@ repositories {
1515
}
1616

1717
dependencies {
18-
implementation("com.vaticle.typedb:typedb-client:2.11.1")
19-
implementation("com.vaticle.typeql:typeql-grammar:2.9.0")
18+
implementation("com.vaticle.typedb:typedb-client:2.14.2")
19+
implementation("com.vaticle.typeql:typeql-grammar:2.14.0")
2020
implementation("com.google.code.gson:gson:2.8.6")
2121
implementation("org.slf4j:slf4j-api:1.7.25")
2222
implementation("org.apache.logging.log4j:log4j-api:2.17.1")
2323
implementation("org.apache.logging.log4j:log4j-core:2.17.1")
24-
implementation("info.picocli:picocli:4.5.1")
24+
implementation("info.picocli:picocli:4.6.1")
2525
implementation("org.apache.commons:commons-csv:1.8")
2626
implementation("commons-io:commons-io:2.8.0")
2727
compileOnly("info.picocli:picocli-codegen:4.5.1")

src/main/java/com/vaticle/typedb/osi/loader/cli/LoadOptions.java

Lines changed: 27 additions & 3 deletions
@@ -18,6 +18,8 @@
1818

1919
import picocli.CommandLine;
2020

21+
import javax.annotation.Nullable;
22+
2123
@CommandLine.Command(name = "load", description = "load data and/or schema", mixinStandardHelpOptions = true)
2224
public class LoadOptions {
2325
@CommandLine.Spec
@@ -26,12 +28,27 @@ public class LoadOptions {
2628
@CommandLine.Option(names = {"-c", "--config"}, description = "config file in JSON format", required = true)
2729
public String dataConfigFilePath;
2830

29-
@CommandLine.Option(names = {"-db", "--database"}, description = "target database in your grakn instance", required = true)
31+
@CommandLine.Option(names = {"-db", "--database"}, description = "target database in your TypeDB instance", required = true)
3032
public String databaseName;
3133

32-
@CommandLine.Option(names = {"-tdb", "--typedb"}, description = "optional - TypeDB server in format: server:port (default: localhost:1729)", defaultValue = "localhost:1729")
34+
@CommandLine.Option(names = {"-tdb", "--typedb"}, description = "Connect to TypeDB Core server in format: server:port (default: localhost:1729)", defaultValue = "localhost:1729")
3335
public String typedbURI;
3436

37+
@CommandLine.Option(names = {"-tdbc", "--typedb-cluster"}, description = "Connect to TypeDB Cluster instead of TypeDB Core. Specify a cluster server with 'server:port'.")
38+
public String typedbClusterURI;
39+
40+
@CommandLine.Option(names = {"--username"}, description = "Username")
41+
public @Nullable String username;
42+
43+
@CommandLine.Option(names = {"--password"}, description = "Password", interactive = true, arity = "0..1")
44+
public @Nullable String password;
45+
46+
@CommandLine.Option(names = {"--tls-enabled"}, description = "Connect to TypeDB Cluster with TLS encryption")
47+
public boolean tlsEnabled;
48+
49+
@CommandLine.Option( names = {"--tls-root-ca"}, description = "Path to the TLS root CA file")
50+
public @Nullable String tlsRootCAPath;
51+
3552
@CommandLine.Option(names = {"-cm", "--cleanMigration"}, description = "optional - delete old schema and data and restart migration from scratch - default: continue previous migration, if exists", defaultValue = "false")
3653
public boolean cleanMigration;
3754

@@ -74,7 +91,14 @@ public void print() {
7491
spec.commandLine().getOut().println("TypeDB Loader started with parameters:");
7592
spec.commandLine().getOut().println("\tconfiguration: " + dataConfigFilePath);
7693
spec.commandLine().getOut().println("\tdatabase name: " + databaseName);
77-
spec.commandLine().getOut().println("\tTypeDB server: " + typedbURI);
94+
if (typedbClusterURI != null) {
95+
spec.commandLine().getOut().println("\tTypeDB cluster URI: " + typedbClusterURI);
96+
spec.commandLine().getOut().println("\tTypeDB cluster username: " + username);
97+
spec.commandLine().getOut().println("\tTypeDB cluster TLS enabled: " + tlsEnabled);
98+
spec.commandLine().getOut().println("\tTypeDB cluster TLS path: " + (tlsRootCAPath == null ? "N/A" : tlsRootCAPath));
99+
} else {
100+
spec.commandLine().getOut().println("\tTypeDB server URI: " + typedbURI);
101+
}
78102
spec.commandLine().getOut().println("\tdelete database and all data in it for a clean new migration?: " + cleanMigration);
79103
spec.commandLine().getOut().println("\treload schema (if not doing clean migration): " + loadSchema);
80104
}

src/main/java/com/vaticle/typedb/osi/loader/loader/TypeDBLoader.java

Lines changed: 2 additions & 2 deletions
@@ -42,7 +42,7 @@ public TypeDBLoader(LoadOptions options) {
4242

4343
public void load() {
4444
Util.info("validating your config...");
45-
TypeDBClient schemaClient = TypeDBUtil.getClient(options.typedbURI);
45+
TypeDBClient schemaClient = TypeDBUtil.getClient(options);
4646
ConfigurationValidation cv = new ConfigurationValidation(dc);
4747
HashMap<String, ArrayList<String>> validationReport = new HashMap<>();
4848
ArrayList<String> errors = new ArrayList<>();
@@ -83,7 +83,7 @@ public void load() {
8383
Instant start = Instant.now();
8484
try {
8585
AsyncLoaderWorker asyncLoaderWorker = null;
86-
try (TypeDBClient client = TypeDB.coreClient(options.typedbURI, Runtime.getRuntime().availableProcessors())) {
86+
try (TypeDBClient client = TypeDBUtil.getClient(options)) {
8787
Runtime.getRuntime().addShutdownHook(
8888
NamedThreadFactory.create(AsyncLoaderWorker.class, "shutdown").newThread(client::close)
8989
);
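
The diff for `TypeDBUtil.getClient(options)` is not shown here, so the following is only a sketch of the selection logic it would need, modelled on the `typedbClusterURI != null` check that `LoadOptions.print()` uses in this commit. `ClientSelectionSketch`, `Options`, and `describeTarget` are hypothetical stand-ins so that the example runs without the TypeDB client library; requiring a username and password for Cluster is an assumption of this sketch, not something the commit states.

```java
public class ClientSelectionSketch {

    // Stand-in for the connection-related fields of LoadOptions.
    static class Options {
        String typedbURI = "localhost:1729"; // default of --typedb
        String typedbClusterURI;             // null unless --typedb-cluster is given
        String username;
        String password;
        boolean tlsEnabled;
        String tlsRootCAPath;                // may be null
    }

    // Returns a description of the connection target instead of a real client.
    static String describeTarget(Options o) {
        if (o.typedbClusterURI != null) {
            // Assumption: a Cluster connection needs credentials.
            if (o.username == null || o.password == null) {
                throw new IllegalArgumentException(
                        "--username and --password are needed with --typedb-cluster");
            }
            return "cluster:" + o.typedbClusterURI + " (tls=" + o.tlsEnabled + ")";
        }
        // No cluster address given: fall back to the Core server address.
        return "core:" + o.typedbURI;
    }

    public static void main(String[] args) {
        Options core = new Options();
        System.out.println(describeTarget(core));

        Options cluster = new Options();
        cluster.typedbClusterURI = "cluster-host:1729";
        cluster.username = "admin";
        cluster.password = "password";
        System.out.println(describeTarget(cluster));
    }
}
```

Routing both the validation client and the loading client through one such function, as the two changed call sites in `TypeDBLoader.load()` do, keeps the Core/Cluster decision in a single place.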
