Commit 12c8b0e

Support Nebula Graph
1 parent 1c53c71 commit 12c8b0e

Lines changed: 69 additions & 0 deletions

- Feature Name: Support Nebula Graph
- Start Date: 2022-04-16
- RFC PR: [amundsen-io/rfcs#48](https://github.com/amundsen-io/rfcs/pull/48)
- Amundsen Issue: [amundsen-io/amundsen#1816](https://github.com/amundsen-io/amundsen/issues/1816)

# Nebula Graph Support

## Summary

This RFC adds Nebula Graph support in two places: data builder components for Nebula Graph, and a new Nebula Graph proxy in the metadata service.

## Motivation

Metadata can now be published to Nebula Graph from the Amundsen data builder. This RFC makes the metadata retrieval API in the metadata service work with Nebula Graph as well.

## Guide-level Explanation (aka Product Details)

The goal of this RFC is to add loaders, publishers, and serializers to the library suite so that Nebula Graph is supported.

## UI/UX-level Explanation

N/A

## Reference-level Explanation (aka Technical Details)

A new proxy for Nebula Graph is added to support Nebula Graph in the metadata service.

To support Nebula Graph in the data builder, several new components are added:

- Nebula Extractor
- Nebula Search Data Extractor
- Nebula CSV Loader
- Nebula CSV Publisher
- Nebula Serializer
- Nebula Sample Data Loader

### Nebula Graph Schema handling

These components work much like their Neo4j counterparts (Nebula Graph even speaks a dialect of openCypher), with one difference: unlike Neo4j, Nebula Graph is not schemaless. A label (called a tag in Nebula Graph) or an edge type must be created before it is referenced in a query.

Instead of having the user maintain a versioned schema (which would introduce extra interfaces), the proposed design parses the schema from the data being published. The Nebula CSV Publisher checks the schema and applies DDL changes automatically when needed, so the schema has a single source of truth: the data model of the data builder.

This requires the user to run the Nebula data builder sample script to initialize the Nebula Graph schema; this requirement is open to discussion and could be revisited.

### Nebula Graph Index

Another difference between Nebula Graph and Neo4j concerns how an openCypher `MATCH` query finds its starting point. If no `key` (called the Vertex ID in Nebula Graph) is provided, an index on the label (tag) must be created, or a `LIMIT` clause must be added. This applies when the query has no conditions at all, as in `MATCH (n: Table) return n`, or when the only condition on the starting point is a property, as in `MATCH (n: Table{category: "foo"}) return n`. [Here](https://siwei.io/en/nebula-index-explained/) is a post where I explain why it is designed this way.

The Nebula Publisher now creates the indexes required by the data models that the metadata service API queries; these are listed in `NEBULA_INDEX_TAG_FIELDS`.

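To make the index requirement concrete, here is a minimal sketch that turns a tag-to-fields mapping into nGQL index statements. The real shape of `NEBULA_INDEX_TAG_FIELDS` is not shown in this RFC, so the mapping format, the index naming scheme, and the `index_statements` helper are all hypothetical. The sketch also assumes every indexed field is a variable-length string, which is why each field is given a prefix length.

```python
from typing import Dict, List


def index_statements(
    tag_fields: Dict[str, List[str]], string_len: int = 256
) -> List[str]:
    """Build CREATE/REBUILD index statements for each tag.

    `tag_fields` maps a tag name to the properties to index. An empty list
    produces an index on the tag itself, which is enough for queries such
    as `MATCH (n: Table) return n`.
    """
    statements: List[str] = []
    for tag, fields in tag_fields.items():
        # Hypothetical naming scheme: <tag>_<field...>_index
        index_name = "_".join([tag.lower(), *fields, "index"])
        # Assumes string properties, which need a length when indexed.
        cols = ", ".join(f"`{field}`({string_len})" for field in fields)
        statements.append(
            f"CREATE TAG INDEX IF NOT EXISTS `{index_name}` ON `{tag}`({cols})"
        )
        # Rebuilding makes vertices inserted before the index searchable.
        statements.append(f"REBUILD TAG INDEX `{index_name}`")
    return statements
```

For example, `index_statements({"Table": ["name"]})` yields one `CREATE TAG INDEX` statement followed by one `REBUILD TAG INDEX` statement for the `Table` tag.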
## Drawbacks

The RFC adds support for another datastore, which brings in additional components and increases the code size of the repo.

## Alternatives

Instead of automatic schema adaptation in the Nebula Publisher, one alternative is to add an extra interface or utility for managing the Nebula Graph schema.

## Prior art

N/A

## Unresolved questions

N/A

## Future possibilities

None so far.
