Gh-534: Initial docs for federated POC #535

Merged
merged 9 commits into from
Oct 18, 2024
@@ -0,0 +1,134 @@
# Simple Federation Graph Access Control

Graphs added to a federated store can have restrictions placed on them in
addition to the standard user controls that may be in place on the data itself.

## Restricting Graph Access

To restrict access to a graph you must add the access controls when the graph
is added to the federated store. Once added to a store a graph's access cannot
be altered without removing and re-adding it.

The restrictions you can apply when adding a graph are listed below. Later
sections on this page provide more detail where needed:

- `owner` - The user ID of the graph's owner. If not specified, this will be the
ID of the user who added the graph. By default the owner does not affect the
restrictions on the graph; the user ID has no additional privileges.
- `isPublic` - Whether the graph is public. A public graph can be read by any
user.
- `readPredicate` - An access control predicate that is checked when
operations are performed to see if the user running the operations can read the
graph.
- `writePredicate` - An access control predicate that is checked when a
user tries to modify the graph. Modification here means editing the configured
graph, such as changing the graph ID or deleting the graph from the store; it
does not affect adding or deleting data inside the graph.

A full example of adding a graph with all these restrictions would look like:

!!! example ""
    === "Java"

        ```java
        // The graph ID matches the one used in the JSON tab
        final String graphId = "myGraph";
        final String graphOwner = "graphOwner";

        final AddGraph operation = new AddGraph.Builder()
            .graphConfig(new GraphConfig(graphId))
            .schema(new Schema())
            .properties(new Properties())
            .owner(graphOwner)
            .isPublic(true)
            // Checked when a user runs operations on the graph
            .readPredicate(new AccessPredicate(
                new DefaultUserPredicate(graphOwner, Arrays.asList("readAuth1", "readAuth2"))))
            // Checked when a user modifies the graph's entry in the store
            .writePredicate(new AccessPredicate(
                new DefaultUserPredicate(graphOwner, Arrays.asList("writeAuth1", "writeAuth2"))))
            .build();
        ```

    === "JSON"

        ```json
        {
            "class": "uk.gov.gchq.gaffer.federated.simple.operation.AddGraph",
            "graphConfig": {
                "graphId": "myGraph"
            },
            "schema": {
                "entities": {},
                "edges": {},
                "types": {}
            },
            "properties": {
                "gaffer.store.class": "uk.gov.gchq.gaffer.accumulostore.AccumuloStore",
                "gaffer.store.properties.class": "uk.gov.gchq.gaffer.accumulostore.AccumuloProperties",
                "gaffer.cache.service.class": "uk.gov.gchq.gaffer.cache.impl.HashMapCacheService"
            },
            "owner": "graphOwner",
            "isPublic": true,
            "readPredicate": {
                "class": "uk.gov.gchq.gaffer.access.predicate.AccessPredicate",
                "userPredicate": {
                    "class": "uk.gov.gchq.gaffer.access.predicate.user.DefaultUserPredicate",
                    "creatingUserId": "graphOwner",
                    "auths": [ "readAuth1", "readAuth2" ]
                }
            },
            "writePredicate": {
                "class": "uk.gov.gchq.gaffer.access.predicate.AccessPredicate",
                "userPredicate": {
                    "class": "uk.gov.gchq.gaffer.access.predicate.user.DefaultUserPredicate",
                    "creatingUserId": "graphOwner",
                    "auths": [ "writeAuth1", "writeAuth2" ]
                }
            }
        }
        ```

## Public and Private Graphs

Graphs added to a federated store can have an `isPublic` field set on them.
This field controls whether the added graph is public, which means all users
can submit requests to the graph through the federated store. A public graph
essentially ignores any read predicate applied to it, on the assumption that
all users can see at least some data in the graph. Even if a graph is public,
restrictions on the data inside it still apply.

If `isPublic` has been set to `false` the graph will be added as private.
A private graph will check the specified read predicate to ensure the user
has access before running a query.
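
The rule described above can be sketched in plain Java. This is only an illustration of the documented behaviour, not the federated store's actual code: `GraphAccess` and `canRead` are hypothetical names, and a user is reduced to a bare user ID string so the sketch compiles without Gaffer on the classpath.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for the access settings stored alongside a graph.
final class GraphAccess {
    private final boolean isPublic;
    private final Predicate<String> readPredicate; // tests a user ID (assumption)

    GraphAccess(final boolean isPublic, final Predicate<String> readPredicate) {
        this.isPublic = isPublic;
        this.readPredicate = readPredicate;
    }

    // A public graph is readable by everyone; a private graph defers to
    // its read predicate.
    boolean canRead(final String userId) {
        return isPublic || readPredicate.test(userId);
    }
}
```

With this sketch, `new GraphAccess(true, id -> false).canRead("anyone")` is `true` because the predicate is never consulted for a public graph.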

!!! note
    A federated store can be configured to disallow any public graphs from
    being added. See the [store properties](./configuration.md#store-properties)
    for more details.

## Read and Write Access

As previously mentioned, read and write access can be applied to graphs added
to federated stores.

!!! warning "Please be aware"
    Reading from a graph is taken to mean running any operation on the
    respective graph, including operations such as `AddElements`. Write
    access to the graph is required for modifying how it is stored in the
    federated store, for example, deleting or renaming the graph.

### Access Control Predicates

To determine if a user has access to read or write, a predicate can be
specified that will be checked before any operation related to the graph is
executed.

All predicates are passed through by specifying them as the `userPredicate` in
the constructor of an `AccessPredicate`. If you wish to write your own
predicate, it must implement Java's [`Predicate<User>`](https://docs.oracle.com/javase/8/docs/api/java/util/function/Predicate.html)
interface. The following default predicates are available:

- `DefaultUserPredicate` - Can be used to define a list of auth strings a user
must have to satisfy the predicate. This will also pass if the user matches the
`creatingUserId` the predicate was initialised with (this does not have to be
the same as the graph owner).
- `NoAccessUserPredicate` - Will always deny any access if used.
- `UnrestrictedAccessUserPredicate` - Will always permit access if used.
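
As a rough sketch of what a custom predicate might look like, the class below passes only when a user holds every required auth. `AllAuthsUserPredicate` is a hypothetical name, and the `User` class here is a minimal stand-in written for this example so that it compiles on its own; a real predicate would implement `Predicate` against Gaffer's own `User` class and be wrapped in an `AccessPredicate`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Minimal stand-in for Gaffer's User class (assumption, for illustration only).
final class User {
    private final String userId;
    private final Set<String> opAuths;

    User(final String userId, final String... opAuths) {
        this.userId = userId;
        this.opAuths = new HashSet<>(Arrays.asList(opAuths));
    }

    String getUserId() {
        return userId;
    }

    Set<String> getOpAuths() {
        return opAuths;
    }
}

// Hypothetical custom predicate: passes only if the user holds all required auths.
final class AllAuthsUserPredicate implements Predicate<User> {
    private final Set<String> requiredAuths;

    AllAuthsUserPredicate(final String... requiredAuths) {
        this.requiredAuths = new HashSet<>(Arrays.asList(requiredAuths));
    }

    @Override
    public boolean test(final User user) {
        // Reject missing users outright, then require every auth to be present.
        return user != null && user.getOpAuths().containsAll(requiredAuths);
    }
}
```

A user with only `readAuth1` would fail `new AllAuthsUserPredicate("readAuth1", "readAuth2")`, while a user holding both auths would pass.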
@@ -0,0 +1,97 @@
# Additional Information on Simple Federation

This page contains additional information and considerations
an admin may need to know when using the federated store type.

## How are Operations Handled?

Gaffer operations are handled quite differently when using the federated store.
In general, an operation submitted to the store is forwarded to the sub graphs
for execution. This means a user can typically use a federated store like a
normal store, submitting the same operation chains they would use on any other
store.

A user can control some aspects of federation using the options passed to the
operation. These can be used to do things like pick graphs or control the
merging. The full list of available options is outlined in the following
table:

| Option | Description |
| --- | --- |
| `federated.graphIds` | List of graph IDs to submit the operation to, formatted as a comma separated string e.g. `"graph1,graph2"` |
| `federated.excludedGraphIds` | List of graph IDs to exclude from the query. If this is set, any graph IDs in a `federated.graphIds` option are ignored and the operation is executed on all graphs except the ones specified, e.g. `"graph1,graph2"` |
| `federated.aggregateElements` | Whether the element aggregator should be used when merging element results. |
| `federated.forwardChain` | Whether the whole operation chain should be sent to the sub graph. If set to `false`, each operation inside the chain will be sent separately, so merging from each graph will happen after each operation instead of at the end of the chain. Turning this off is inherently slower, so it is `true` by default. |

Along with the options above, all merge classes can be overridden per query
using the same property key as you would via the store properties. Please see
the table [here](./configuration.md#store-properties) for more information.
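
To make the interaction between `federated.graphIds` and `federated.excludedGraphIds` concrete, here is a minimal sketch of how the target graphs could be resolved from an options map. It only mirrors the behaviour described in the table and is not the store's real implementation. `GraphSelection`, `parseIds` and `resolve` are hypothetical names, and falling back to a supplied list of default graph IDs when neither option is set is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating the documented option semantics.
final class GraphSelection {
    static final String GRAPH_IDS = "federated.graphIds";
    static final String EXCLUDED_GRAPH_IDS = "federated.excludedGraphIds";

    private GraphSelection() {
    }

    // Splits a comma separated option value such as "graph1,graph2" into IDs.
    static List<String> parseIds(final String value) {
        final List<String> ids = new ArrayList<>();
        for (final String id : value.split(",")) {
            final String trimmed = id.trim();
            if (!trimmed.isEmpty()) {
                ids.add(trimmed);
            }
        }
        return ids;
    }

    static List<String> resolve(final List<String> allGraphIds,
                                final List<String> defaultGraphIds,
                                final Map<String, String> options) {
        // Exclusions take priority: any federated.graphIds value is ignored.
        if (options.containsKey(EXCLUDED_GRAPH_IDS)) {
            final List<String> result = new ArrayList<>(allGraphIds);
            result.removeAll(parseIds(options.get(EXCLUDED_GRAPH_IDS)));
            return result;
        }
        if (options.containsKey(GRAPH_IDS)) {
            return parseIds(options.get(GRAPH_IDS));
        }
        // Assumption: with neither option set, use the supplied defaults.
        return new ArrayList<>(defaultGraphIds);
    }
}
```

For a store holding `graph1`, `graph2` and `graph3`, setting both options to `"graph1,graph2"` would resolve to `graph3` only, because the exclusion wins.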

If you wish to submit different operations to different graphs in the same
query, you can do this using the `federated.forwardChain` option. By setting
this to `false` on the outer operation chain, the options on the operations
inside it will be honoured. An example of this can be seen below:

!!! note
    This will turn off any merging of the results at the end of the chain. The
    operation chain will act like a standard chain where each operation's
    output is the input of the next operation. However, merging will still
    happen on each operation if more than one graph is specified for it.

!!! example ""
    This gets an entity from one graph and adds it to another graph.

    ```json
    {
        "class": "OperationChain",
        "options": {
            "federated.forwardChain": false
        },
        "operations": [
            {
                "class": "GetElements",
                "options": {
                    "federated.graphIds": "graph1"
                },
                "input": [
                    {
                        "class": "EntitySeed",
                        "vertex": "1"
                    }
                ]
            },
            {
                "class": "AddElements",
                "options": {
                    "federated.graphIds": "graph2"
                }
            }
        ]
    }
    ```

## Cache Considerations

The federated store utilises the [Gaffer cache](../store-guide.md#caches) to store
graphs that have been added to the store. This means all features available to
normal caches are also available to the graph storage, allowing the sharing and
persisting of graphs between instances.

The federated store uses the default cache service to store graphs. It also
adds a standard suffix to the cache, so if you want to share graphs between
instances you will need to set this suffix to something other than the graph ID
(see [here](../store-guide.md#cache-service)).

## Schema Compatibility

When querying multiple graphs, the federated store will attempt to merge the
graphs' schemas together. This means the schemas need to be compatible in order
to query across them. Generally you will need to ensure any shared groups can
be merged correctly. A few examples of criteria to consider are:

- Any properties in a shared group defined in both schemas need to have the same
type and aggregation function.
- Any visibility properties need to be compatible or they will be removed from the
schema.
- Groups with different properties in each schema will be merged so the group has
all the properties in the merged schema.
- Any groupBy definitions need to be compatible or will be removed.
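
Two of the criteria above (matching property types in shared groups, and the union of properties when a group is merged) can be checked ahead of time with a small helper. The sketch below is not Gaffer's schema merging code. It assumes a deliberately simplified, hypothetical representation where a schema is a map of group name to a map of property name to type name, and `SchemaCompatibility`, `findTypeConflicts` and `mergeProperties` are made-up names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check over a simplified schema representation:
// group name -> (property name -> type name).
final class SchemaCompatibility {
    private SchemaCompatibility() {
    }

    // Lists "group.property" entries whose type differs between the two schemas.
    static List<String> findTypeConflicts(final Map<String, Map<String, String>> a,
                                          final Map<String, Map<String, String>> b) {
        final List<String> conflicts = new ArrayList<>();
        for (final Map.Entry<String, Map<String, String>> group : a.entrySet()) {
            final Map<String, String> other = b.get(group.getKey());
            if (other == null) {
                continue; // group is not shared, nothing to compare
            }
            for (final Map.Entry<String, String> prop : group.getValue().entrySet()) {
                final String otherType = other.get(prop.getKey());
                if (otherType != null && !otherType.equals(prop.getValue())) {
                    conflicts.add(group.getKey() + "." + prop.getKey());
                }
            }
        }
        return conflicts;
    }

    // Union of a shared group's properties; entries from the first schema win.
    static Map<String, String> mergeProperties(final Map<String, String> a,
                                               final Map<String, String> b) {
        final Map<String, String> merged = new LinkedHashMap<>(a);
        for (final Map.Entry<String, String> prop : b.entrySet()) {
            merged.putIfAbsent(prop.getKey(), prop.getValue());
        }
        return merged;
    }
}
```

Running `findTypeConflicts` on the schemas of the graphs you intend to query together gives an early list of shared properties to fix before federating them.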