add max_evaluation_depth parameter and documentation in README
muddymudskipper committed Jun 17, 2024
1 parent c5f12e4 commit 5a3f01f
Showing 5 changed files with 160 additions and 44 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
Expand Up @@ -8,5 +8,6 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/) and this p

### Added

- initial version
- option to specify a custom maximum evaluation depth, a setting introduced with pySHACL 0.26.0


97 changes: 97 additions & 0 deletions README.md
Expand Up @@ -10,3 +10,100 @@ Validate your Knowledge Graphs based on tests generated from SHACL shapes.
- Use [pre-commit](https://pre-commit.com/) to avoid errors before commit.
- This repository was created with [this copier template](https://github.com/eccenca/cmem-plugin-template).

## Options

### Data graph URI

The URI of the data graph to be validated. The graph URI is selected from a list of graphs of the following types:
- `di:Dataset`
- `dsm:ThesaurusProject`
- `owl:Ontology`
- `shui:ShapeCatalog`
- `void:Dataset`

### SHACL graph URI

The URI of the graph containing the SHACL shapes to validate against. The graph URI is selected from a list of graphs of type `shui:ShapeCatalog`.

### Generate validation graph

If enabled, the validation graph is posted to the CMEM instance with the graph URI specified with the *validation graph URI* option. Default value: *false*

### Validation graph URI

If the *generate validation graph* option is enabled, the validation graph is posted to the CMEM instance with this graph URI.

### Output entities

If enabled, the plugin outputs the validation results and can be connected to, for instance, a CSV dataset to produce a results table. Default value: *false*

### Clear validation graph

If enabled, the validation graph is cleared before workflow execution. Default value: *true*.

## Advanced Options

### Resolve owl:imports

If enabled, the graph tree defined with `owl:imports` in the data graph is resolved. Default value: *true*

### Blank node skolemization

If enabled, blank nodes in the validation graph are skolemized into URIs. Default value: *true*

### Add labels

If enabled, `rdfs:label` triples are added to the validation graph for instances of `sh:ValidationReport` and `sh:ValidationResult`. Default value: *true*

### Add labels from data and SHACL graphs

If enabled along with the *add labels* option, `rdfs:label` triples are added for the focus nodes, values and SHACL shapes in the validation graph. The labels are taken from the specified data and SHACL graphs. Default value: *false*

### Add shui:conforms flag to focus node resources

If enabled, `shui:conforms false` triples are added to the focus nodes in the validation graph. Default value: *false*

### Meta-SHACL

If enabled, the SHACL shapes graph is validated against the SHACL-SHACL shapes graph before validating the data graph. Default value: *false*

### Ontology graph URI

The URI of a graph containing extra ontological information. RDFS and OWL definitions from this graph are used to inoculate the data graph. The graph URI is selected from a list of graphs of type `owl:Ontology`.

### Inference

If set to a value other than *None*, inferencing expansion of the data graph is performed before validation. Options are *RDFS*, *OWLRL*, *Both*, *None*. Default value: *None*

### Advanced

Enable SHACL Advanced Features. Default value: *false*.

### Maximum evaluation depth

The maximum depth of nested SHACL shapes that the validator may descend through before it reaches an "endpoint" constraint. The value must be an integer of at least 1. Default value: *15*
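
The effect of a depth limit can be illustrated with a toy model. The following is only a sketch under stated assumptions: `SHAPES`, `evaluate` and `DepthExceededError` are hypothetical names invented for this illustration, and the code does not reproduce pySHACL's actual evaluation logic. It only shows why a self-referencing shape needs a limit in order to terminate.

```python
# Toy model (hypothetical, not pySHACL code): each shape name maps to the
# shapes it refers to, for example through nested node constraints.
SHAPES = {
    "PersonShape": ["AddressShape"],
    "AddressShape": ["CountryShape"],
    "CountryShape": [],            # "endpoint": refers to no further shapes
    "TreeShape": ["TreeShape"],    # refers to itself, so it never ends on its own
}


class DepthExceededError(RuntimeError):
    """Raised when nested evaluation goes deeper than the allowed limit."""


def evaluate(shape: str, max_depth: int, depth: int = 0) -> int:
    """Return the deepest nesting level reached when starting from `shape`."""
    if depth > max_depth:
        raise DepthExceededError(f"depth {depth} exceeds limit {max_depth}")
    deepest = depth
    for nested in SHAPES[shape]:
        deepest = max(deepest, evaluate(nested, max_depth, depth + 1))
    return deepest


print(evaluate("PersonShape", max_depth=15))  # nesting stays well below the limit

try:
    evaluate("TreeShape", max_depth=15)
except DepthExceededError as error:
    print(f"stopped: {error}")
```

In this model, raising the limit only helps when the shapes are deeply nested but finite. A shape that refers to itself exceeds any limit.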


## Parameter Input

In order to set options via the input the following parameter names can be used:

| Option | Name |
|------------------------------------------------|------------------------|
| Data graph URI | data_graph_uri |
| SHACL graph URI | shacl_graph_uri |
| Generate validation graph | generate_graph |
| Validation graph URI | validation_graph_uri |
| Output entities | output_entities |
| Clear validation graph | clear_validation_graph |
| Resolve owl:imports | owl_imports |
| Blank node skolemization | skolemize |
| Add labels | add_labels |
| Add labels from data and SHACL graphs | include_graphs_labels |
| Add shui:conforms flag to focus node resources | add_shui_conforms |
| Meta-SHACL | meta_shacl |
| Ontology graph URI | ontology_graph_uri |
| Inference | inference |
| Advanced | advanced |
| Maximum evaluation depth | max_evaluation_depth |
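
The mapping in the table can be sketched as a small lookup. This is a minimal illustration only: `OPTION_TO_NAME` and `to_parameter_input` are hypothetical names, only a subset of the table is included, and the example graph URI is made up. How the workflow actually delivers these values to the plugin is not covered here.

```python
# Option labels and parameter names copied from the table above (subset only).
OPTION_TO_NAME = {
    "Data graph URI": "data_graph_uri",
    "SHACL graph URI": "shacl_graph_uri",
    "Generate validation graph": "generate_graph",
    "Validation graph URI": "validation_graph_uri",
    "Inference": "inference",
}


def to_parameter_input(options: dict) -> dict:
    """Translate option labels into the parameter names used on the input."""
    unknown = sorted(set(options) - set(OPTION_TO_NAME))
    if unknown:
        raise KeyError(f"Unknown option label(s): {', '.join(unknown)}")
    return {OPTION_TO_NAME[label]: value for label, value in options.items()}


params = to_parameter_input(
    {
        "Data graph URI": "https://example.org/data/",  # hypothetical graph URI
        "Generate validation graph": True,
        "Inference": "rdfs",
    }
)
print(params)
```

Rejecting unknown labels up front keeps a misspelled option from being silently ignored.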

55 changes: 36 additions & 19 deletions cmem_plugin_pyshacl/plugin_pyshacl.py
Expand Up @@ -26,6 +26,7 @@
from cmem_plugin_base.dataintegration.plugins import WorkflowPlugin
from cmem_plugin_base.dataintegration.types import (
BoolParameterType,
IntParameterType,
StringParameterType,
)
from cmem_plugin_base.dataintegration.utils import setup_cmempy_user_access
Expand Down Expand Up @@ -170,7 +171,7 @@ def langfilter(lbl: Literal) -> bool: # noqa: ARG001
param_type=BoolParameterType(),
name="clear_validation_graph",
label="Clear validation graph",
description="If enabled, the validation graph is cleared before workflow " "execution.",
description="If enabled, the validation graph is cleared before workflow execution.",
default_value=True,
),
PluginParameter(
Expand Down Expand Up @@ -292,31 +293,42 @@ def langfilter(lbl: Literal) -> bool: # noqa: ARG001
default_value=False,
advanced=True,
),
PluginParameter(
param_type=IntParameterType(),
name="max_validation_depth",
label="Maximum evaluation depth",
description="The maximum number of SHACL shapes 'deep' that the validator can go before "
"reaching an 'endpoint' constraint. Increase this limit only if you have a legitimate use "
"case and are certain you know what you are doing.",
default_value=15,
advanced=True,
),
],
)
class ShaclValidation(WorkflowPlugin):
"""Plugin class"""

def __init__( # noqa: PLR0913
self,
data_graph_uri: str,
shacl_graph_uri: str,
ontology_graph_uri: str,
generate_graph: bool,
validation_graph_uri: str,
output_entities: bool,
clear_validation_graph: bool,
owl_imports: bool,
skolemize: bool,
add_labels: bool,
include_graphs_labels: bool,
add_shui_conforms: bool,
meta_shacl: bool,
inference: str,
advanced: bool,
remove_dataset_graph_type: bool,
remove_thesaurus_graph_type: bool,
remove_shape_catalog_graph_type: bool,
data_graph_uri: str = "",
shacl_graph_uri: str = "",
ontology_graph_uri: str = "",
generate_graph: bool = False,
validation_graph_uri: str = "",
output_entities: bool = False,
clear_validation_graph: bool = True,
owl_imports: bool = True,
skolemize: bool = True,
add_labels: bool = True,
include_graphs_labels: bool = False,
add_shui_conforms: bool = False,
meta_shacl: bool = False,
inference: str = "none",
advanced: bool = False,
remove_dataset_graph_type: bool = False,
remove_thesaurus_graph_type: bool = False,
remove_shape_catalog_graph_type: bool = False,
max_validation_depth: int = 15,
) -> None:
self.data_graph_uri = data_graph_uri
self.shacl_graph_uri = shacl_graph_uri
Expand All @@ -336,6 +348,7 @@ def __init__( # noqa: PLR0913
self.remove_dataset_graph_type = remove_dataset_graph_type
self.remove_thesaurus_graph_type = remove_thesaurus_graph_type
self.remove_shape_catalog_graph_type = remove_shape_catalog_graph_type
self.max_validation_depth = max_validation_depth

discover_plugins("cmem_plugin_pyshacl")
this_plugin = Plugin.plugins[0]
Expand Down Expand Up @@ -604,6 +617,9 @@ def check_parameters( # noqa: C901 PLR0912
if self.inference not in ("none", "rdfs", "owlrl", "both"):
raise ValueError("Invalid value for inference parameter")

if not isinstance(self.max_validation_depth, int) or self.max_validation_depth < 1:
raise ValueError("Invalid value for maximum evaluation depth")

self.log.info("Parameters OK:")
for param in self.graph_parameters + self.bool_parameters:
self.log.info(f"{param}: {self.__dict__[param]}")
Expand Down Expand Up @@ -658,6 +674,7 @@ def execute( # noqa: C901 PLR0912
meta_shacl=self.meta_shacl,
inference=self.inference,
advanced=self.advanced,
max_validation_depth=self.max_validation_depth,
inplace=True,
)
self.log.info(f"Finished SHACL validation in {e_t(start)} seconds")
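
The depth check in `check_parameters` can also be read as a standalone helper. The following is a sketch, assuming the intent is to reject every value that is not an integer of at least 1. `check_max_depth` is a hypothetical name and not part of the plugin. The two conditions are joined with `or`, so that either one alone is enough to reject the value and the comparison is never evaluated for a non-integer.

```python
def check_max_depth(value: object) -> int:
    """Return `value` if it is an integer >= 1, otherwise raise ValueError."""
    # bool is a subclass of int in Python, so True/False are rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError("Invalid value for maximum evaluation depth")
    return value


print(check_max_depth(15))

for bad in (0, -3, "15", 1.5, None):
    try:
        check_max_depth(bad)
    except ValueError as error:
        print(f"{bad!r}: {error}")
```

Rejecting booleans is a design choice of this sketch and goes beyond what the plugin code shown here does.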
48 changes: 24 additions & 24 deletions poetry.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions tests/test_pyshacl.py
Expand Up @@ -62,6 +62,7 @@ def test_workflow_execution(_setup: None) -> None: # noqa: PT019
remove_dataset_graph_type=True,
remove_thesaurus_graph_type=True,
remove_shape_catalog_graph_type=True,
max_validation_depth=15,
)
plugin.execute(inputs=(), context=TestExecutionContext())
res = get(VALIDATION_GRAPH_URI)
