Handling property object removal when subjects are reused #235

Open
tpluscode opened this issue Aug 30, 2022 · 6 comments

@tpluscode
Member

tpluscode commented Aug 30, 2022

This issue is to improve the overall behaviour of the form when a complex value (blank node or IRI) is removed from the graph. Here's a working example

<view>
  ex:source <namedSource>, _:blankSource ;
  view:dimension [
    view:from [
       view:source <namedSource> ;
    ] ;
  ] ;
  view:dimension [
    view:from [
       view:source _:blankSource ;
    ] ;
  ] ;
.

<namedSource>
  view:cube <http://foo.bar/cube1> ;
.

_:blankSource
  view:cube <http://foo.bar/cube2> ;
.

Presently, when there are no constraints on the view:source's property shape, either value can be removed using the form UI.

  1. When removing either, their usages (view:source triples) are kept intact
  2. When removing _:blankSource, its subgraph (view:cube triple) would be removed
  3. When removing <namedSource>, its subgraph would remain dangling, unconnected from the <view>'s subgraph

I found this behaviour inconsistent and propose some changes:

By default, prevent used nodes from being removed

In the example above, I would prevent the removal of either <namedSource> or _:blankSource as long as they are used as subject in the graph.

The problem here is to separate the "subject usage" (<view> view:source ?source) from the "object usage" (?dimension view:from/view:source ?source). Removing the latter must always be allowed.
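A minimal sketch of one way to read this rule, not the form's actual implementation. Triples are plain string tuples, `_:d1`, `_:f1` and friends are made-up labels for the anonymous nodes of the example, and the `DECLARING` set is an assumption: it stands for whatever mechanism tells the form that `ex:source` introduces a node while `view:source` only points at it.

```python
# Triples are plain (subject, predicate, object) string tuples.
GRAPH = {
    ("<view>", "ex:source", "<namedSource>"),
    ("<view>", "ex:source", "_:blankSource"),
    ("<view>", "view:dimension", "_:d1"),
    ("_:d1", "view:from", "_:f1"),
    ("_:f1", "view:source", "<namedSource>"),
    ("<view>", "view:dimension", "_:d2"),
    ("_:d2", "view:from", "_:f2"),
    ("_:f2", "view:source", "_:blankSource"),
    ("<namedSource>", "view:cube", "<http://foo.bar/cube1>"),
    ("_:blankSource", "view:cube", "<http://foo.bar/cube2>"),
}

# Assumption: predicates that introduce a node rather than merely reference it.
DECLARING = {"ex:source"}


def references(graph, node, declaring=DECLARING):
    """Triples that only point at `node` through a non-declaring predicate."""
    return {t for t in graph if t[2] == node and t[1] not in declaring}


def can_remove(graph, triple, declaring=DECLARING):
    """A reference can always go; a declaration only once nothing refers to the node."""
    _, predicate, node = triple
    if predicate not in declaring:
        return True
    return not references(graph, node, declaring)
```

With the example graph, the `view:source` reference inside a dimension can always be removed, while `<view> ex:source <namedSource>` is blocked until that reference is gone.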

Option 1: keep subgraph

I think this would be the default

One option is to keep the subgraph of the removed node.

<view>
- ex:source <namedSource>, _:blankSource ;
+ ex:source <namedSource> ;
.

<namedSource>
  view:cube <http://foo.bar/cube1> ;
.

+# Nothing happened here
_:blankSource
  view:cube <http://foo.bar/cube2> ;
.

Option 1: remove subgraph

Property shape annotation

[
  sh:path sh:source ;
+ sh1:onRemove sh1:removeSubgraph ;
]

For consistency, I think I would prefer the entire subgraph to be removed for both blank nodes and named nodes

<view>
- ex:source <namedSource>, _:blankSource ;
+ ex:source _:blankSource ;
.

+# Remove entire representation of <namedSource>
-<namedSource>
- view:cube <http://foo.bar/cube1> ;
-.

_:blankSource
  view:cube <http://foo.bar/cube2> ;
.
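The "remove subgraph" behaviour could be sketched as follows. This is an illustration under assumptions: triples are plain string tuples, and "subgraph" is taken to mean the triples with the removed node as subject, followed recursively through blank-node objects only. The thread does not define the exact boundary, and the function names are hypothetical.

```python
def subgraph(graph, node):
    """Triples with `node` as subject, followed recursively through blank nodes."""
    found, seen, pending = set(), set(), [node]
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # guards against cycles
        seen.add(current)
        for triple in graph:
            if triple[0] == current:
                found.add(triple)
                if triple[2].startswith("_:"):
                    pending.append(triple[2])
    return found


def remove_with_subgraph(graph, removed):
    """Remove the form value and everything said about the removed node."""
    return graph - {removed} - subgraph(graph, removed[2])


graph = {
    ("<view>", "ex:source", "<namedSource>"),
    ("<view>", "ex:source", "_:blankSource"),
    ("<namedSource>", "view:cube", "<http://foo.bar/cube1>"),
    ("_:blankSource", "view:cube", "<http://foo.bar/cube2>"),
}
result = remove_with_subgraph(graph, ("<view>", "ex:source", "<namedSource>"))
```

Here `result` keeps only the two `_:blankSource` triples, matching the diff above. Named and blank nodes go through the same code path.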

Option 2: remove usages

This would override the default behaviour, allowing the removal of form values which are used as objects

Property shape annotation

[
  sh:path sh:source ;
+ sh1:onRemove sh1:removeUsages ;
]

<view>
-  ex:source _:blankSource ;
  view:dimension [
    view:from [
+      # usage removed
-      view:source _:blankSource ;
    ] ;
  ] ;
.

+# Source kept in the graph
_:blankSource
  view:cube <http://foo.bar/cube2> ;
.
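A sketch of the "remove usages" behaviour, again with triples as plain string tuples and made-up labels (`_:d2`, `_:f2`) for the anonymous nodes. The function name is hypothetical. It drops every triple that has the node as object and leaves the node's own triples alone.

```python
def remove_usages(graph, node):
    """Drop every triple pointing at `node`; keep the triples about `node` itself."""
    return {t for t in graph if t[2] != node}


graph = {
    ("<view>", "ex:source", "_:blankSource"),
    ("<view>", "view:dimension", "_:d2"),
    ("_:d2", "view:from", "_:f2"),
    ("_:f2", "view:source", "_:blankSource"),
    ("_:blankSource", "view:cube", "<http://foo.bar/cube2>"),
}
result = remove_usages(graph, "_:blankSource")
```

Both the `ex:source` and the `view:source` triples disappear, while `_:blankSource view:cube <http://foo.bar/cube2>` stays in the graph, as in the diff above.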
@cristianvasquez
Collaborator

cristianvasquez commented Aug 30, 2022

My take is to mark things to be kept. Would that be option 3?

I don't know which predicate to use, but perhaps I would not encode the intention in the predicate, as sh1:removeUsages does. I would go for something that groups, e.g. sh1:group?

Example

Supposing forms are always a tree, I would mark all URIs that define partitions of interest.

For example, having:

<alice> foaf:knows <bob> ;
  <livesIn> <house> ;
  <name> "Alice" ;
  <likes> <icecream> .

<icecream> <flavor> "chocolate" .

<bob> <livesIn> <house> ;
  <name> "Bob" .

<house>
  <address> "Wonderland" .

One can say that <alice>, <bob> and <house> mark entities of interest (to keep).

Looking at this graph as a tree that starts from <alice>, one can generate partitions (or documents) formed while walking through the tree.

Document 1:

<alice> foaf:knows <bob> ;
  <livesIn> <house> ;
  <name> "Alice" ;
  <likes> <icecream> .

<icecream> <flavor> "chocolate" .

Document 2:

<bob> <livesIn> <house> ;
  <name> "Bob" .

Document 3:

<house>
  <address> "Wonderland" .

One can modify any quad at any time, but when <alice> is deleted, all triples in document 1 could be deleted. Likewise, deleting <bob> or <house> could delete documents 2 and 3 respectively.

If this is a graph in memory, I would personally use named graphs to differentiate between quads in those documents: 3 named graphs, <alice>, <bob> and <house>. With named graphs it becomes trivial to keep track of and delete a group of quads.

Another interesting approach is the one that @bergos used in rdf-cube-view-query: a tree structure to 'remember' triples that should be deleted together: https://github.com/zazuko/rdf-cube-view-query/blob/master/lib/Node.js. I think it handles a similar problem.
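The partitioning described above can be sketched like this. It is only an illustration: triples are plain string tuples, `partition` is a hypothetical helper, and the walk simply stops at marked entities and starts a new document there.

```python
def partition(triples, root, marked):
    """Split the triples reachable from `root` into one document per marked entity."""
    documents = {}
    queue = [root]
    while queue:
        entity = queue.pop(0)
        if entity in documents:
            continue  # already partitioned
        doc = documents[entity] = set()
        seen, pending = {entity}, [entity]
        while pending:
            node = pending.pop()
            for triple in triples:
                if triple[0] != node:
                    continue
                doc.add(triple)
                obj = triple[2]
                if obj in marked:
                    queue.append(obj)  # a marked entity gets its own document
                elif obj not in seen:
                    seen.add(obj)
                    pending.append(obj)  # unmarked nodes stay in this document
    return documents


triples = [
    ("<alice>", "foaf:knows", "<bob>"),
    ("<alice>", "<livesIn>", "<house>"),
    ("<alice>", "<name>", '"Alice"'),
    ("<alice>", "<likes>", "<icecream>"),
    ("<icecream>", "<flavor>", '"chocolate"'),
    ("<bob>", "<livesIn>", "<house>"),
    ("<bob>", "<name>", '"Bob"'),
    ("<house>", "<address>", '"Wonderland"'),
]
docs = partition(triples, "<alice>", {"<alice>", "<bob>", "<house>"})

# Deleting <alice> then amounts to dropping her document from the data.
remaining = set(triples) - docs["<alice>"]
```

`docs` holds three documents keyed by `<alice>`, `<bob>` and `<house>`. The `<icecream>` triple lands in the `<alice>` document because `<icecream>` is not marked.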

@tpluscode
Member Author

In practice most forms are indeed trees with a single root, but I do not want to make that assumption. Multiple roots would be rare and I personally have not come across such a use case. Graphs, however, do happen: there is a single root node, but cycles or edges between tree branches are possible.

I would not like to mix named graphs here. The context is always a single document. The "data graph" in SHACL lingo.

Now that I think about this, maybe the missing piece is to annotate/mark a Property Shape so that objects are "owned" by the Focus Node. In the original example, the view itself owns a source. The view:from/view:source usage does not, it's only a reference.

:ViewShape 
  sh:property [
    sh:path view:source ; 
    sh:node :SourceShape ;
    sh:class view:Source ;
+   # assert ownership as subtype
+   a sh1:ContainmentProperty ;
  ] ;

  sh:property [
    sh:path view:dimension ;
    sh:node :DimensionShape ;
  ] ;
.

:DimensionShape
  sh:property [
    sh:path view:from ;
    sh:node [
+     # no "containment" here, meaning that when removed, only the object usage is removed
      sh:path view:source ;
      sh:class view:Source ;
    ] ;
  ] ;
.
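A sketch of how a form might act on such an annotation. The thread only states that a non-containment property removes just the object usage. That removing an owned value also removes the triples about the owned node is an assumption here, as are the `remove_value` name and the `owned` flag, which stands for "the property shape is a sh1:ContainmentProperty". References left elsewhere in the graph are not handled.

```python
def remove_value(graph, removed, owned):
    """Remove one form value; cascade to the node's own triples only when owned."""
    remaining = graph - {removed}
    if owned:
        value = removed[2]
        # Assumption: ownership means the owned node's triples go with it.
        remaining = {t for t in remaining if t[0] != value}
    return remaining


graph = {
    ("<view>", "view:source", "<namedSource>"),
    ("_:f1", "view:source", "<namedSource>"),
    ("<namedSource>", "view:cube", "<http://foo.bar/cube1>"),
}

# The dimension only references the source: nothing else is touched.
by_reference = remove_value(graph, ("_:f1", "view:source", "<namedSource>"), owned=False)

# The view owns the source: its view:cube triple is removed as well.
by_owner = remove_value(graph, ("<view>", "view:source", "<namedSource>"), owned=True)
```

Passing ownership in per call reflects that the same predicate (view:source) is owning in :ViewShape and a plain reference in :DimensionShape.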

@cristianvasquez
Collaborator

Oh, I talked about named graphs as just an implementation choice for deleting things from a graph in memory. Not related to SHACL :) Perhaps I was mixing topics there.

Objects "owned" by the Focus Node look like a good option. It would enable other things as well; for example, "you cannot delete this node because it's owned by this other one."

In your previous example, you don't use the word owned but contained. Why that choice of words?

@tpluscode
Member Author

Named graphs are a good idea in general. I totally use them similarly, to easily manage "wholeness". But in a form you typically work with a single document IMO.

In your previous example, you don't use the word owned but contained. Why that choice of words?

No particular reason. I do lean towards "contain" I think :)

@cristianvasquez
Collaborator

cristianvasquez commented Aug 31, 2022

ContainmentProperty suggests the property contains things, which confuses me a little :)

I don't have a clue what would be a good name, but when you talk about ownership for deletion, it reminds me of SQL databases, where you can do cascade deletion.

As you know, SQL databases enforce this with foreign keys.

A foreign key (FK) is a column or combination of columns that is used to establish and enforce a link between the data in two tables to control the data that can be stored in the foreign key table.

Is there something like that in SHACL?

@tpluscode
Member Author

Ok, I'm pretty convinced. Ownership may be a better fit after all.
