This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

Use case: Ordered lists as an expected data format #18

Open
danielbeeke opened this issue Aug 23, 2023 · 12 comments

Comments

@danielbeeke
Contributor

We talked about ordered lists in the meeting on the 23rd of August:

Ordered lists are painful to implement at the moment, yet they capture an important aspect of the data: the order. In my form renderer I added a way to sort ordered lists, but it was quite painful.

sh:property [
  sh:name "Author reference"@en ;
  sh:path ( schema:author [ sh:zeroOrMorePath rdf:rest ] rdf:first ) ;
]

It would be great if we could have something like this instead:

sh:property [
  sh:name "Author reference"@en ;
  sh:path  schema:author ;
  OUR_NAMESPACE:isOrderedList true ;
]

In my form renderer I did a similar abstraction. When I detect an sh:path that specifies an ordered list, I replace it with a normal sh:path for rendering the form and pass the fact that it should be an ordered list to the dash:editor implementation. It would be nicer to do this in the SHACL.
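
The detection step described above could be sketched roughly as follows. This is not the renderer's actual code: the in-memory form of the path (a Python list for a sequence path, a dict for a blank-node path) and the function name `split_list_path` are assumptions made only for illustration.

```python
# Hypothetical in-memory form of an sh:path: a sequence path is a Python list,
# a blank-node path such as [ sh:zeroOrMorePath rdf:rest ] is a dict.
LIST_TAIL = [{"sh:zeroOrMorePath": "rdf:rest"}, "rdf:first"]


def split_list_path(path):
    """Return (plain_path, is_ordered_list) for a parsed sh:path.

    A path of the form ( P [ sh:zeroOrMorePath rdf:rest ] rdf:first ) is
    reduced to P and flagged as an ordered list; anything else is returned
    unchanged with the flag set to False.
    """
    if isinstance(path, list) and len(path) > 2 and path[-2:] == LIST_TAIL:
        head = path[:-2]
        # A single remaining step is a plain predicate path, not a sequence.
        plain = head[0] if len(head) == 1 else head
        return plain, True
    return path, False


author_path = ["schema:author", {"sh:zeroOrMorePath": "rdf:rest"}, "rdf:first"]
print(split_list_path(author_path))    # ('schema:author', True)
print(split_list_path("schema:name"))  # ('schema:name', False)
```

The returned flag is what a renderer would hand to its editor component, while the plain path is used for everything else.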

Questions:

  • What would be a good predicate (alternative for 'isOrderedList')?
  • How have you worked around ordered lists in SHACL? Do you have something to share about this topic?

@HolgerKnublauch
Contributor

None of the work-arounds are perfect but it remains an important problem. rdf:List is a terrible but necessary solution from the last century.

We have introduced a property dash:index that can be used together with reification/rdf-star. The benefit is that the values remain "normal" triples that can be accessed with set-based semantics, yet can also be queried in order if needed. Properties are marked with dash:indexed true, and constraints check that all indices from 0..N are present. Pain points include the cost: inserts and look-ups still require O(n).

@danielbeeke
Contributor Author

@HolgerKnublauch I could not find documentation about dash:index, only this part in the ontology:

dash:IndexedConstraintComponent
  a sh:ConstraintComponent ;
  rdfs:comment "A constraint component that can be used to mark property shapes to be indexed, meaning that each of its value nodes must carry a dash:index from 0 to N." ;
  rdfs:label "Indexed constraint component" ;
  sh:parameter dash:IndexedConstraintComponent-indexed ;
.
dash:IndexedConstraintComponent-indexed
  a sh:Parameter ;
  sh:path dash:indexed ;
  dash:reifiableBy dash:ConstraintReificationShape ;
  sh:datatype xsd:boolean ;
  sh:description "True to activate indexing for this property." ;
  sh:maxCount 1 ;
  sh:name "indexed" ;
.

Could you elaborate on what reification/rdf-star would do?

@HolgerKnublauch
Contributor

HolgerKnublauch commented Aug 23, 2023

Assuming you have 3 children ordered by date of birth, I believe that in the current RDF-star syntax draft it would look like this:

ex:Parent
    ex:child ex:Child1 {| dash:index 0 |} ;
    ex:child ex:Child2 {| dash:index 1 |} ;
    ex:child ex:Child3 {| dash:index 2 |} ;
.
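
To make the index-based approach concrete, here is a minimal sketch of reading such values in order and of what an insert costs. It does not use any DASH or RDF-star API: the annotated triples are modeled as plain `(value, index)` pairs, and the helper names `ordered_values` and `insert_at` are hypothetical.

```python
def ordered_values(annotated):
    """Return values sorted by index after checking the indices are exactly 0..N.

    `annotated` is a list of (value, index) pairs, a stand-in for triples
    that carry a dash:index annotation.
    """
    indices = sorted(index for _, index in annotated)
    if indices != list(range(len(annotated))):
        raise ValueError(f"indices must be contiguous from 0, got {indices}")
    return [value for value, _ in sorted(annotated, key=lambda pair: pair[1])]


def insert_at(annotated, position, value):
    """Insert a value at `position`, shifting every later index by one (O(n))."""
    shifted = [(v, i + 1 if i >= position else i) for v, i in annotated]
    return shifted + [(value, position)]


children = [("ex:Child2", 1), ("ex:Child1", 0), ("ex:Child3", 2)]
print(ordered_values(children))
# ['ex:Child1', 'ex:Child2', 'ex:Child3']
print(ordered_values(insert_at(children, 1, "ex:Child1b")))
# ['ex:Child1', 'ex:Child1b', 'ex:Child2', 'ex:Child3']
```

The input order of the pairs does not matter, which reflects the set-based nature of the underlying triples; the order is recovered only from the indices.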

@WilliamChelman

In our model, we went for a slightly different approach. For example, when defining the sh:PropertyShape for sh:languageIn, we have done something like this:

<PropertyShape/languageIn>
    a sh:PropertyShape ;
    sh:path sh:languageIn ;
    hanami:listOf [
        sh:datatype xsd:string ;
    ] ;
.

During validation, this sh:PropertyShape is expanded into something that loosely looks like this (inspired by the "shapes of shapes" from the SHACL specs):

<PropertyShape/languageIn>
    a sh:PropertyShape ;
    sh:path sh:languageIn ;
    sh:nodeKind sh:BlankNodeOrIRI ; # this is new
    sh:node <GeneratedUri/1> ; # this is new
    hanami:listOf [ # ignored by validation engine, we leave it here
        sh:datatype xsd:string ;
    ] ;
.

<GeneratedUri/1>
    a sh:NodeShape ;
    sh:property [
        sh:path [ sh:zeroOrMorePath rdf:rest ] ;
        sh:hasValue rdf:nil ;
        sh:node <GeneratedUri/2> ;
    ]
.

<GeneratedUri/2>
    a sh:NodeShape ;
    sh:or (
        [
            sh:hasValue rdf:nil ;
            sh:property [
                sh:maxCount 0 ;
                sh:path rdf:rest ;
            ] ;
            sh:property [
                sh:maxCount 0 ;
                sh:path rdf:first ;
            ]
        ]
        [
            sh:not [ sh:hasValue rdf:nil ] ;
            sh:property [
                sh:path rdf:first ;
                sh:maxCount 1 ;
                sh:minCount 1 ;
                sh:datatype xsd:string ; # the content of `hanami:listOf` is copied here
            ] ;
            sh:property [
                sh:path rdf:rest ;
                sh:maxCount 1 ;
                sh:minCount 1 ;
            ]
        ]
    )
.

So that it can be processed by vanilla SHACL validators.
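
What the generated shapes above check can be approximated procedurally. The following is only a sketch under stated assumptions: triples are plain Python tuples of strings, literals are written with surrounding double quotes, and `is_list_of` and `is_string_literal` are made-up names rather than part of any SHACL library.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def objects(triples, subject, predicate):
    """Return all objects of (subject, predicate, ?o) in a collection of triples."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def is_list_of(triples, head, member_ok):
    """Check that `head` starts a well-formed rdf:List whose members pass member_ok."""
    node, seen = head, set()
    while node != RDF_NIL:
        if node in seen:  # guard against cyclic rdf:rest chains
            return False
        seen.add(node)
        firsts = objects(triples, node, RDF_FIRST)
        rests = objects(triples, node, RDF_REST)
        # Mirrors the second sh:or branch: exactly one rdf:first and one rdf:rest.
        if len(firsts) != 1 or len(rests) != 1 or not member_ok(firsts[0]):
            return False
        node = rests[0]
    # Mirrors the first sh:or branch: rdf:nil carries no rdf:first or rdf:rest.
    return not objects(triples, RDF_NIL, RDF_FIRST) and not objects(triples, RDF_NIL, RDF_REST)


def is_string_literal(term):
    # Assumption for this sketch: literals are written with surrounding double quotes.
    return term.startswith('"')


languages = {
    ("_:b1", RDF_FIRST, '"en"'), ("_:b1", RDF_REST, "_:b2"),
    ("_:b2", RDF_FIRST, '"fr"'), ("_:b2", RDF_REST, RDF_NIL),
}
print(is_list_of(languages, "_:b1", is_string_literal))  # True
broken = languages - {("_:b2", RDF_REST, RDF_NIL)}
print(is_list_of(broken, "_:b1", is_string_literal))     # False
```

The `member_ok` callback plays the role of the constraints copied from `hanami:listOf`, so the same walk could be reused for other member checks.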

We also considered using something like OUR_NAMESPACE:isOrderedList true, but this would then be wrongly interpreted by vanilla SHACL validators. For example, if we had this in a data graph

<TitlePropertyShape>
  a sh:PropertyShape ;
  sh:path ex:title ;
  sh:datatype rdf:langString ;
  sh:languageIn ( "en" "fr" ) ;
.

and this sh:languageIn definition in the shapes graph

<PropertyShape/languageIn>
    a sh:PropertyShape ;
    sh:path sh:languageIn ;
    sh:datatype xsd:string ;
    OUR_NAMESPACE:isOrderedList true ;
.

and we pass those to a validation engine, it will most likely say that <TitlePropertyShape> has violations, because sh:languageIn points not to xsd:string values but to an RDF list node.

Also this solution works nicely for even more complex scenarios (at least in our use cases 😛), like so

<PropertyShape/or>
    a             sh:PropertyShape ;
    sh:path       sh:or ;
    hanami:listOf [
        sh:nodeKind   sh:BlankNodeOrIRI ;
        sh:or         (
            [
                sh:class sh:PropertyShape ;
            ]
            [
                sh:node <PropertyShape> ;
            ]
        ) ;
    ] ;
.

@danielbeeke
Contributor Author

@WilliamChelman that is a nice way. The preprocessing is an elegant workaround.

@tpluscode

What I see missing from this discussion is the potential need to cater for viewers and editors, as well as validation.

sh:path ( schema:author [ sh:zeroOrMorePath rdf:rest ] rdf:first ) ; works for viewing. It may be awkward, but it can already be supported using existing specs.

The proposed [ sh:path schema:author ; OUR_NAMESPACE:isOrderedList true ] has the drawback that an implementation which does not understand OUR_NAMESPACE:isOrderedList will inevitably render a UI for creating a set of schema:author objects.

I think I like the hanami:listOf solution, where that property could be used directly by a UI builder and would otherwise be ignored by builders that do not understand it.

sh:property [
  sh:name "Author reference"@en ;
  hanami:listOf [ sh:path schema:author ] ;
]

The above would instruct a builder to render an editing UI which sets an RDF List to schema:author. The UI could be a draggable list of another component (dropdowns, etc.) or, in an optimized case, a text area where each line becomes a separate literal.
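
The text-area variant mentioned above could, for instance, turn its content into RDF list triples along these lines. This is a sketch only: the function name `lines_to_rdf_list`, the blank-node labels, and the sample names are invented for illustration, and literals are kept as plain Python strings.

```python
def lines_to_rdf_list(text):
    """Turn each non-empty line of `text` into one member of an rdf:List.

    Returns (head, triples); an empty input yields rdf:nil and no triples.
    """
    values = [line.strip() for line in text.splitlines() if line.strip()]
    if not values:
        return "rdf:nil", []
    # Blank-node labels _:b0, _:b1, ... are made up for this sketch.
    nodes = [f"_:b{i}" for i in range(len(values))]
    triples = []
    for i, (node, value) in enumerate(zip(nodes, values)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((node, "rdf:first", value))
        triples.append((node, "rdf:rest", rest))
    return nodes[0], triples


head, triples = lines_to_rdf_list("Ada Lovelace\nAlan Turing\n")
print(head)  # _:b0
for triple in triples:
    print(triple)
```

The returned head is what a form builder would attach as the single object of schema:author.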

@bergos
Member

bergos commented Sep 19, 2023

We could define the shape for lists in our namespace. Then we would have a fixed IRI that can be used to identify lists. UI components can just ignore the content of the shape, and validators don't need to change their logic. It just requires an owl:imports.

@HolgerKnublauch
Contributor

To identify that a property has rdf:Lists as values, we use

sh:node dash:ListShape

see https://datashapes.org/dash.html#ListShape

With that stable URI, widgets can more easily identify lists than relying on parsing rdf:rest etc.

@bergos
Member

bergos commented Sep 30, 2023

I propose we copy dash:ListShape to the new namespace. Please vote by the 18th of October.

@tfrancart
Contributor

dash:ListShape only says "this is an RDF list", but does not say "this is a list OF WHAT". How do you tell that this is a list of xsd:string, as in the example provided by @WilliamChelman?

@HolgerKnublauch
Contributor

It should be easy to define a constraint component with a constraint such as dash:listMemberClass, dash:listMemberDatatype or dash:listMemberType.

@bergos
Member

bergos commented Oct 16, 2023

Other constraints can be defined with the path ( [ sh:zeroOrMorePath rdf:rest ] rdf:first ) as shown in the example below (source: https://archive.topquadrant.com/constraints-on-rdflists-using-shacl/)

In the last call, I had the idea that we could also define an IRI as the root of the list with the fixed path. But I missed that this would lead to a named node as the object value, which would be interpreted directly as a path.

ex:TrafficLightShape
    a sh:NodeShape ;
    sh:targetClass ex:TrafficLight ;
    sh:property [
        sh:path ex:colors ;
        sh:node dash:ListShape ;
        sh:property [
            sh:path ( [ sh:zeroOrMorePath rdf:rest ] rdf:first ) ;
            sh:datatype xsd:string ;
            sh:minLength 1 ;
            sh:minCount 2 ;
            sh:maxCount 3 ;
        ]
    ] .
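
To illustrate how the path in the shape above selects list members, here is a rough sketch of its evaluation together with the count and length checks. It is not a SHACL engine: triples are plain tuples, Python strings stand in for xsd:string literals, and `list_members` and `check_colors` are hypothetical helper names. It also assumes, as I understand the SHACL spec, that value nodes form a set, so a repeated list member counts only once.

```python
def list_members(triples, head):
    """Evaluate ( [ sh:zeroOrMorePath rdf:rest ] rdf:first ) starting at `head`."""
    # Zero or more rdf:rest steps: the head itself plus every node reachable via rdf:rest.
    reachable, frontier = [], [head]
    while frontier:
        node = frontier.pop(0)
        if node in reachable:
            continue
        reachable.append(node)
        frontier.extend(o for s, p, o in triples if s == node and p == "rdf:rest")
    # One rdf:first step from each reachable node; rdf:nil has none and contributes nothing.
    return [o for node in reachable for s, p, o in triples if s == node and p == "rdf:first"]


def check_colors(triples, head):
    """Approximate sh:datatype, sh:minLength 1, sh:minCount 2 and sh:maxCount 3."""
    # Assumption: value nodes are a set, so duplicates collapse before counting.
    values = set(list_members(triples, head))
    return 2 <= len(values) <= 3 and all(isinstance(v, str) and len(v) >= 1 for v in values)


colors = [
    ("_:c1", "rdf:first", "red"), ("_:c1", "rdf:rest", "_:c2"),
    ("_:c2", "rdf:first", "yellow"), ("_:c2", "rdf:rest", "_:c3"),
    ("_:c3", "rdf:first", "green"), ("_:c3", "rdf:rest", "rdf:nil"),
]
print(list_members(colors, "_:c1"))  # ['red', 'yellow', 'green']
print(check_colors(colors, "_:c1"))  # True
```

Because the path starts from the list head, the counts apply to the members of one list rather than to the number of ex:colors values.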
