Given that logical views (LV) allow us to express constraints inherited from relational database theory (at least in the form of annotations), and that these constraints are useful in scenarios where access to data is restricted by binding patterns [1], with web APIs being a notable example, we were wondering whether it would be worth adding an annotation signaling that certain fields must be provided as arguments to the underlying API. In other words, fields could allow for the definition of "parametric" mappings. Pagination is a particular case of this.
[1] Michael Benedikt, Julien Leblay, Balder ten Cate, Efthymia Tsamoura: Generating Plans from Proofs: The Interpolation-based Approach to Query Reformulation. Synthesis Lectures on Data Management, Morgan & Claypool Publishers 2016, ISBN 978-3-031-00728-6
Example scenario
Map university courses listed by an API that requires the lecturer ID as input.
All possible lecturer IDs occur (foreign key, FK) in a CSV file listing the teaching staff.
An RML processor may leverage FKs, along with the API and CSV information, to fetch data for all available courses when materializing the RDF graph.
# RML logical source and logical view for a CSV file listing academic staff:
#
# ID;NAME;SURNAME;POSITION;EMAIL
# 113541;Alice;Doe;teaching staff;[email protected]
# ...
<#CSVLogicalSource> a rml:LogicalSource;
rml:source [ a rml:Source, csvw:Table;
csvw:url "file:///path/to/list_of_professors.csv";
csvw:dialect [ a csvw:Dialect;
csvw:delimiter ";";
csvw:encoding "UTF-8";
csvw:header "1"^^xsd:boolean
]
];
rml:referenceFormulation rml:CSV.
<#CSVLogicalView> a rml:LogicalView;
rml:onLogicalSource <#CSVLogicalSource>;
rml:field [
rml:fieldName "id" ;
rml:reference "ID";
];
rml:structuralAnnotation [
a rml:PrimaryKeyAnnotation;
rml:onFields ("id")
].
RML logical source and logical view for an API looking up courses taught by a given lecturer in the university DB:
# example request: https://api.rmluniversity.edu/courses?lecturer=113541
# example response:
# {
# "courses": [{
# "code": "CS1234",
# "name": "Introduction to Databases",
# "lecturer_id": 113541
# }, {
# "code": "CS1237",
# "name": "Conceptual Modeling",
# "lecturer_id": 113541
# }]
# }
#
<#APILogicalSource> a rml:LogicalSource;
rml:source [ a rml:Source, td:Thing;
td:hasPropertyAffordance [
td:hasUriTemplateSchema "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}"; # need parameter lecturer_id, should state this formally!
td:hasForm [ a hctl:Form; # hctl:Form = hydra:Operation, hence here we can also put <#APIHydraSpecGetCourseOperation> (see later)
hctl:forContentType "application/json";
htv:methodName "GET";
htv:headers ([
htv:fieldName "Accept";
htv:fieldValue "application/json"
])
]
]
];
rml:referenceFormulation rml:JSONPath;
rml:iterator "$.courses[*]".
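The `{lecturer_id}` placeholder in the URI template is what makes this source parametric. As a rough illustration, here is a minimal Python sketch of how a processor might detect which inputs a source needs before it can be queried. The function names `required_parameters` and `is_parametric` are hypothetical, and only simple `{name}` placeholders are handled; the full URI Template syntax is out of scope.

```python
import re

# Matches simple `{name}` placeholders only. Operators from the full
# URI Template syntax (e.g. `{?a,b}`) are deliberately ignored here.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def required_parameters(uri_template: str) -> list[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    seen: list[str] = []
    for name in _PLACEHOLDER.findall(uri_template):
        if name not in seen:
            seen.append(name)
    return seen


def is_parametric(uri_template: str) -> bool:
    """A source is parametric (hypothetical notion) if its template has placeholders."""
    return bool(required_parameters(uri_template))


template = "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}"
print(required_parameters(template))  # ['lecturer_id']
print(is_parametric("file:///path/to/list_of_professors.csv"))  # False
```

Each name returned this way would be a field whose values must be bound from somewhere else before the source can produce any rows.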
The logical source above can produce results only if values of lecturer_id are provided. But how can the RML processor know where to find these values? Our proposal, similar in spirit to [1], is to exploit inclusions (here, a foreign key) stated as structural annotations within logical views. See the logical view below:
<#APILogicalView> a rml:LogicalView;
rml:onLogicalSource <#APILogicalSource>;
rml:field [
rml:fieldName "code";
rml:reference "$.code"
];
rml:field [
rml:fieldName "name";
rml:reference "$.name"
];
rml:field [
rml:fieldName "lecturer_id";
rml:reference "$.lecturer_id"
];
rml:structuralAnnotation [
a rml:ForeignKeyAnnotation; # This states that all 'lecturer_id' values here occur in field 'id' of <#CSVLogicalView>
rml:onFields ("lecturer_id");
rml:targetView <#CSVLogicalView>;
rml:targetFields ("id")
].
Note the rml:ForeignKeyAnnotation stating that all values of lecturer_id also occur as id values in the CSV view. The RML processor can thus devise a plan to populate the graph: extract all the id values from the CSV, then feed them to the web API.
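The plan described here can be sketched in Python as follows. This is only an illustration under stated assumptions, not how any existing RML processor works: `lecturer_ids`, `materialize_courses`, and `fake_fetch` are hypothetical names, the HTTP call is replaced by a canned response so the sketch runs offline, and the email address in the sample CSV row is a made-up placeholder.

```python
import csv
import io
import json
from typing import Callable, Iterator
from urllib.parse import quote

URI_TEMPLATE = "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}"


def lecturer_ids(csv_text: str) -> Iterator[str]:
    """Yield the ID column of the staff CSV (the FK target field `id`)."""
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=";"):
        yield row["ID"]


def materialize_courses(csv_text: str, fetch: Callable[[str], str]) -> list[dict]:
    """Call the API once per lecturer ID and collect the `$.courses[*]` items."""
    rows: list[dict] = []
    for lecturer_id in lecturer_ids(csv_text):
        url = URI_TEMPLATE.format(lecturer_id=quote(lecturer_id, safe=""))
        response = json.loads(fetch(url))
        rows.extend(response.get("courses", []))
    return rows


def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP GET returning the example response."""
    canned = {
        "https://api.rmluniversity.edu/courses?lecturer=113541": {
            "courses": [
                {"code": "CS1234", "name": "Introduction to Databases", "lecturer_id": 113541},
                {"code": "CS1237", "name": "Conceptual Modeling", "lecturer_id": 113541},
            ]
        }
    }
    return json.dumps(canned.get(url, {"courses": []}))


# Sample data; the email value is a placeholder, not taken from the real file.
STAFF_CSV = (
    "ID;NAME;SURNAME;POSITION;EMAIL\n"
    "113541;Alice;Doe;teaching staff;alice@example.org\n"
)

courses = materialize_courses(STAFF_CSV, fake_fetch)
print([c["code"] for c in courses])  # ['CS1234', 'CS1237']
```

The foreign key is what justifies the loop: because every lecturer_id returned by the API also occurs as an id in the CSV view, iterating over the CSV ids cannot miss any course.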
We complete the example with triples maps that use the logical views above.
#
# RML mappings instantiating courses with their name and lecturer.
#
<#Course> a rml:TriplesMap;
rml:logicalSource <#APILogicalView>;
rml:subjectMap [
rml:template "http://kg.rmluniversity.edu/course/{code}";
rml:class ex:Course
];
rml:predicateObjectMap [
rml:predicate ex:name;
rml:objectMap [
rml:reference "name";
rml:datatype xsd:string
]
];
rml:predicateObjectMap [
rml:predicate ex:lecturer;
rml:objectMap [
rml:parentTriplesMap <#CourseLecturer>
]
].
<#CourseLecturer> a rml:TriplesMap;
rml:logicalSource <#APILogicalView>;
rml:subjectMap [
rml:template "http://kg.rmluniversity.edu/professor/{lecturer_id}";
rml:class ex:Lecturer
].
Variant: Using Hydra
In the example above, we used curly-brace notation to denote an API parameter (following the mechanism provided by the td:hasUriTemplateSchema property). This could probably be made more explicit, for instance by using Hydra:
#
# EXTRA: possible (partial) definition of API operation and IRI template using Hydra
#
<#APIHydraSpecCourseIriTemplate> a hydra:IriTemplate;
hydra:template "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}";
hydra:mapping [
a hydra:IriTemplateMapping;
hydra:variableRepresentation hydra:BasicRepresentation;
hydra:variable "lecturer_id";
hydra:property "lecturer_id"; # here we want to formally map variable {lecturer_id} to field "lecturer_id" and/or reference "$.lecturer_id"
hydra:required true;
];
hydra:operation <#APIHydraSpecGetCourseOperation>.
<#APIHydraSpecGetCourseOperation> a hydra:Operation;
hydra:method "GET";
hydra:returns hydra:Collection.
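To make the intent of the IRI template mapping concrete, here is a small Python sketch of how a processor could expand the template from one row of a logical view while enforcing required variables. It is a sketch under assumptions: `TemplateMapping` and `expand` are hypothetical names, the `field` attribute reflects the reading of hydra:property proposed in the comment above (a view field name rather than an RDF property), and only simple `{name}` substitution is implemented.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class TemplateMapping:
    variable: str   # corresponds to hydra:variable
    field: str      # view field bound to the variable (assumed reading of hydra:property)
    required: bool  # corresponds to hydra:required


def expand(template: str, mappings: list[TemplateMapping], row: dict) -> str:
    """Substitute each mapped variable with the percent-encoded field value of `row`."""
    url = template
    for m in mappings:
        value = row.get(m.field)
        if value is None:
            if m.required:
                raise ValueError(f"missing required variable {m.variable!r}")
            value = ""  # simplification: optional variables expand to an empty string
        url = url.replace("{" + m.variable + "}", quote(str(value), safe=""))
    return url


TEMPLATE = "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}"
MAPPINGS = [TemplateMapping(variable="lecturer_id", field="lecturer_id", required=True)]

print(expand(TEMPLATE, MAPPINGS, {"lecturer_id": 113541}))
# https://api.rmluniversity.edu/courses?lecturer=113541
```

Calling `expand(TEMPLATE, MAPPINGS, {})` raises `ValueError`, which mirrors the point of hydra:required: the source cannot be queried until a value for lecturer_id has been bound.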