Access to Web APIs requiring parameters #24

Open
tirrolo opened this issue Mar 19, 2024 · 0 comments
Labels: enhancement, working group?


Context

Logical views (LV) allow us to express constraints inherited from relational database theory, at least in the form of annotations. Such constraints are useful in scenarios where access to data is restricted by binding patterns [1], Web APIs being a notable example. We were therefore wondering whether it would be worth adding an annotation signaling that certain fields must be provided as arguments to the underlying API. In other words, fields could allow for the definition of "parametric" mappings. Pagination is a particular case of this.

[1] Michael Benedikt, Julien Leblay, Balder ten Cate, Efthymia Tsamoura: Generating Plans from Proofs: The Interpolation-based Approach to Query Reformulation. Synthesis Lectures on Data Management, Morgan & Claypool Publishers 2016, ISBN 978-3-031-00728-6

Example scenario

  • map university courses listed by an API that requires the lecturer ID as input
  • all possible lecturer IDs occur (foreign key, FK) in a CSV file listing the teaching staff
  • an RML processor may leverage the FK, together with the API and CSV descriptions, to fetch data for all available courses when materializing the RDF graph.
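
The scenario above can be sketched procedurally as follows. This is only an illustration of the intended behavior, not how any RML processor is implemented: the function names, the `fetch` callable (a stand-in for an HTTP GET), and the sample data are assumptions made for the example.

```python
import csv
import io
import json
from typing import Callable

# URI template taken from the example request in this issue.
URI_TEMPLATE = "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}"


def lecturer_ids(csv_text: str) -> list[str]:
    """Read the ID column of the semicolon-delimited staff CSV."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    return [row["ID"] for row in reader]


def fetch_all_courses(csv_text: str, fetch: Callable[[str], str]) -> list[dict]:
    """Call the API once per lecturer ID and collect the returned courses.

    `fetch` is a hypothetical callable mapping a URL to a JSON response body;
    a real processor would perform the HTTP request here.
    """
    courses = []
    for lecturer_id in lecturer_ids(csv_text):
        url = URI_TEMPLATE.format(lecturer_id=lecturer_id)
        body = json.loads(fetch(url))
        courses.extend(body["courses"])  # the items selected by $.courses[*]
    return courses


# Made-up sample data mirroring the CSV layout described in this issue.
STAFF_CSV = "ID;NAME;SURNAME;POSITION\n113541;Alice;Doe;teaching staff\n"


def fake_fetch(url: str) -> str:
    """Stand-in for the web API: returns one course for the requested lecturer."""
    lecturer = int(url.rsplit("=", 1)[1])
    return json.dumps({"courses": [
        {"code": "CS1234", "name": "Introduction to Databases", "lecturer_id": lecturer},
    ]})


all_courses = fetch_all_courses(STAFF_CSV, fake_fetch)
```

Passing `fetch` as an argument keeps the sketch independent of any HTTP library and makes the access pattern (one API call per foreign-key value) explicit.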

Example mapping:

Headers:

@prefix rml: <http://w3id.org/rml/> .
@prefix td: <https://www.w3.org/2019/wot/td#> .
@prefix htv: <http://www.w3.org/2011/http#> .
@prefix hctl: <https://www.w3.org/2019/wot/hypermedia#> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix hydra: <http://www.w3.org/ns/hydra/core#> .
@prefix ex: <http://www.example.com/> .
@base <http://example.com/ns#> .

First source (non-parametric):

# RML logical source and logical view for a CSV file listing academic staff:
#
#  ID;NAME;SURNAME;POSITION;EMAIL
#  113541;Alice;Doe;teaching staff;[email protected]
#  ...

<#CSVLogicalSource> a rml:LogicalSource;
  rml:source [ a rml:Source, csvw:Table;
    csvw:url "file:///path/to/list_of_professors.csv";
    csvw:dialect [ a csvw:Dialect;
      csvw:delimiter ";";
      csvw:encoding "UTF-8";
      csvw:header "1"^^xsd:boolean
    ]
  ];
  rml:referenceFormulation rml:CSV.

<#CSVLogicalView> a rml:LogicalView;
  rml:onLogicalSource <#CSVLogicalSource>;
  rml:field [
    rml:fieldName "id";
    rml:reference "ID"
  ];
  rml:structuralAnnotation [
    a rml:PrimaryKeyAnnotation;  # 'id' uniquely identifies each row of <#CSVLogicalView>
    rml:onFields ("id")
  ].

RML logical source and logical view for an API looking up courses taught by a given lecturer in the university DB:

# example request: https://api.rmluniversity.edu/courses?lecturer=113541
# example response:
# {
#   "courses": [{
#     "code": "CS1234",
#     "name": "Introduction to Databases",
#     "lecturer_id": 113541
#   }, {
#     "code": "CS1237",
#     "name": "Conceptual Modeling",
#     "lecturer_id": 113541
#   }]
# }
#

<#APILogicalSource> a rml:LogicalSource;
  rml:source [ a rml:Source, td:Thing;
    td:hasPropertyAffordance [
      td:hasUriTemplateSchema "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}";  # need parameter lecturer_id, should state this formally!
      td:hasForm [ a hctl:Form;  # hctl:Form = hydra:Operation, hence here we can also put <#APIHydraSpecGetCourseOperation> (see later)
        hctl:forContentType "application/json";
        htv:methodName "GET";
        htv:headers ([
          htv:fieldName "Accept";
          htv:fieldValue "application/json"
        ])
      ]
    ]
  ];
  rml:referenceFormulation rml:JSONPath;
  rml:iterator "$.courses[*]".

The logical source above can produce results only if values for lecturer_id are provided. But how can the RML processor know where to find these values? Our proposal, similar in spirit to [1], is to exploit inclusions (foreign keys) stated as structural annotations within logical views. See the logical view below:

<#APILogicalView> a rml:LogicalView;
  rml:onLogicalSource <#APILogicalSource>;
  rml:field [
    rml:fieldName "code";
    rml:reference "$.code"
  ];
  rml:field [
    rml:fieldName "name";
    rml:reference "$.name"
  ];
  rml:field [
    rml:fieldName "lecturer_id";
    rml:reference "$.lecturer_id"
  ];
  rml:structuralAnnotation [ 
    a rml:ForeignKeyAnnotation;  # This states that all 'lecturer_id' values here occur in field 'id' of <#CSVLogicalView>
    rml:onFields ("lecturer_id");
    rml:targetView <#CSVLogicalView>;
    rml:targetFields ("id")
  ].

Note the rml:ForeignKeyAnnotation stating that all values of lecturer_id also occur as id values in the CSV. The RML processor can thus devise a plan to populate the graph: extract all the id values from the CSV, then feed them to the web API.
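
One way a processor might derive such a plan from the annotations is sketched below. The `View` and `ForeignKey` classes and the `required_params` attribute are hypothetical in-memory stand-ins for the logical views and for the proposed "this field must be supplied as an argument" annotation; they are not part of any existing RML vocabulary or tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ForeignKey:
    """In-memory counterpart of an rml:ForeignKeyAnnotation."""
    on_fields: tuple[str, ...]
    target_view: str
    target_fields: tuple[str, ...]


@dataclass
class View:
    """Hypothetical in-memory counterpart of a logical view."""
    name: str
    required_params: tuple[str, ...] = ()  # assumption: fields that must be bound
    foreign_keys: list[ForeignKey] = field(default_factory=list)


def binding_plan(view: View) -> dict[str, tuple[str, str]]:
    """Map each required parameter to a (view, field) pair that can supply it."""
    plan = {}
    for param in view.required_params:
        for fk in view.foreign_keys:
            if param in fk.on_fields:
                # Pick the target field at the same position as the parameter.
                target = fk.target_fields[fk.on_fields.index(param)]
                plan[param] = (fk.target_view, target)
                break
        else:
            # No foreign key mentions this parameter, so no source of values is known.
            raise ValueError(f"no foreign key supplies parameter {param!r}")
    return plan


api_view = View(
    name="APILogicalView",
    required_params=("lecturer_id",),
    foreign_keys=[ForeignKey(("lecturer_id",), "CSVLogicalView", ("id",))],
)
plan = binding_plan(api_view)
```

Here `plan` records that values for `lecturer_id` can be read from field `id` of `CSVLogicalView`. A view whose required parameter is not covered by any foreign key raises an error, which corresponds to a mapping the processor cannot materialize.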

We complete the example with expression maps using the logical views above.

#
# RML mappings instantiating courses with their name and lecturer.
#

<#Course> a rml:TriplesMap;
  rml:logicalSource <#APILogicalView>;
  rml:subjectMap [
    rml:template "http://kg.rmluniversity.edu/course/{code}";
    rml:class ex:Course
  ];
  rml:predicateObjectMap [
    rml:predicate ex:name;
    rml:objectMap [
      rml:reference "name";
      rml:datatype xsd:string
    ]
  ];
  rml:predicateObjectMap [
    rml:predicate ex:lecturer;
    rml:objectMap [
      rml:parentTriplesMap <#CourseLecturer>
    ]
  ].

<#CourseLecturer> a rml:TriplesMap;
  rml:logicalSource <#APILogicalView>;
  rml:subjectMap [
    rml:template "http://kg.rmluniversity.edu/professor/{lecturer_id}";
    rml:class ex:Lecturer
  ].

Variant: Using Hydra

In the example above, we have used a notation with curly braces to denote a parameter for the API (following the mechanism provided by the td:hasUriTemplateSchema property). Probably this could be done more explicitly, for instance, by using Hydra:

#
# EXTRA: possible (partial) definition of API operation and IRI template using Hydra
#

<#APIHydraSpecCourseIriTemplate> a hydra:IriTemplate;
  hydra:template "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}";
  hydra:mapping [ 
    a hydra:IriTemplateMapping;
    hydra:variableRepresentation hydra:BasicRepresentation;
    hydra:variable "lecturer_id";
    hydra:property "lecturer_id"; # here we want to formally map variable {lecturer_id} to field "lecturer_id" and/or reference "$.lecturer_id"
    hydra:required true;
  ];
  hydra:operation <#APIHydraSpecGetCourseOperation>.

<#APIHydraSpecGetCourseOperation> a hydra:Operation;
  hydra:method "GET";
  hydra:returns hydra:Collection.
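
To illustrate what a processor would do with such a template and a required variable, here is a minimal sketch of simple `{variable}` expansion. It covers only plain variable substitution with percent-encoding, not the full URI template syntax, and the `expand` function and its `required` argument are assumptions made for this example rather than an API of Hydra or of any library.

```python
import re
from urllib.parse import quote

TEMPLATE = "https://api.rmluniversity.edu/courses?lecturer={lecturer_id}"


def expand(template: str, values: dict[str, object], required: set[str]) -> str:
    """Expand simple {name} variables, rejecting calls that omit required ones."""
    missing = required - set(values)
    if missing:
        raise ValueError(f"missing required variables: {sorted(missing)}")

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            return ""  # optional variables without a value expand to nothing
        # Percent-encode everything except unreserved characters.
        return quote(str(values[name]), safe="")

    return re.sub(r"\{(\w+)\}", substitute, template)


url = expand(TEMPLATE, {"lecturer_id": 113541}, required={"lecturer_id"})
```

The `required` set plays the role of `hydra:required true` in the mapping above: the expansion fails early instead of issuing a request the API cannot answer.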