Definition of language- and data-type maps. #145

chrdebru · 2024-11-21T16:27:28Z

Given the following XML for example:

<tasks>
  <task>
    <task_id>1</task_id>
    <description lang="en">Design Mockups</description>
    <description lang="fr">Concevoir des maquettes</description>
  </task>
  <task>
    <task_id>2</task_id>
    <description lang="en">Develop Frontend</description>
    <description lang="fr">Développer le frontend</description>
  </task>
  <task>
    <task_id>3</task_id>
    <description lang="en">Develop Backend</description>
    <description lang="fr">Développer le backend</description>
  </task>
</tasks>

Iterating over /tasks/task, I want to generate English and French labels from tasks.

  rml:predicateObjectMap [
      rml:predicate ex:description ;
      rml:objectMap [ rml:reference "description[@lang='en']" ; rml:language "en" ; ] ;
      rml:objectMap [ rml:reference "description[@lang='fr']" ; rml:language "fr" ; ] ;
  ] ;

This allows me to do that, but why "hard-code" the languages? The problem is that

  rml:predicateObjectMap [
      rml:predicate ex:description ;
      rml:objectMap [ rml:reference "description" ; rml:languageMap [ rml:reference "description/@lang" ] ] ;
  ] ;

leads to a Cartesian product of labels and languages. This respects the definition of language maps (and datatype maps, by extension): "Given the list of values resulting from a language-taggable term map T, and the list of values resulting from its language map L, the resulting terms are generated by the n-ary Cartesian product combination of T × L, where the values in T are the lexical forms, and the values in L are the non-empty language tags."
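
To make the difference concrete, here is a minimal Python sketch (not RML engine code) for the first task only. The function names cartesian_literals and paired_literals are made up for illustration; the first mimics the T × L combination quoted above, the second shows the pairing I would expect.

```python
import itertools
import xml.etree.ElementTree as ET

# First task from the example XML above.
XML = """<tasks>
  <task>
    <task_id>1</task_id>
    <description lang="en">Design Mockups</description>
    <description lang="fr">Concevoir des maquettes</description>
  </task>
</tasks>"""


def cartesian_literals(task):
    """Combine every lexical form with every language tag (T x L)."""
    # T: values of the reference "description"
    lexical_forms = [d.text for d in task.findall("description")]
    # L: values of the reference "description/@lang"
    language_tags = [d.get("lang") for d in task.findall("description")]
    return list(itertools.product(lexical_forms, language_tags))


def paired_literals(task):
    """Keep each description together with its own lang attribute."""
    return [(d.text, d.get("lang")) for d in task.findall("description")]


task = ET.fromstring(XML).find("task")
cartesian = cartesian_literals(task)  # 4 combinations, 2 of them unwanted
paired = paired_literals(task)        # 2 combinations, one per description

for text, lang in cartesian:
    print(f'"{text}"@{lang}')
```

The Cartesian variant also yields "Design Mockups"@fr and "Concevoir des maquettes"@en, which is exactly the unwanted result.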

Is this something we want? It seems to contradict a perfectly conceivable use case. If not, there is (IMHO) something wrong with the specification, and we likely need some iteration manipulation (as @frmichel once suggested). If so, then the spec should give a concrete example, perhaps with a note or two.


chrdebru · 2024-11-30T14:55:10Z

Unless I have missed something, I went through the spec and the example above respects the definition. @andimou What is your opinion on this?

pmaria · 2024-12-02T14:19:26Z

This is indeed a nice example that illustrates the issues (which already existed with templates as well) that arise when working with hierarchical data and trying to combine data elements while respecting the hierarchical context of the data source.

See also the description of this problem in the RML-LV spec https://github.com/kg-construct/rml-lv/blob/main/spec/section/problem.md#nested-data-structures

So, one way to solve this problem would be to use logical views.

I am open to discussing other ways to solve this in core.

chrdebru · 2024-12-02T14:53:42Z

Well, to me, LV solves the problem by "eliminating" the inconveniences of hierarchical docs and multi-valued expression maps by creating the logical equivalent of rows where the "scope" of multi-valued expression maps is nicely defined. My question can be rephrased as follows: do we recognize the proposed definition and its implications (in corner cases) and explicitly acknowledge and document such implications? In other words: do not solve it in core, since it can be handled elsewhere.
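
As a rough illustration of what I mean by "the logical equivalent of rows", here is a small Python sketch. It is not RML-LV syntax and not how any engine implements logical views; flatten_to_rows and literals_per_row are hypothetical helpers, and the column names are my own choice.

```python
import xml.etree.ElementTree as ET

# First two tasks from the example XML above.
XML = """<tasks>
  <task>
    <task_id>1</task_id>
    <description lang="en">Design Mockups</description>
    <description lang="fr">Concevoir des maquettes</description>
  </task>
  <task>
    <task_id>2</task_id>
    <description lang="en">Develop Frontend</description>
    <description lang="fr">Développer le frontend</description>
  </task>
</tasks>"""


def flatten_to_rows(root):
    """Produce one row per (task, description) pair, like a flat table."""
    rows = []
    for task in root.findall("task"):
        task_id = task.findtext("task_id")
        for d in task.findall("description"):
            rows.append(
                {"task_id": task_id, "description": d.text, "lang": d.get("lang")}
            )
    return rows


def literals_per_row(rows):
    """Each row holds a single value per column, so no product can arise."""
    return [(r["task_id"], f'"{r["description"]}"@{r["lang"]}') for r in rows]


rows = flatten_to_rows(ET.fromstring(XML))
literals = literals_per_row(rows)

for task_id, literal in literals:
    print(task_id, literal)
```

Once every row carries exactly one description and one lang, the term map and its language map are both single-valued per iteration, and the T × L definition gives the expected result.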
