add logical iterable and describe natural mappings #144
Open: pmaria wants to merge 6 commits into main from definitions
Changes from 4 commits
Commits (6):
e5e9f08 define expression evaluation result (pmaria)
41f84a0 clean up term map section (pmaria)
60d8079 introduce logical iterable (pmaria)
81efbbf describe natural rdf mappings (pmaria)
fedc99e Update spec/docs/datatypeConversion.md (pmaria)
ed8e40e make multiplicity of reference formulation property 0 or 1 (pmaria)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,150 @@ | ||
# Datatype conversions | ||
|
||
For each [=reference formulation=] there may be a set of defined <dfn data-lt="natural mapping">natural RDF mappings</dfn> that are applied to the [=expression evaluation results=] on the [=data source=]. These [=natural mappings=] are defined in the [[RML-IO-Registry]] and are used to convert the values of the [=expression evaluation result=] to the appropriate [=natural RDF literal=] corresponding with the [=reference formulation=]. | ||
|
||
## Natural mapping of source values | ||
|
||
The <dfn>natural RDF literal</dfn> is a [=literal=] that is the result of applying a [=natural mapping=] on a value of a [=data source=], which produces a [=literal=] that is the most appropriate representation of the value in RDF. The [=natural RDF literal=] has a [=natural RDF lexical form=]. | ||
|
||
The <dfn>natural RDF lexical form</dfn> produces only the [=lexical form=] of the [=literal=] and recommends that implementations SHOULD apply the [=XSD canonical mapping=], making it a [=canonical RDF lexical form=]. It is used in RML when non-string [=expression evaluation results=] are used in a string context, for example when a timestamp is used in a [=template-valued term map=] with [=term type=] [=IRI=]. | ||
|
||
The <dfn>canonical RDF lexical form</dfn> produces only the [=lexical form=] of the [=literal=] and requires that the [=XSD canonical mapping=] MUST be applied. | ||
|
||
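The string-context case mentioned above (a timestamp used in an IRI template) can be illustrated with a minimal Python sketch. It is not part of the specification: the helper names `datetime_lexical_form` and `expand_iri_template` are hypothetical, a Python `datetime` stands in for the source value, and percent-encoding of the inserted value is an assumption borrowed from R2RML-style IRI-safe conversion.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def datetime_lexical_form(value: datetime) -> str:
    """Hypothetical helper: render a datetime as an xsd:dateTime lexical form.

    Trailing zeros in the fractional seconds are dropped and a zero
    offset is written as "Z", matching the canonical forms shown in the
    table of lexical forms.
    """
    text = (f"{value.year:04d}-{value.month:02d}-{value.day:02d}"
            f"T{value.hour:02d}:{value.minute:02d}:{value.second:02d}")
    if value.microsecond:
        text += "." + f"{value.microsecond:06d}".rstrip("0")
    offset = value.utcoffset()
    if offset is not None:
        minutes = int(offset.total_seconds() // 60)
        if minutes == 0:
            text += "Z"
        else:
            sign = "-" if minutes < 0 else "+"
            text += f"{sign}{abs(minutes) // 60:02d}:{abs(minutes) % 60:02d}"
    return text


def expand_iri_template(template: str, **values: datetime) -> str:
    """Hypothetical helper: insert lexical forms into an IRI template.

    Assumption: inserted values are percent-encoded before insertion.
    """
    encoded = {name: quote(datetime_lexical_form(v), safe="")
               for name, v in values.items()}
    return template.format(**encoded)


iri = expand_iri_template(
    "http://example.com/event/{ts}",
    ts=datetime(2011, 8, 23, 22, 17, tzinfo=timezone.utc),
)
```

Here the timestamp is first turned into its lexical form and only then placed in the template, so the generated IRI depends on which lexical form (natural or canonical) an implementation produces.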
<dfn>Cast to string</dfn> is an implementation-dependent function that maps values from [=expression evaluation results=] to equivalent Unicode strings. The specifics of [=cast to string=] per [=reference formulation=] are defined in the [[RML-IO-Registry]]. | ||
|
||
Additionally, the [=natural mapping=] determines the [=natural RDF datatype=] of the [=literal=]. | ||
|
||
The <dfn>natural RDF datatype</dfn> is the [=datatype=] corresponding to the [=natural RDF literal=] that is the result of the [=natural mapping=]. The [=natural RDF datatype=] is an [=IRI=] that represents the [=datatype=] of the value in RDF. | ||
|
||
## Datatype-override mapping of source values | ||
|
||
The <dfn>datatype-override RDF literal</dfn> corresponding to an [=expression evaluation result=] value `v` and a [=datatype IRI=] `dt`, is a [=literal=] whose [=lexical form=] is the [=natural RDF lexical form=] corresponding to `v`, and whose [=datatype IRI=] is `dt`. If the [=literal=] is [=ill-typed=], then a [=data error=] is raised. | ||
|
||
A [=literal=] is <dfn data-lt="ill-typed literal">ill-typed</dfn> in RML if its [=datatype IRI=] denotes a [=validatable RDF datatype=] and its [=lexical form=] is not in the [=lexical space=] of the [=RDF datatype=] identified by its [=datatype IRI=]. | ||
|
||
The set of <dfn>validatable RDF datatypes</dfn> includes all [=datatypes=] in the RDF datatype column of [[[#table-lexical-forms]]], as defined in [[XMLSCHEMA11-2]]. This set MAY include implementation-defined additional RDF datatypes. | ||
|
||
For example, `"X"^^xsd:boolean` is [=ill-typed=] because `xsd:boolean` is a validatable [=RDF datatype=] in RML, and `"X"` is not in the [=lexical space=] of `xsd:boolean` [[XMLSCHEMA11-2]]. | ||
|
||
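As a rough illustration of the two definitions above, the following Python sketch builds a datatype-override literal and raises a data error when the result is ill-typed. The names `Literal`, `DataError`, `LEXICAL_SPACE`, `is_ill_typed` and `datatype_override_literal` are hypothetical, and only three validatable datatypes are modelled, using regular expressions that approximate their lexical spaces in [[XMLSCHEMA11-2]].

```python
import re
from dataclasses import dataclass

XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative subset of validatable RDF datatypes, each with a pattern
# describing its lexical space. A real implementation would cover more.
LEXICAL_SPACE = {
    XSD + "boolean": re.compile(r"true|false|1|0"),
    XSD + "integer": re.compile(r"[+-]?[0-9]+"),
    XSD + "decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
}


class DataError(ValueError):
    """Hypothetical error type standing in for an RML data error."""


@dataclass(frozen=True)
class Literal:
    lexical_form: str
    datatype: str


def is_ill_typed(literal: Literal) -> bool:
    """True only if the datatype is validatable and the form does not match."""
    pattern = LEXICAL_SPACE.get(literal.datatype)
    return pattern is not None and pattern.fullmatch(literal.lexical_form) is None


def datatype_override_literal(natural_lexical_form: str, dt: str) -> Literal:
    """Combine the natural lexical form of a value with the datatype IRI dt."""
    literal = Literal(natural_lexical_form, dt)
    if is_ill_typed(literal):
        raise DataError(f"{natural_lexical_form!r} is not a valid {dt}")
    return literal
```

Note that a literal whose datatype is not in the validatable set is never reported as ill-typed in this sketch, which mirrors the definition: validation only applies to validatable RDF datatypes.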
<section class="informative"> | ||
<h2>Summary of XSD Lexical Forms</h2> | ||
|
||
The [=natural mappings=] make reference to various [=XSD datatypes=] and require that values from [=expression evaluation results=] be converted to strings that are appropriate as [=lexical forms=] for these [=datatypes=]. This subsection gives examples of these [=lexical forms=] in order to aid implementers of the mappings. This subsection is non-normative; the normative definitions of the [=lexical spaces=] as well as the [=canonical mappings=] are found in [[XMLSCHEMA11-2]]. | ||
|
||
A general approach that may be used for implementing the natural mappings is as follows: | ||
|
||
1. Identify the source datatype of the value of the [=expression evaluation result=] on the [=data source=]. | ||
1. Look up its corresponding [=natural RDF datatype=] for the [=reference formulation=] in the [[RML-IO-Registry]]. | ||
1. Apply [=cast to string=] to the value. | ||
1. Ensure that the resulting string is in the [=lexical space=] of the target [=RDF datatype=]; that is, it must be in a form such as those listed in either column of [[[#table-lexical-forms]]] below. This may require some transformations of the string, in particular for `xsd:hexBinary`, `xsd:dateTime` and `xsd:boolean`. | ||
1. If the goal is to obtain a [=canonical RDF lexical form=], then further string transformations may be required to obtain a form such as those listed in the Canonical lexical forms column of [[[#table-lexical-forms]]] below. | ||
|
||
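The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the `NATURAL_DATATYPE` table is a made-up excerpt for SQL-like source types (the real mappings live in the registry), all function names are hypothetical, and only the transformations mentioned in the steps for `xsd:hexBinary`, `xsd:dateTime`, `xsd:boolean` and `xsd:integer` are handled.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Step 2 data (assumption): hypothetical excerpt of a natural-datatype
# lookup table for SQL-like source types.
NATURAL_DATATYPE = {
    "BOOLEAN": XSD + "boolean",
    "VARBINARY": XSD + "hexBinary",
    "TIMESTAMP": XSD + "dateTime",
    "INTEGER": XSD + "integer",
}


def cast_to_string(value) -> str:
    """Step 3: implementation-dependent cast; bytes become hex digits."""
    if isinstance(value, bytes):
        return value.hex()
    return str(value)


def to_lexical_space(text: str, datatype: str) -> str:
    """Step 4: adjust the string so it is a valid lexical form."""
    if datatype == XSD + "dateTime":
        return text.replace(" ", "T", 1)  # SQL uses a space separator
    if datatype == XSD + "boolean":
        return text.lower()  # e.g. Python's "True" becomes "true"
    return text


def to_canonical(text: str, datatype: str) -> str:
    """Step 5: further transformations towards the canonical form.

    Only a few datatypes are covered; xsd:dateTime is left untouched here.
    """
    if datatype == XSD + "hexBinary":
        return text.upper()
    if datatype == XSD + "boolean":
        return {"1": "true", "0": "false"}.get(text, text)
    if datatype == XSD + "integer":
        return str(int(text))
    return text


def natural_literal(source_type: str, value, canonical: bool = True):
    """Steps 1 to 5; the caller supplies the identified source type."""
    datatype = NATURAL_DATATYPE[source_type]
    text = to_lexical_space(cast_to_string(value), datatype)
    if canonical:
        text = to_canonical(text, datatype)
    return text, datatype
```

For example, `natural_literal("VARBINARY", b"R2RML")` yields the canonical `xsd:hexBinary` form listed in the table below, while passing `canonical=False` keeps the lowercase hexadecimal digits.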
<table class="numbered" id="table-lexical-forms"> | ||
<caption>Table of canonical and non-canonical lexical forms for some XSD datatypes</caption> | ||
<tbody> | ||
<tr> | ||
<th>RDF datatype</th> | ||
<th>Non-canonical lexical forms</th> | ||
<th>Canonical lexical forms</th> | ||
<th>Comments</th> | ||
</tr> | ||
<tr> | ||
<td><code><a href="https://www.w3.org/TR/xmlschema11-2/#hexBinary">xsd:hexBinary</a></code></td> | ||
<td><code>5232524d4c</code></td> | ||
<td><code>5232524D4C</code></td> | ||
<td>Convert from SQL by applying <a href="https://www.w3.org/TR/xmlschema11-2/#hexBinary"><code>xsd:hexBinary</code> lexical mapping</a>.</td> | ||
</tr> | ||
<tr> | ||
<td rowspan="4"><code><a href="https://www.w3.org/TR/xmlschema11-2/#decimal">xsd:decimal</a></code></td> | ||
<td><code>.224</code></td> | ||
<td><code>0.224</code></td> | ||
<td rowspan="4"></td> | ||
</tr> | ||
<tr> | ||
<td><code>+001</code></td> | ||
<td><code>1</code></td> | ||
</tr> | ||
<tr> | ||
<td><code>42.0</code></td> | ||
<td><code>42</code></td> | ||
</tr> | ||
<tr> | ||
<td><code>-5.9000</code></td> | ||
<td><code>-5.9</code></td> | ||
</tr> | ||
<tr> | ||
<td rowspan="3"><code><a href="https://www.w3.org/TR/xmlschema11-2/#integer">xsd:integer</a></code></td> | ||
<td><code>-05</code></td> | ||
<td><code>-5</code></td> | ||
<td rowspan="3"></td> | ||
</tr> | ||
<tr> | ||
<td><code>+333</code></td> | ||
<td><code>333</code></td> | ||
</tr> | ||
<tr> | ||
<td><code>00</code></td> | ||
<td><code>0</code></td> | ||
</tr> | ||
<tr> | ||
<td rowspan="5"><code><a href="https://www.w3.org/TR/xmlschema11-2/#double">xsd:double</a></code></td> | ||
<td><code>-5.90</code></td> | ||
<td><code>-5.9E0</code></td> | ||
<td rowspan="5">Also supports <code>INF</code>, <code>-INF</code>, <code>NaN</code> and <code>-0.0E0</code>,<br>but these do not appear in standard SQL.</td> | ||
</tr> | ||
<tr> | ||
<td><code>+0.00014770215000</code></td> | ||
<td><code>1.4770215E-4</code></td> | ||
</tr> | ||
<tr> | ||
<td><code>+01E+3</code></td> | ||
<td><code>1.0E3</code></td> | ||
</tr> | ||
<tr> | ||
<td><code>100.0</code></td> | ||
<td><code>1.0E2</code></td> | ||
</tr> | ||
<tr> | ||
<td><code>0</code></td> | ||
<td><code>0.0E0</code></td> | ||
</tr> | ||
<tr> | ||
<td rowspan="2"><code><a href="https://www.w3.org/TR/xmlschema11-2/#boolean">xsd:boolean</a></code></td> | ||
<td><code>1</code></td> | ||
<td><code>true</code></td> | ||
<td rowspan="2">Must be lowercase.</td> | ||
</tr> | ||
<tr> | ||
<td><code>0</code></td> | ||
<td><code>false</code></td> | ||
</tr> | ||
<tr> | ||
<td><code><a href="https://www.w3.org/TR/xmlschema11-2/#date">xsd:date</a></code></td> | ||
<td></td> | ||
<td><code>2011-08-23</code></td> | ||
<td>Dates in SQL don't have timezone offsets.<br>They are optional in XSD.</td> | ||
</tr> | ||
<tr> | ||
<td rowspan="3"><code><a href="https://www.w3.org/TR/xmlschema11-2/#time">xsd:time</a></code></td> | ||
<td><code>22:17:34.885+00:00</code></td> | ||
<td><code>22:17:34.885Z</code></td> | ||
<td rowspan="3">May or may not have timezone offset.</td> | ||
</tr> | ||
<tr> | ||
<td><code>22:17:34.000</code></td> | ||
<td><code>22:17:34</code></td> | ||
</tr> | ||
<tr> | ||
<td><code>22:17:34.1+01:00</code></td> | ||
<td><code>22:17:34.1+01:00</code></td> | ||
</tr> | ||
<tr> | ||
<td><code><a href="https://www.w3.org/TR/xmlschema11-2/#dateTime">xsd:dateTime</a></code></td> | ||
<td><code>2011-08-23T22:17:00.000+00:00</code></td> | ||
<td><code>2011-08-23T22:17:00Z</code></td> | ||
<td>May or may not have timezone offset.<br>Convert from SQL by replacing space with "<code>T</code>".</td> | ||
|
||
</tr> | ||
</tbody> | ||
</table> | ||
|
||
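To make the numeric rows of the table concrete, the following Python sketch reproduces the canonical forms shown for `xsd:decimal` and `xsd:double`. It is an informal illustration only: the function names are hypothetical, the inputs are assumed to already be in the lexical space of the datatype, and the normative canonical mappings remain those of [[XMLSCHEMA11-2]].

```python
import math
from decimal import Decimal


def canonical_decimal(lexical: str) -> str:
    """Drop the sign "+", leading zeros and trailing fractional zeros."""
    text = format(Decimal(lexical), "f")  # plain notation, no exponent
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return "0" if text == "-0" else text


def canonical_double(lexical: str) -> str:
    """Render as mantissa "E" exponent with one digit before the point."""
    value = float(lexical)
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "INF" if value > 0 else "-INF"
    # repr() gives the shortest round-tripping digits; Decimal exposes them.
    sign, digits, exponent = Decimal(repr(value)).normalize().as_tuple()
    fraction = "".join(str(d) for d in digits[1:]) or "0"
    prefix = "-" if sign else ""
    return f"{prefix}{digits[0]}.{fraction}E{exponent + len(digits) - 1}"
```

Going through `repr()` and `Decimal` avoids hand-parsing the many accepted spellings of a double (leading `+`, padded zeros, optional exponent), at the cost of relying on Python's float formatting.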
</section> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,15 +1,17 @@ | ||
# Defining Logical Sources | ||
# Defining Logical Iterables and Logical Sources | ||
|
||
A <dfn>logical source</dfn> is an abstract construct to describe data access and iteration for a [=data source=] such that it can be mapped to [=RDF triples=]. | ||
A <dfn>logical iterable</dfn> is an abstract construct to describe data access and iteration for a [=data source=]. | ||
|
||
A [=logical source=] (`rml:LogicalSource`) MUST have: | ||
A [=logical iterable=] (`rml:LogicalIterable`) MUST have: | ||
* exactly one `rml:referenceFormulation` property, whose value is a <dfn>reference formulation</dfn> which defines how the underlying [=data source=] is to be accessed, and which [=expressions=] can be evaluated on [=logical iterations=], | ||
* zero or one `rml:iterator` property, whose value is a <dfn data-lt="iterator">logical iterator</dfn> that defines a sequence of [=logical iterations=] on the [=data source=]. If no [=iterator=] is provided, a <dfn class="lint-ignore">default iterator</dfn> MUST be associated with the [=reference formulation=]. | ||
|
||
|
||
A <dfn data-lt="iteration">logical iteration</dfn> is an item in the sequence produced by the [=logical source=], on which [=expressions=] can be evaluated. | ||
A <dfn data-lt="iteration">logical iteration</dfn> is an item in the sequence produced by the [=logical iterable=], on which [=expressions=] can be evaluated. | ||
|
||
A <dfn>data source</dfn> is an abstract concept that represents a source of data that can be accessed via a [=logical source=]. A [=data source=] can be a file, a database, a web service, or any other source of data. | ||
A <dfn>data source</dfn> is an abstract concept that represents a source of data that can be accessed via a [=logical iterable=]. A [=data source=] can be a file, a database, a web service, or any other source of data, depending on the type of [=logical iterable=]. | ||
|
||
<aside class="note"> | ||
There can be many different types of [=reference formulation=]. The known types, and the details of how a reference formulation is handled and implemented for each data format, are specified in [[RML-IO-Registry]]. | ||
</aside> | ||
|
||
A <dfn>logical source</dfn> (`rml:LogicalSource`) is a subclass of [=logical iterable=] that can be associated with a [=triples map=] such that a [=data source=] can be mapped to [=RDF triples=]. |
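The relationship between the constructs defined in this file can be sketched as a small Python model. This is only an illustration of the wording in this diff: the class and attribute names mirror the RML terms but are otherwise hypothetical, and the entries in `DEFAULT_ITERATORS` are assumed placeholder values, since the actual default iterator of each reference formulation is defined in the registry.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: placeholder default iterators per reference formulation.
DEFAULT_ITERATORS = {
    "rml:JSONPath": "$",
    "rml:XPath": "/",
}


@dataclass(frozen=True)
class LogicalIterable:
    """Data access and iteration for a data source."""
    reference_formulation: str       # rml:referenceFormulation
    iterator: Optional[str] = None   # rml:iterator, zero or one

    def effective_iterator(self) -> str:
        """Use the explicit iterator, else the formulation's default."""
        if self.iterator is not None:
            return self.iterator
        try:
            return DEFAULT_ITERATORS[self.reference_formulation]
        except KeyError:
            raise ValueError(
                f"no default iterator known for {self.reference_formulation}"
            ) from None


@dataclass(frozen=True)
class LogicalSource(LogicalIterable):
    """A logical iterable that can be associated with a triples map."""


source = LogicalSource(reference_formulation="rml:JSONPath")
```

Modelling `LogicalSource` as a subclass keeps the iteration behaviour in one place, which is the point of splitting out the logical iterable in this change.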
Why does it recommend to apply the XSD canonical mapping?
This follows how it is defined in R2RML. Something to discuss in the next meeting perhaps?
Yeah, I have the feeling the phrasing is a bit off, I'd suggest "The natural RDF lexical form is the [=lexical form=] of the [=literal=]. Implementations SHOULD apply the [=XSD canonical mapping=], making it a [=canonical RDF lexical form=]."