improving handling of default reference for text offset #29

Open
kosloot opened this issue Jan 12, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@kosloot
Collaborator

kosloot commented Jan 12, 2022

When a text content element has an offset without an explicit reference, the offset is by definition relative to the text content of the nearest structure parent. In general this is fine, but some structure elements MAY NOT carry text,
notably <table> and <row>, and possibly more.
I suggest extending the search for a suitable parent to the first structure parent that is allowed to carry text.

This is a simple addition that I have already implemented in libfolia.
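
The proposed lookup could be sketched roughly as follows. This is only an illustration, not the libfolia or foliapy code: the `Node` class, the `NO_TEXT_ALLOWED` set and the `default_text_reference` function are hypothetical names, and the set contains only the two element types named above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: element types that may not carry text. Only <table> and <row>
# are named in this issue; a real implementation would derive the list from
# the FoLiA specification instead of hard-coding it.
NO_TEXT_ALLOWED = {"table", "row"}


@dataclass
class Node:
    """Hypothetical stand-in for a FoLiA structure element."""
    tag: str
    id: str
    parent: Optional["Node"] = None


def default_text_reference(element: Node) -> Optional[Node]:
    """Return the nearest structure ancestor that is allowed to carry text."""
    ancestor = element.parent
    # Skip ancestors that may not carry text instead of stopping at them.
    while ancestor is not None and ancestor.tag in NO_TEXT_ALLOWED:
        ancestor = ancestor.parent
    return ancestor


# A div > table > row > cell chain, using the IDs from the sample document.
div = Node("div", "tabel.text.div.1")
table = Node("table", "tabel.", parent=div)
row = Node("row", "tabel.row.1", parent=table)
cell = Node("cell", "tabel.row.1.cell.1", parent=row)

reference = default_text_reference(cell)
print(reference.id)  # tabel.text.div.1
```

With this rule the cell's default reference is the enclosing div rather than the row, which is the element that actually holds text.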
Sample FoLiA to demonstrate the problem:

<?xml version="1.0" encoding="UTF-8"?>
<FoLiA xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://ilk.uvt.nl/folia" xml:id="tabel" generator="libfolia-v2.10" version="2.5.1">
  <metadata type="native">
    <annotations>
      <paragraph-annotation/>
      <division-annotation />
      <string-annotation/>
      <table-annotation/>
      <text-annotation set="https://raw.githubusercontent.com/proycon/folia/master/setdefinitions/text.foliaset.ttl"/>
    </annotations>
  </metadata>
  <text xml:id="tabel.text">
    <div xml:id="tabel.text.div.1">
      <t>rij 1 veld 1</t>
      <table xml:id="tabel.">
        <row xml:id="tabel.row.1">
          <cell xml:id="tabel.row.1.cell.1">
            <t offset="0">rij 1 veld 1</t>
          </cell>
        </row>
      </table>
    </div>
  </text>
</FoLiA>

The most recent folialint from libfolia accepts this document.

The current foliavalidator, however, reports:

EXT VALIDATION ERROR: Text for Cell, ID tabel.row.1.cell.1, textclass current, has incorrect offset 0 or invalid reference: Reference (ID tabel.row.1) has no such text (class=current)
(also checked against older rules prior to FoLiA v2.4.1)
VALIDATION ERROR on full parse by library (stage 2/3), in cell-offset-bug.xml
UnresolvableTextContent: Reference (ID tabel.row.1) has no such text (class=current)
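
The difference between the two outcomes can be reproduced on a trimmed copy of the sample using only the Python standard library. This is a rough sketch under assumptions, not what either validator does internally: it ignores text classes, looks only at a direct `<t>` child of the reference, and `NO_TEXT_ALLOWED`, `local` and `resolve_reference` are hypothetical names.

```python
import xml.etree.ElementTree as ET

NS = "{http://ilk.uvt.nl/folia}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
NO_TEXT_ALLOWED = {"table", "row"}  # assumption: only the types named in the issue

SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia">
  <text>
    <div xml:id="tabel.text.div.1">
      <t>rij 1 veld 1</t>
      <table xml:id="tabel.">
        <row xml:id="tabel.row.1">
          <cell xml:id="tabel.row.1.cell.1">
            <t offset="0">rij 1 veld 1</t>
          </cell>
        </row>
      </table>
    </div>
  </text>
</FoLiA>"""

root = ET.fromstring(SAMPLE)
# ElementTree has no parent pointers, so build a child -> parent map first.
parent_of = {child: parent for parent in root.iter() for child in parent}


def local(elem):
    """Element name without the FoLiA namespace."""
    return elem.tag.replace(NS, "")


def resolve_reference(elem):
    """Walk up until an ancestor is found that is allowed to carry text."""
    ref = parent_of.get(elem)
    while ref is not None and local(ref) in NO_TEXT_ALLOWED:
        ref = parent_of.get(ref)
    return ref


cell = next(root.iter(NS + "cell"))
content = cell.find(NS + "t")
offset = int(content.get("offset"))

# Nearest structure parent: the <row>, which has no text of its own.
nearest = parent_of[cell]
print(local(nearest), nearest.find(NS + "t"))  # row None

# Extended search: skip <row> and <table> and end up at the <div>.
ref = resolve_reference(cell)
ref_text = ref.find(NS + "t").text
ok = ref_text[offset:offset + len(content.text)] == content.text
print(ref.get(XML_ID), ok)  # tabel.text.div.1 True
```

Stopping at the row leaves nothing to check the offset against, which matches the "has no such text" error above, while the extended search finds the div's text and offset 0 lines up.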
@proycon
Owner

proycon commented Mar 26, 2024

I think that's a good solution for these edge cases, yes.

Moving this issue to foliapy.

@proycon proycon transferred this issue from proycon/foliatools Mar 26, 2024