How to avoid the schema error: element "bibl" incomplete; missing required element "ptr" #256

Open
danbalogh opened this issue Jan 9, 2024 · 9 comments

@danbalogh
Collaborator

The schema now raises an error when a <bibl> element has no <ptr> child. This is all right, but it creates a problem that I think is not unique to my subcorpus. Sometimes, there exists no secondary bibliography and/or no primary bibliography for an inscription (or none is encoded yet), but I would prefer to keep the skeleton of that section in the XML file, in case it can be populated later. My solution so far has been to use

  <listBibl type="primary">
    <bibl/>
  </listBibl>

where the empty <bibl/> element is necessary because without it, the earlier schema also raised an error. But now the above is also flagged as an error. I could imagine the following solutions, but I don't know which if any are most feasible:

  • adding an exception to the schema rule: if empty <bibl/> (or empty <bibl n="siglum"/>, see below) is the only child of a <listBibl>, then the absence of <ptr> is not an error (most convenient for me, since I can keep what I have);
  • creating a special bibliography item (e.g. #bib:NONE) that would mean "no known bibliography" and, instead of <bibl/>, encoding e.g. <bibl><ptr target="bib:NONE"/></bibl> (I would then replace my empty bibl items with this);
  • keeping the schema as is, in which case I would remove the offending empty bibl elements together with their <listBibl> container.
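
To make the first option concrete, here is a minimal Python sketch of the exception I have in mind. It is not the schema and not any existing DHARMA tool; the function names are hypothetical, and it only assumes the element names used in the snippet above.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix so the check works with or without the TEI namespace."""
    return tag.rsplit("}", 1)[-1]


def is_placeholder(bibl):
    """True for <bibl/> or <bibl n="..."/>: no children, no text, no attribute other than @n."""
    has_text = bool((bibl.text or "").strip())
    return len(bibl) == 0 and not has_text and set(bibl.attrib) <= {"n"}


def bibl_errors(root):
    """Report <bibl> elements without <ptr>, except a lone placeholder in its <listBibl>."""
    errors = []
    for list_bibl in root.iter():
        if local_name(list_bibl.tag) != "listBibl":
            continue
        for bibl in list_bibl:
            if local_name(bibl.tag) != "bibl":
                continue
            if any(local_name(child.tag) == "ptr" for child in bibl):
                continue
            if len(list_bibl) == 1 and is_placeholder(bibl):
                continue  # proposed exception: sole, empty child of <listBibl>
            errors.append(f'listBibl type="{list_bibl.get("type")}": bibl without ptr')
    return errors
```

With this logic, the snippet above would pass, while an empty <bibl/> sitting next to real entries would still be reported.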

What do you think, @michaelnmmeyer ?

I should add that I've just checked our inscription templates, and the use of empty <bibl/> (or <bibl n="siglum"/> in case of the primary bibliography) is present there too, so at the moment, even our template is in conflict with the schema.

@danbalogh danbalogh added the invalid This doesn't seem right label Jan 9, 2024
@arlogriffiths
Collaborator

I don't think this is new. For years I have been annoyed at having to remove or comment out the <div type="bibliography"> altogether when I do not have any references to encode or want to postpone that part of the encoding work.

@danbalogh
Collaborator Author

You may have had something different? The absence of <bibl/> has long been noted as an error, but I'm sure I've had no schema complaints for the snippet cited above, until recently. Also, the last time I looked through Michael's list of files with encoding errors, I corrected all errors then flagged in my files - and now there are dozens of my files shown as having errors, because of this.
But anyway, whether old or new, we need a way to keep empty bibliographies in the file. Commenting out is also acceptable to me, but I'd like us to agree on the "proper" way.

@michaelnmmeyer
Member

For cases like that, I am in favor of the "comment things out" option.

The main problem is that TEI grammars do not allow you to express context-dependent rules. You cannot say, for instance, that an element must have an attribute X in some context and an attribute Y in another. You need to allow both attributes and add extra code later on to sort things out.

This is sometimes inevitable, but it is better avoided whenever possible. The more you do it, the more your schema looks like code, the less "declarative" (viz. static, inert, unlike a program) it becomes. This makes it harder to reason about, and this makes the contextual help generated by Oxygen (and the TEI documentation, see e.g. https://dharman.in/documentation/inscription) less useful.

@arlogriffiths All modern editors have keyboard shortcuts to comment/uncomment things. In Oxygen, you have the command "Toggle comment", bound to Ctrl+Shift+Comma by default.

@danbalogh
Collaborator Author

I'm not entirely happy with that. We already have contextual rules for the bibliography, e.g. it seems that @n is mandatory in the primary bibliography but not in the secondary. If adding more contextual rules would be too much of a complication, then we should investigate other solutions, e.g. the introduction of a dummy bibliography pointer (which I think has been raised before in a different context, but I cannot recall what). The thing is, I don't like the idea of having to use a template that is in conflict with the schema right from the start. I also don't like the situation where I think nearly half of the "errors" now flagged on https://dharman.in/texts are instances of "element "bibl" incomplete", simply because the encoders who created those files didn't comment out parts of the now-erroneous template. And finally, what if in spite of all these misgivings today I comment out the secondary bibliography as a whole, and then after another improvement next week the schema starts raising an error because the secondary bibliography's presence is now mandatory?
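
For illustration only: a contextual rule like the one mentioned above (@n required in the primary bibliography, optional in the secondary) can also be checked outside the grammar, in the "extra code" style. This is a minimal Python sketch that assumes the rule is as I described it; `missing_sigla` and `local_name` are hypothetical names, not part of the schema or of any DHARMA tool.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix so the check works with or without the TEI namespace."""
    return tag.rsplit("}", 1)[-1]


def missing_sigla(root):
    """Return the 0-based positions of <bibl> entries in primary lists that lack @n."""
    missing = []
    for list_bibl in root.iter():
        if local_name(list_bibl.tag) != "listBibl":
            continue
        if list_bibl.get("type") != "primary":
            continue  # assumed rule: @n is only mandatory in the primary bibliography
        bibls = [child for child in list_bibl if local_name(child.tag) == "bibl"]
        for index, bibl in enumerate(bibls):
            if not bibl.get("n"):
                missing.append(index)
    return missing
```

The point of the sketch is only that the grammar can stay permissive (allow @n everywhere) while a separate pass enforces the context.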

@michaelnmmeyer
Member

In this case, it is best to allow <bibl> to be empty. A lot of files are using a "John Doe" bibliography entry as a placeholder, by the way. See e.g. https://dharman.in/display/DHARMA_INSSiddham00101.

@danbalogh
Collaborator Author

Thanks for spotting this. This must have been what my foggy memory of an earlier occurrence of the dummy bibliography pointer idea was about. Is it used anywhere outside the siddham corpus? Apparently, there I decided to refer to the Zotero ID AuthorYear_01 when the schema showed an error unless a reference was present.

So, @arlogriffiths and @michaelnmmeyer , shall we make this official, mention it in the guides, revise the inscription templates (and other templates as the case may be) accordingly, and make the change (automated as far as possible) in existing XML files?

The empty bibliography in the Siddham files looks like this:

<div type="bibliography">
  <p/>
  <listBibl type="primary">
    <bibl n="siglum"><ptr target="bib:AuthorYear_01"/></bibl>
  </listBibl>
  <listBibl type="secondary">
    <bibl n="siglum"><ptr target="bib:AuthorYear_01"/></bibl>
  </listBibl>
</div>

In the template, instructions could be added in comment to replace the empty p with the epigraphic lemma and replace the dummy bibl elements in the structured bibliographies with the actual citations relevant to the inscription.

Or, if we don't want to go this way, then the question still remains: should the schema allow empty <bibl> (and if so, everywhere or only in this specific context?), or shall we comment out all empty bibliographies in existing XML files AND in the template(s)?

@michaelnmmeyer
Member

The "John Doe" entries are used in various files, not only siddham.

I just allowed empty <bibl> in the schema. I will skip over them in the processing code, as well as over "John Doe" entries.
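
Roughly, the skipping could look like the following Python sketch. This is not the actual processing code, just an illustration under assumptions: the function names are hypothetical, and the placeholder is assumed to be identified by the target bib:AuthorYear_01 mentioned earlier in this thread.

```python
import xml.etree.ElementTree as ET

# The "John Doe" Zotero item referred to in this thread.
PLACEHOLDER_TARGET = "bib:AuthorYear_01"


def local_name(tag):
    """Strip a '{namespace}' prefix so the code works with or without the TEI namespace."""
    return tag.rsplit("}", 1)[-1]


def displayable_targets(list_bibl):
    """Collect the ptr targets worth displaying from one <listBibl>."""
    targets = []
    for bibl in list_bibl:
        if local_name(bibl.tag) != "bibl":
            continue
        ptr = next((c for c in bibl if local_name(c.tag) == "ptr"), None)
        if ptr is None:
            continue  # empty <bibl/>: valid in the schema, nothing to show
        target = ptr.get("target")
        if not target or target == PLACEHOLDER_TARGET:
            continue  # missing target or "John Doe" placeholder
        targets.append(target)
    return targets
```

A list that contains only empty or placeholder entries then yields an empty result, so the display can simply omit that bibliography section.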

@danbalogh
Collaborator Author

So to be clear, am I right that this means the following:

  • the templates, guides and existing files with empty bibl can (for now at least) remain as they are;
  • the Zotero item bib:AuthorYear_01 (for John Doe and Dharmaputra Devadatta), as well as references to this in the files, can stay as they are, but we needn't encourage the use of this in any guide;
  • if a bibliographic citation references AuthorYear_01, this will not be shown in display.

If so, this sounds good to me; let's see what @arlogriffiths says.

@michaelnmmeyer
Member

@danbalogh Yes, exactly.
