Schemas do not work with IBM DFDL #5

Open
smhdfdl opened this issue May 16, 2017 · 1 comment

smhdfdl commented May 16, 2017

See PR #4

smhdfdl commented May 16, 2017

  1. IBM DFDL validates the DFDL schema before parsing, and it found some missing properties. The DFDL specification says that if a schema object needs a property then you must provide a value; there are no built-in defaults. The missing properties are:
  • textBidi="no"
  • floating="no"
  • encodingErrorPolicy="error"
    Daffodil is presumably not reporting their absence as an error, which strictly speaking is non-compliance with the DFDL spec.
    If you add them to your dfdl:defineFormat, the errors go away.
  2. When parsing the test file, IBM DFDL reports an error when it encounters text = "" lines. I think this is due to a non-compliance by IBM DFDL. An erratum changed the DFDL spec so that a required (minOccurs="1") element with type xs:string and a value of "" (empty string) is assigned its default value if one is specified, or "" (empty string) otherwise. IBM DFDL does not yet implement that erratum; it treats the value as missing, which gives an error.
    I think just setting minOccurs="0" on "text" will fix the parse, but when you serialize you will not see "text" at all, which is probably not what you want.
    So the best fix is to add nillable="true" to the "text" element, and the following properties to your dfdl:defineFormat:
  • nilKind="literalValue"
  • nilValue="%ES;"
  • useNilForDefault="no"
  • nilValueDelimiterPolicy="both"
    This will ensure your "text" elements are parsed OK, but it gives you a nil in the infoset instead of an empty string. On serializing you will get text = "" output.
  3. With the change for 2) made, the parser gave an error right at the end of the data, because there is an unmodelled x0A at the end of the file.
    You need to add dfdl:terminator="%NL;" to the top-level "praat" element.
    If that final x0A might not be present, then I suggest you change the property documentFinalTerminatorCanBeMissing from "no" to "yes" in your dfdl:defineFormat.
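
Putting the three changes together, the schema additions might look roughly like the fragment below. This is only a sketch, not the actual schema from the PR: the format name "praatFormat" is a placeholder, the xs and dfdl prefixes are assumed to be bound to the usual namespaces, and the type of "text" is taken to be xs:string as described in 2). All existing properties and element content are left out.

```xml
<!-- Fragment only, not a complete schema. "praatFormat" is a hypothetical name. -->
<dfdl:defineFormat name="praatFormat">
  <!-- Existing properties are omitted; only the additions from points 1 to 3 are shown. -->
  <!-- Set documentFinalTerminatorCanBeMissing to "yes" if the final x0A may be absent. -->
  <dfdl:format
      textBidi="no"
      floating="no"
      encodingErrorPolicy="error"
      nilKind="literalValue"
      nilValue="%ES;"
      useNilForDefault="no"
      nilValueDelimiterPolicy="both"
      documentFinalTerminatorCanBeMissing="no"/>
</dfdl:defineFormat>

<!-- "text" becomes nillable, so an empty value parses as nil instead of failing. -->
<xs:element name="text" type="xs:string" nillable="true"/>

<!-- The top-level element models the trailing newline at the end of the file. -->
<xs:element name="praat" dfdl:terminator="%NL;">
  <!-- existing content of "praat" unchanged -->
</xs:element>
```

If the schema already sets any of these properties elsewhere (for example directly on an element), the more local setting wins, so check that it does not contradict the values above.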
