Decimal representations in uncefact - jsonschema #6

Open
DanielBauman88 opened this issue Jun 1, 2023 · 2 comments

Comments

DanielBauman88 commented Jun 1, 2023

The regex for decimal strings in the schema is proposed as "^([+-]?(0?|[1-9][0-9]*)(\\.?\\d+))$"

This isn't compatible with JSON's numeric representations, and it isn't compatible with most languages' float/double toString implementations, which use E notation when the exponent is large enough.
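To make the mismatch concrete, here is a minimal Java sketch that checks the output of `Double.toString` against the pattern quoted above. The class and method names (`DecimalRegexDemo`, `matchesSchema`) are made up for illustration and are not part of the schema or any library.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class DecimalRegexDemo {
    // The pattern quoted in this issue, with the JSON string escaping
    // carried over unchanged into a Java string literal.
    static final Pattern SCHEMA_DECIMAL =
        Pattern.compile("^([+-]?(0?|[1-9][0-9]*)(\\.?\\d+))$");

    // True if the whole string is accepted by the schema pattern.
    static boolean matchesSchema(String s) {
        return SCHEMA_DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Double.toString switches to E notation for magnitudes
        // >= 10^7 or < 10^-3, so the last two values are rejected.
        for (double d : new double[] {123.45, 1.0e10, 0.0001}) {
            String s = Double.toString(d);
            System.out.println(s + " -> " + matchesSchema(s));
        }
    }
}
```

Only the first value, which prints in plain positional form, passes the pattern.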

This is just a question on whether there'd be interest in expanding this regex to allow exponent notation. The reason is that it would make it a lot easier for users of this jsonschema to convert their JSON or domain data to the appropriate types.

With exponents supported it's a simple toString; without them a user first needs a library to get a full positional string. This creates some friction whenever converting numbers to strings that match the jsonschema.

(not sure if this spec is finalized or still in draft)

Contributor

AP-G commented Jun 1, 2023

Hi Daniel,

It is on purpose that it is NOT compatible with float / double, as it is the decimal type. The difference is that float / double always store numbers in base 2, while decimal stores the number "as it is". For instance, in Java you must use java.math.BigDecimal and not double / float. With double / float, it is (mathematically) not possible to store many decimal values precisely, which matters most when the accuracy of a number is important. For a business example, google for the old Pentium-Bug, where the payslips of certain chief executives were calculated incorrectly due to a double / float problem. If you compare it to XML: it is the same reason why XML supports both decimal and double.

But you do not even need very large numbers to run into this. Just look at the discussion on the validation artefacts of the European Norm for electronic invoices for public administration (EN16931). The commonly used validation engine (Saxon) only supports float and double values, but no decimals. There are tons of workarounds being implemented, up to legal consequences from the introduction of thresholds, just because one (common) software implementation is not supporting decimals in the correct way. And all that is done there in a business document is summing up some line item amounts and calculating VAT amounts. With the support of decimal, all would be fine…
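A small Java sketch of the base-2 versus decimal point made above, using the usual 0.1 + 0.2 example. The class and method names are hypothetical and only serve the illustration.

```java
import java.math.BigDecimal;

// Hypothetical demo class, for illustration only.
final class DecimalVsDoubleDemo {
    // 0.1, 0.2 and 0.3 have no exact base-2 representation,
    // so the double sum differs from the double literal 0.3.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings keeps the decimal digits as written,
    // so the sum compares equal to 0.3.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println("double sum exact:  " + doubleSumIsExact());
        System.out.println("decimal sum exact: " + decimalSumIsExact());
    }
}
```

Note that the `BigDecimal` values are constructed from strings; constructing them from `double` values would carry the binary rounding error along.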

Author

DanielBauman88 commented Jun 1, 2023

Thanks for this clear response.

I understand the reasoning for representing the decimal as a string in jsonschema to avoid all the accuracy problems that arise with using floats or doubles.

What I'm talking about here is simply expanding the supported string regex so that people converting their business domain numbers into the appropriate string for the jsonschema decimal don't have to do extra work.

So for people who already use double/float in their business domain it just makes it easier for them to convert those values into decimals using toString.

This also applies to customers using precise types. E.g., a customer using Java's BigDecimal needs to know that, for compatibility with the edifact jsonschema, they have to use toPlainString, because toString can give values that will fail the edifact jsonschema regex. This is a gotcha and adds some extra effort.
https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html#toPlainString--
https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html#toString--
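A minimal Java sketch of that gotcha, assuming the regex quoted at the top of this issue. `BigDecimalStringDemo` and `matchesSchema` are hypothetical names; `toString` and `toPlainString` are the standard `java.math.BigDecimal` methods linked above.

```java
import java.math.BigDecimal;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class BigDecimalStringDemo {
    // The schema pattern as quoted in this issue.
    static final Pattern SCHEMA_DECIMAL =
        Pattern.compile("^([+-]?(0?|[1-9][0-9]*)(\\.?\\d+))$");

    static boolean matchesSchema(String s) {
        return SCHEMA_DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Both values are exact decimals, yet toString() renders them
        // in scientific notation (negative scale / small exponent).
        for (String input : new String[] {"1E+3", "0.00000001"}) {
            BigDecimal value = new BigDecimal(input);
            String sci = value.toString();        // may contain "E"
            String plain = value.toPlainString(); // never contains "E"
            System.out.println(sci + " -> " + matchesSchema(sci));
            System.out.println(plain + " -> " + matchesSchema(plain));
        }
    }
}
```

The value is identical in both renderings; only the `toPlainString` form is accepted by the current pattern.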

Other languages frequently don't have a simple function like toPlainString, which makes it a step worse.

E.g., JavaScript's decimal does not (as far as I can see), and neither does bigdecimal-rs in Rust.
Converting to a string with no "E" notation requires some extra work in these languages, which makes working with the jsonschema harder.

As an example, the arbitrary precision libraries I've seen (like the ones listed in the edifact jsonschema pdf) support parsing strings with exponents into decimal values.

TL;DR: using a string in the jsonschema rather than a number is a good thing, because JSON parsers generally read numbers as floats/doubles. However, not allowing exponents in the string representation can make it hard to work with, and there is nothing imprecise about representing a number correctly with exponents.
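As a rough sketch of what such an expansion could look like, the Java snippet below appends an optional exponent group to the existing pattern. The extended pattern is purely an assumption for discussion, not something taken from the schema, and the names `ExponentRegexSketch`, `EXTENDED_DECIMAL` and `sameValue` are hypothetical.

```java
import java.math.BigDecimal;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of the schema.
final class ExponentRegexSketch {
    // Existing pattern plus an optional exponent: ([eE][+-]?\d+)?
    static final Pattern EXTENDED_DECIMAL = Pattern.compile(
        "^([+-]?(0?|[1-9][0-9]*)(\\.?\\d+))([eE][+-]?\\d+)?$");

    static boolean matchesExtended(String s) {
        return EXTENDED_DECIMAL.matcher(s).matches();
    }

    // Exponent and positional strings denote the same exact decimal value.
    static boolean sameValue(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1500", "1.5E+3", "1E-8", "1E"}) {
            System.out.println(s + " -> " + matchesExtended(s));
        }
        System.out.println(sameValue("1.5E+3", "1500"));
    }
}
```

Because `BigDecimal` parses both forms to the same exact value, accepting the exponent form would not give up any precision in this sketch.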
