Encoding float16 lossless appears to produce artifacts for specific values #3881

Open
Skielex opened this issue Oct 7, 2024 · 3 comments
Labels: bug (Something isn't working), encoder (Related to the libjxl encoder), unrelated to 1.0 (Things that need not be done before the 1.0 version milestone)

Comments

Skielex commented Oct 7, 2024

Describe the bug
With lossless float16 encoding, certain ranges of bit patterns are corrupted. All values whose binary representation falls in the uint16 ranges [512:1023], [31745:32767], [33280:33793], and [64513:65534] appear to change during an encode-decode cycle.

To Reproduce
Steps to reproduce the behavior:

  1. Encode an image containing any float16 value whose binary representation falls in the uint16 ranges [512:1023], [31745:32767], [33280:33793], or [64513:65534].
  2. Decode the image.
  3. Check that the image data has changed.
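The steps above can be sketched as a small harness. This is only an illustration, not the code used for the report: `roundtrip` is a hypothetical callable standing in for whatever lossless encode-decode pair is under test (for example a wrapper around the imagecodecs JPEG XL functions, not shown here), assumed to take and return raw little-endian float16 bytes. The comparison is done on integer bit patterns rather than float values, because a NaN never compares equal to itself.

```python
import struct
from typing import Callable, List, Tuple

PATTERN_FORMAT = "<65536H"  # all 65536 possible 16-bit patterns, little-endian


def all_float16_patterns() -> bytes:
    """Return every 16-bit pattern 0..65535 as raw bytes (one float16 each)."""
    return struct.pack(PATTERN_FORMAT, *range(65536))


def changed_ranges(roundtrip: Callable[[bytes], bytes]) -> List[Tuple[int, int]]:
    """Run `roundtrip` (hypothetical encode+decode) over all float16 patterns
    and return inclusive (first, last) ranges of inputs whose bits changed."""
    decoded = struct.unpack(PATTERN_FORMAT, roundtrip(all_float16_patterns()))
    ranges: List[Tuple[int, int]] = []
    for original, result in enumerate(decoded):
        if original == result:
            continue
        if ranges and ranges[-1][1] == original - 1:
            # Extend the current run of consecutive changed patterns.
            ranges[-1] = (ranges[-1][0], original)
        else:
            ranges.append((original, original))
    return ranges
```

A truly lossless codec should yield an empty list from `changed_ranges`; any non-empty result lists the affected bit-pattern ranges directly.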

Expected behavior
Values should not change during a lossless encode-decode cycle. I've tested float32 encoding with all possible values and it works as expected.

Screenshots
Affected value ranges in yellow:
image
Bit values (yellow = True) for all 65536 float16 values with affected ranges between red and blue lines.
image

Environment

  • OS: Ubuntu WSL2 on Windows (also reproduced on Windows, AMD64 and ARM64; see "JPEGXL lossless float16 is not lossless", cgohlke/imagecodecs#114)
  • Compiler version: GCC 11 (I assume, it's a Python 3.10 extension)
  • CPU type: AMD Ryzen 5900X
  • libjxl version: 0.11.0 (according to this)
  • cjxl/djxl version string: Cannot test with cjxl/djxl as they don't appear to support any formats that allow float16.

Additional context
I found this issue using the imagecodecs Python package. The original issue is cgohlke/imagecodecs#114.

As mentioned in the issue linked above, I've not been able to test for the issue using cjxl/djxl or GIMP due to a lack of float16 support.

jyrkialakuijala (Contributor) commented Oct 11, 2024

I didn't look at the code, just speculating about this from a belief-based viewpoint.

The format itself stores these as integers and does prediction and other processing as if they were integers -- and it is highly unlikely that there is an issue there.

The phenomenon could be due to any of the following:
-Inf, +Inf, NaN, negative zero vs. positive zero, denormalized numbers, or other approximations used in 16-bit floating-point calculations in the calling code rather than in JPEG XL.

Skielex (Author) commented Oct 11, 2024

I don't think the issue is related to special float values, although they too are affected, for two reasons:

  1. The first affected region of values consists of float16 values between 3.0517578125e-05 and 6.097555160522461e-05, which become values between 0.0 and 6.091594696044922e-05 after encode-decode.
  2. The float32 encode-decode does not suffer from this issue. I've tested all possible float32 binary values and they were all preserved with lossless encoding.

The true uint16 representations of the values in the first region are:

512,  513,  514, ..., 1021, 1022, 1023

However, after encode-decode they become:

0, 2, 4, ..., 1018, 1020, 1022

The next value, 1024 (6.103515625e-05 as float16), is preserved.
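As a quick check of the numbers quoted above, the sketch below converts bit patterns to float16 values with Python's `struct` module (format `e` is IEEE 754 half precision). The helper name `bits_to_float16` is made up for this example. The last part is only an observation about the six before/after values listed in this comment: they fit the pattern `(v << 1) & 0x3FF`, which is the same as `2 * (v - 512)` in this range. It is not a diagnosis of the cause.

```python
import struct


def bits_to_float16(bits: int) -> float:
    """Reinterpret a uint16 bit pattern as an IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


# Endpoints of the first affected region, and the first preserved value.
for bits in (512, 1023, 1024):
    print(bits, bits_to_float16(bits))

# The six before/after bit patterns listed in the comment above.
before = [512, 513, 514, 1021, 1022, 1023]
after = [0, 2, 4, 1018, 1020, 1022]

# Observation only: each listed output equals the input shifted left by one
# bit and truncated to the 10 mantissa bits.
matches = all(((b << 1) & 0x3FF) == a for b, a in zip(before, after))
print("listed values fit (v << 1) & 0x3FF:", matches)
```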

mo271 added the bug, encoder, and unrelated to 1.0 labels on Oct 14, 2024

kmilos commented Oct 23, 2024

It does indeed look like the differences are in the special inf/NaN values and the subnormal numbers.

> I don't think the issue is related to special float values

> The true uint16 representation of the values in the first region are:
>
> 512, 513, 514, ..., 1021, 1022, 1023

But these are (half of the) positive subnormal numbers in float16 binary representation.

> [31745:32767]

These are "positive" NaNs.

> [33280:33793]

These are (half of the) negative subnormals.

> [64513:65534]

These are "negative" NaNs.
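The classification above can be reproduced from the float16 bit layout (1 sign bit, 5 exponent bits, 10 mantissa bits). The following is a minimal sketch; `classify_float16` and the `reported` list are names made up for this example, and the ranges are copied from the issue description and treated as inclusive, which is an assumption.

```python
def classify_float16(bits: int) -> str:
    """Classify a uint16 bit pattern by its IEEE 754 half-precision category."""
    sign = "-" if bits & 0x8000 else "+"
    exponent = (bits >> 10) & 0x1F  # 5 exponent bits
    mantissa = bits & 0x3FF         # 10 mantissa bits
    if exponent == 0:
        kind = "zero" if mantissa == 0 else "subnormal"
    elif exponent == 0x1F:
        kind = "inf" if mantissa == 0 else "nan"
    else:
        kind = "normal"
    return sign + kind


# Ranges as written in the issue description, assumed inclusive.
reported = [(512, 1023), (31745, 32767), (33280, 33793), (64513, 65534)]
for first, last in reported:
    kinds = sorted({classify_float16(b) for b in range(first, last + 1)})
    print(f"[{first}:{last}] -> {kinds}")
```

By this layout the negative subnormals end at 33791, so the reported upper bound of 33793 reaches two values into the negative normals; the bounds as written in the report may therefore be approximate.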
