
Add iEXt chunk #494

Open
ProgramMax opened this issue Jan 29, 2025 · 5 comments

Comments

@ProgramMax
Collaborator

This is part of the broader #493 discussion.

A new chunk (named iEXt, "image extra") would be added to contain image-like data that doesn't fit PNG's existing RGB / grayscale (with optional alpha) model.

It would work much like IDAT, except that each chunk would start with an ID and could carry any number of channels greater than zero. For example, IDAT stores 1 (grayscale), 2 (grayscale + alpha), 3 (RGB), or 4 (RGB + alpha) channels of data, whereas iEXt could also store 5 or more.
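As a rough sketch of what parsing such a chunk's payload might look like (the layout here, a one-byte stream ID followed by zlib-compressed channel data mirroring IDAT, is purely an assumption for illustration; nothing is specified yet):

```python
import zlib

def parse_iext_payload(payload: bytes):
    """Parse a hypothetical iEXt payload: a 1-byte stream ID followed by
    zlib-compressed channel data (layout assumed for illustration only)."""
    stream_id = payload[0]
    data = zlib.decompress(payload[1:])
    return stream_id, data

# Toy payload: stream 7 carrying one filtered scanline of a 5th channel.
scanline = bytes([0]) + bytes([10, 20, 30, 40])  # filter byte + 4 samples
payload = bytes([7]) + zlib.compress(scanline)
sid, raw = parse_iext_payload(payload)
```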

Currently, the spec requires that all IDAT chunks be contiguous. But some of the extra image data should be interleaved within the IDAT data. We need to investigate if breaking this requirement is problematic.

An example use case for interleaving IDAT and iEXt is gain maps. It would be bad to display image data as it downloads, only to encounter the gain map data later and have to correct the display; the user would see a flash. Interleaving the gain map data would mean only a small delay before each piece of the image is displayed, while the next bit of gain map data downloads.
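The contiguity rule in question can be expressed as a small check over a file's chunk-type sequence (the iEXt entry below is the hypothetical chunk from this proposal):

```python
def idat_contiguous(chunk_types):
    """Return True if all IDAT chunks form one consecutive run,
    as the current PNG spec requires."""
    idx = [i for i, t in enumerate(chunk_types) if t == b'IDAT']
    return not idx or idx[-1] - idx[0] + 1 == len(idx)

ok  = idat_contiguous([b'IHDR', b'IDAT', b'IDAT', b'IEND'])
bad = idat_contiguous([b'IHDR', b'IDAT', b'iEXt', b'IDAT', b'IEND'])
```

The second ordering is what the gain-map interleaving above would produce, and it is exactly what today's spec forbids.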

@leo-barnes

The terminology used by many other image container formats is "auxiliary image". iAUx if we want to be consistent?

@ProgramMax
Collaborator Author

Ohhh yes, thank you! Consistency is of course valuable. I prefer that.

@fintelia

fintelia commented Feb 3, 2025

Currently, the spec requires that all IDAT chunks be contiguous. But some of the extra image data should be interleaved within the IDAT data. We need to investigate if breaking this requirement is problematic.

I would expect this to be very disruptive. libpng immediately raises an error if it encounters a non-IDAT chunk when it is expecting more pixel data. That also makes sense in the context of the library's API: you call png_read_row to get an entire row of pixel data and aren't expecting to receive any extra data at the same time.
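A toy row-oriented reader (not libpng itself, just a sketch of the same API contract) shows why interleaving trips such decoders:

```python
def read_rows(chunks):
    """Yield row data from IDAT chunks. Like a row-oriented decoder,
    reject any foreign chunk once pixel data has started."""
    started = False
    for ctype, data in chunks:
        if ctype == b'IDAT':
            started = True
            yield data
        elif started and ctype != b'IEND':
            raise ValueError(f"unexpected {ctype!r} inside image data")

rows = list(read_rows([(b'IHDR', b''), (b'IDAT', b'row0'), (b'IEND', b'')]))
```

Feeding this reader an aDAt-style chunk between two IDATs raises immediately, which is the behavior fintelia describes for libpng.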

Naming bikeshed: the two existing image-data chunks are called IDAT and fdAT, so perhaps the new chunk should also end in _DAT?

@ProgramMax
Collaborator Author

@fintelia libpng is definitely one of the most widely used PNG libraries. If it errors on other chunks interleaved between IDAT chunks, we need to tread carefully, to the point where that might be a deal breaker.

But a modern website might use newer JS/DOM features that are not available in older browsers. So there is a path forward where we update libpng and allow interleaving, knowing that older libpng cannot handle newer PNGs. And just like authors using those new JS/DOM features, an author could choose whether to use this feature based on whether their audience is likely to support it.

But since it would error on old libpng (thank you for finding that), we certainly want to be cautious and deeply consider if it is worth it.

I agree with your "_DAT" naming suggestion. Perhaps aDAt for "auxiliary data"? That would also align with @leo-barnes's comment.

@fintelia

fintelia commented Feb 9, 2025

If making backwards-incompatible changes is in scope, then another possibility is changing the IDAT bitstream format. The additional channels could be interleaved directly into the pixel data, and it would also be an opportunity to increase compression and/or support more bit depths by incorporating improvements from newer lossless formats.

But assuming that's impractical for now, I still see some challenges for decoders wanting to support interleaved IDAT and aDAt chunks. For one thing, keeping an arbitrary number of zlib bitstreams active at once is going to be harder to make fast than a single bitstream. Efficiently decoding a zlib bitstream requires keeping O(10KB) of decoding tables hot in the L1 cache in addition to the data itself. Every switch pushes the old decoding tables out of cache and incurs stalls as the new tables are brought back in.
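The multiple-bitstream situation can be sketched with one inflate state per stream ID (the aDAt chunk and its ID-plus-data layout are hypothetical here); every switch between IDs is a switch of decompressor state, which in a native decoder is where the cache cost comes from:

```python
import zlib

class StreamDemux:
    """One zlib inflate state per interleaved stream ID."""
    def __init__(self):
        self.streams = {}

    def feed(self, stream_id, piece):
        # Switching stream_id switches decompressor state; in a native
        # decoder this also swaps the hot Huffman tables in and out of cache.
        d = self.streams.setdefault(stream_id, zlib.decompressobj())
        return d.decompress(piece)

# Interleave pieces of two independent zlib streams, as interleaved
# IDAT / aDAt chunks would.
a = zlib.compress(b'pixel data ' * 50)
b = zlib.compress(b'gain map data ' * 50)
demux = StreamDemux()
out = {0: b'', 1: b''}
for sid, piece in [(0, a[:20]), (1, b[:20]), (0, a[20:]), (1, b[20:])]:
    out[sid] += demux.feed(sid, piece)
```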

Another aspect is that while encoders may try to keep the streams relatively synchronized, decoders won't have any guarantees. Thus, they'll have to handle the possibility of streams advancing at different rates. Code that tries to do streaming of pixel data and gain map data together has to consider the possibility that it'll get an entire image worth of pixel data before any gain map data, or vice versa.
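A streaming consumer therefore has to buffer whichever side runs ahead; a minimal sketch of that pairing logic (in the worst case described above, one of these queues holds an entire image's worth of rows):

```python
from collections import deque

class RowPairer:
    """Pair pixel rows with their gain-map rows, buffering whichever
    stream happens to arrive first (either may run arbitrarily ahead)."""
    def __init__(self):
        self.pixels = deque()
        self.gains = deque()

    def feed_pixel(self, row):
        self.pixels.append(row)
        return self._drain()

    def feed_gain(self, row):
        self.gains.append(row)
        return self._drain()

    def _drain(self):
        # Emit (pixel, gain) pairs only once both rows are available.
        out = []
        while self.pixels and self.gains:
            out.append((self.pixels.popleft(), self.gains.popleft()))
        return out

p = RowPairer()
first = p.feed_pixel('px0')   # gain row not here yet, nothing to emit
pairs = p.feed_gain('g0')     # now the pair can be displayed
```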
