Feature request: separate out encoding of chroma & alpha #2374

Open
wadetregaskis opened this issue Aug 6, 2024 · 4 comments

Comments

@wadetregaskis

From looking at the code, it appears AVIFs with transparency are essentially just two AV1 images, each encoded independently - one for chroma and one for alpha. libavif currently forces [re]encoding of both of those, even if one has already been encoded previously and has not changed. This can be very wasteful in some use cases.

Consider for example a GUI over libavif which allows the user to independently control the quality setting for chroma vs alpha, via e.g. two sliders. It's unfortunate that currently a bunch of CPU time has to be wasted redundantly re-encoding e.g. the alpha channel when only the chroma quality slider is changed.

I'm not sure what the best API change would be, to support this. Presumably some way for me to provide the encoder with the earlier encoder (or final output data) that it can use if applicable. Or some way to reuse an encoder, any number of times, with settings tweaked between uses? (as far as I can tell, changes to encoder configuration only apply to future avifEncoderAddImage calls at best, and avifEncoderFinish is only supposed to be called once?)

@y-guyon
Collaborator

y-guyon commented Aug 7, 2024

Thank you for your interest in libavif.

> From looking at the code, it appears AVIFs with transparency are essentially just two AV1 images, each encoded independently - one for chroma and one for alpha.

This is correct.

> libavif currently forces [re]encoding of both of those, even if one has already been encoded previously and has not changed. This can be very wasteful in some use cases.

I expect the number of use cases where only a subset of the channels need to be reencoded to be fairly small.

> Consider for example a GUI over libavif which allows the user to independently control the quality setting for chroma vs alpha, via e.g. two sliders. It's unfortunate that currently a bunch of CPU time has to be wasted redundantly re-encoding e.g. the alpha channel when only the chroma quality slider is changed.

I agree.

> I'm not sure what the best API change would be, to support this. Presumably some way for me to provide the encoder with the earlier encoder (or final output data) that it can use if applicable. Or some way to reuse an encoder, any number of times, with settings tweaked between uses? (as far as I can tell, changes to encoder configuration only apply to future avifEncoderAddImage calls at best, and avifEncoderFinish is only supposed to be called once?)

What about:

  • encoding the full image with the alpha layer the first time, recording the whole file size and the size of each internal AV1 image item (thanks to avifIOStats),
  • the next times, only encode either the color channels as an opaque image, or the alpha channel as an opaque monochrome image. The final whole file size can be found with the lengths stored in the first pass, and the layers of the multiple images can be composited into a single translucent image before rendering the GUI.
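The size bookkeeping described in the two steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: `EncodedSizes` and `estimateTotalSize` are hypothetical names, the two payload fields mirror the `colorOBUSize` and `alphaOBUSize` members that libavif reports in `avifIOStats`, and the container overhead (everything that is not an AV1 payload) is assumed to stay the same between encodes, so the result is an estimate rather than a guaranteed byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping record, filled in after the first full encode.
 * The payload fields mirror avifIOStats (colorOBUSize, alphaOBUSize). */
typedef struct {
    size_t totalFileSize; /* size of the complete AVIF from the first pass */
    size_t colorOBUSize;  /* AV1 payload size of the color image item */
    size_t alphaOBUSize;  /* AV1 payload size of the alpha image item */
} EncodedSizes;

/* Estimate the size of the combined file after one payload was re-encoded
 * on its own. Assumption: the container overhead does not change. */
static size_t estimateTotalSize(const EncodedSizes *first,
                                size_t newColorOBUSize,
                                size_t newAlphaOBUSize)
{
    assert(first != NULL);
    assert(first->totalFileSize >= first->colorOBUSize + first->alphaOBUSize);
    size_t overhead =
        first->totalFileSize - first->colorOBUSize - first->alphaOBUSize;
    return overhead + newColorOBUSize + newAlphaOBUSize;
}
```

When only the color quality slider moves, the caller would pass the payload size from the new color-only encode together with the alpha size recorded in the first pass, and vice versa.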

One would have to be careful with alpha-multiplied samples though.
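As a rough illustration of the compositing step, and of one possible reading of the caution about alpha-multiplied samples, the sketch below interleaves separately decoded 8-bit RGB and alpha planes into a single RGBA buffer for preview. The function name, the buffer layout, and the clamping rule are assumptions made for this example and are not part of libavif: when the color samples are premultiplied, a lossy alpha plane decoded separately may end up lower than a color sample, so each color value is clamped to its alpha.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave a packed RGB buffer (3 bytes per pixel) and an alpha plane
 * (1 byte per pixel) into RGBA (4 bytes per pixel).
 * If premultiplied is non-zero, clamp each color sample to its alpha so the
 * result stays a valid premultiplied pixel (assumed policy for this sketch). */
static void compositeRGBA(const uint8_t *rgb, const uint8_t *alpha,
                          size_t pixelCount, int premultiplied,
                          uint8_t *rgbaOut)
{
    assert(rgb != NULL && alpha != NULL && rgbaOut != NULL);
    for (size_t i = 0; i < pixelCount; ++i) {
        uint8_t a = alpha[i];
        for (int c = 0; c < 3; ++c) {
            uint8_t v = rgb[3 * i + c];
            if (premultiplied && v > a) {
                v = a; /* color must not exceed alpha when premultiplied */
            }
            rgbaOut[4 * i + c] = v;
        }
        rgbaOut[4 * i + 3] = a;
    }
}
```

With straight (non-premultiplied) alpha the two planes are simply interleaved unchanged.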

Alternatively there may be libraries that can work with ISOBMFF-style container boxes such as mp4box but I doubt introducing another dependency is the point here.

@wadetregaskis
Author

> What about:
>
>   • encoding the full image with the alpha layer the first time, recording the whole file size and the size of each internal AV1 image item (thanks to avifIOStats),
>   • the next times, only encode either the color channels as an opaque image, or the alpha channel as an opaque monochrome image. The final whole file size can be found with the lengths stored in the first pass, and the layers of the multiple images can be composited into a single translucent image before rendering the GUI.

Yeah, that's not too difficult for me to do, I think. I already keep careful track of the file components' sizes.

Would there be a way to produce the final file without having to re-encode everything, though? I don't currently see an API for explicitly providing existing, compressed image layers.

If not, then (aside from the inefficiency of redundant re-encodes) for my purposes I'm not sure it'd be tenable, as once the image is displayed on screen the user can e.g. drag-and-drop it into another application. Since encodes of non-trivial images take a long time, and even just a second of delay is unacceptable, the fully-composed, final file data needs to be basically ready to go at screen render time.

@y-guyon
Collaborator

y-guyon commented Aug 9, 2024

> Would there be a way to produce the final file without having to re-encode everything, though? I don't currently see an API for explicitly providing existing, compressed image layers.

This is not possible with the libavif API as of today.

If your files always use the same pattern, you could look at reconstructing them yourself. The HEIF container format is rather complex, but if there is always a single alpha auxiliary image item attached to a single primary color image item, with no other items or non-essential properties, you would just have to retrieve the AV1 payloads for each of these image items, replace them in the final file, and update these fields: 'iloc' sizes and offsets, 'ispe' width and height, 'mdat' box size (which could be 0 for simplicity, meaning "till end of file").
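To make the 'mdat' part of that suggestion concrete, here is a minimal sketch that walks the top-level ISOBMFF boxes of an in-memory file and sets the 'mdat' size field to 0. It relies only on the generic box layout (a 4-byte big-endian size followed by a 4-byte type, where size 1 means a 64-bit size follows and size 0 means the box runs to the end of the file). The function names are hypothetical, the sketch assumes 'mdat' is the last top-level box with a 32-bit size, and it deliberately does not touch 'iloc' or 'ispe', which would still need updating as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t readU32BE(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void writeU32BE(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Find the first top-level box of the given four-character type.
 * Returns 1 and stores its byte offset on success, 0 otherwise. */
static int findTopLevelBox(const uint8_t *data, size_t size,
                           const char *type, size_t *outOffset)
{
    size_t pos = 0;
    while (size - pos >= 8) {
        uint64_t boxSize = readU32BE(data + pos);
        if (memcmp(data + pos + 4, type, 4) == 0) {
            *outOffset = pos;
            return 1;
        }
        if (boxSize == 0) {
            break; /* box extends to end of file, nothing follows it */
        }
        if (boxSize == 1) { /* 64-bit size stored after the type */
            if (size - pos < 16) {
                break;
            }
            boxSize = ((uint64_t)readU32BE(data + pos + 8) << 32) |
                      readU32BE(data + pos + 12);
        }
        if (boxSize < 8 || boxSize > size - pos) {
            break; /* malformed box, stop walking */
        }
        pos += (size_t)boxSize;
    }
    return 0;
}

/* Set the 'mdat' size field to 0 ("until end of file").
 * Refuses if 'mdat' is missing, uses a 64-bit size, or is not the last box. */
static int markMdatAsUntilEndOfFile(uint8_t *data, size_t size)
{
    size_t off = 0;
    if (!findTopLevelBox(data, size, "mdat", &off)) {
        return 0;
    }
    uint32_t s = readU32BE(data + off);
    if (s == 1) {
        return 0; /* changing this would alter the header length */
    }
    if (s != 0 && (size_t)s != size - off) {
        return 0; /* other boxes follow, size 0 would swallow them */
    }
    writeU32BE(data + off, 0);
    return 1;
}
```

Once the size field is 0, new AV1 payloads of any length can be written after the 'mdat' header without patching the box size again, which is the simplification the comment above refers to.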

@y-guyon
Collaborator

y-guyon commented Aug 9, 2024

Prototype: #2381
