Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add decoding hooks #2372

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Add decoding hooks #2372

wants to merge 8 commits into from

Conversation

fintelia
Copy link
Contributor

This is an attempt at implementing the format decoding hooks registry described in #2368. It works out to be less code than I was expecting.

At the moment the hooks aren't especially useful because the only image formats that hooks can be registered for already have built-in decoders, but that code be changed in followup PRs

src/hooks.rs Outdated Show resolved Hide resolved
src/hooks.rs Outdated
Comment on lines 39 to 45
pub fn register_decoding_hook(format: ImageFormat, hook: DecodingHook) {
let mut hooks = DECODING_HOOKS.write().unwrap();
if hooks.is_none() {
*hooks = Some(HashMap::new());
}
hooks.as_mut().unwrap().insert(format, hook);
}
Copy link
Member

@HeroicKatora HeroicKatora Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the internal mechanism, but am not completely sure about the interface. It seems primarily designed around the implementation instead of user expectations. For instance, if we consider the case of third-party support for a decoder that isn't enabled by feature switches then we'd expect both decoder and encoder in a single block / call / …. Maybe this isn't crucial, just want to consider if the experience.

It's possible to register a hook that's ignored, and that ignore can happen by a feature flag enabled via a sibling create. This poses the question, should there be a mechanism for discovering the utilized features and should the internal translation pre-register itself in the map to provide a more uniform view?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registering both encoding and decoding in one go can always be added later as a dedicated function. A 3rd party codec crate can always provide their method to register themselves, like my_codec_crate::register().

But it could be common to want different implementations for decoding and encoding. For decoding you may want a decoder that is safe and/or compatible, but for encoding you may prefer either a fast one or heavily-compressing one (safety is usually less of an issue when encoding).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was kind of imagining an external crate would expose its own register() function that registered all the encoders/decoders it had. That could be only a single decoder, or several encoders and decoders all defined in the same crate

I didn't add encoding support to this PR because at the moment Box<dyn ImageEncoder> isn't actually usable for encoding images (you can't call write_image because that requires Self: Sized).

src/hooks.rs Outdated Show resolved Hide resolved
@fintelia fintelia marked this pull request as ready for review October 29, 2024 03:47
@Shnatsel
Copy link
Contributor

Shnatsel commented Nov 5, 2024

We can kick the tires of this API by splitting PCX into a separate crate and using these hooks to wire it up.

Also, jxl-oxide has shipped image integration just days ago, and it would be a good test bed for a more modern format.

@fintelia
Copy link
Contributor Author

fintelia commented Nov 6, 2024

@HeroicKatora how would you feel about moving forward with this PR and then spinning out a single image-extras crate that uses hooks to support additional non-default format bindings, like the PCX decoder that was just written?

@Shnatsel
Copy link
Contributor

Shnatsel commented Nov 6, 2024

That sounds like a good way to dogfood the hooks.

IMHO there's also value in segregating the formats with weird licenses. PCX is under WTFPL, we probably don't want to have that as a dependency in the main crate. And all the RAW crates are under LGPL, it would be nice to signpost that as well. There's precedent in GStreamer, where decoders with weird licenses are segregated as gstreamer-plugins-ugly.

Although the PCX author is still around, let me ask them to relicense and see what comes of it.

@Shnatsel
Copy link
Contributor

Shnatsel commented Nov 6, 2024

pcx crate has been relicensed as WTFPL OR MIT OR Apache 2.0, so the license should no longer pose any issues.

@HeroicKatora
Copy link
Member

HeroicKatora commented Nov 7, 2024

The biggest concern being solved with an extra crate is the licensing / static dependency tree situation. However, I can also see users expecting formats to 'just be available' when depending on crates, similar to the discover mechanism of the traditional c libraries. We can't introduce life-before-main in Rust however.

This combination of requirements is not possible to resolve right now when moving to a separate crate, afaik. To automatically register the decoders from image-extra, I think we would need image to depend on image-extra (optionally). However first of all that is cyclic, so some common image-core containing the actual registry is required. Secondly, upstream needs to provide appropriate weak feature selections as appropriate image/image-extra must be explicitly enabled for it to be visible in cfg.

I think the hooks can address the needs for users to substitute some formats, but they do not appropriately solve our internal needs. And in this implementation, the substitution does not work in all situations as intended as the feature selection is always given priority.

@Shnatsel
Copy link
Contributor

Shnatsel commented Nov 7, 2024

On the plus side, splitting out image-extra would also let us break semver on those decoders without impacting image. I doubt that's going to be relevant to PCX though.

But now I think the licensing argument is not as strong as I made it out to be. AFAIU bundling a bunch of formats into -extra isn't really an improvement from the licensing standpoint, and the licensing motivation for PCX is gone regardless.

@HeroicKatora
Copy link
Member

HeroicKatora commented Nov 7, 2024

Addressing the use-case of substituting the builtin decoder with a personal one is quite valid. I'd be glad to approve on the PR and registry design if the registered hooks ran with higher priority than the builtin format dispatch. The trait boxing of Read etc. is neatly wrapped here. I don't think this competes with a design for image-extra. If that needs different priorities then we can write them; and the registration as in this API will be necessary anyhow.

@Shnatsel
Copy link
Contributor

Shnatsel commented Nov 7, 2024

If all you want is to substitute the decoder a known format with a custom implementation you don't need the hook machinery at all.

You can simply disable the support for this format in image and then handle the UnsupportedError with a precise ImageFormat and route it to a custom decoder for that format.

@fintelia
Copy link
Contributor Author

fintelia commented Nov 7, 2024

My top-level goal is finding a way to spend less time thinking about, reviewing, and supporting obscure formats.

IMO most of the effort around obscure formats is preparing/preventing them from causing issues for the rest of the crate. Any PR that touches the crate's Cargo.toml or lib.rs might accidentally (or maliciously) break them. Test images bloat the size of the repository. If some dependency yanks their crate, we may have to fork and publish our own copy. Some codec changes like swapping the backend decoder implementation might require a semver break.

Separating out less common formats into a different crate provides a much nicer boundary to be more lax about reviews and concerns like those. If a codec breaks, we can drop support in a new major version. If CI is broken, it doesn't block PRs to the main image crate. If a decoder uses unsafe code, it doesn't prevent the main crate from eventually having #![forbid(unsafe_code)]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants