More ergonomic and more-clearly-correct CPU feature detection. #1833

briansmith · 2023-11-28T21:16:51Z

[This proposal assumes that we implement the proposal in #1832.]

First, we should be open to using cfg_if to clarify the conditional compilation logic around target_arch. Now we currently have a lot of #[cfg(any(target_arch = "..."))] directives in the code. Often we define a function multiple times, first using #[cfg(any(target_arch = ""))] and then using cfg(not(any(target_arch = ""))) and sometimes even more times. It becomes confusing as to whether every case is covered, whether there are any overlapping cases, and if there are overlapping cases, are they somehow composed in the right way. This is roughly what the cfg_if crate helps with. Historically we've avoided using cfg_if to minimize external dependencies, but we're already indirectly depending on it because getrandom depends on it, and also cfg_if is now an "official" crate at https://github.com/rust-lang/cfg-if so it's in some sense not an "external" dependency in any way that matters.

Second, we should be using cfg(target_feature = "...") to short-circuit any dynamic feature detection, on all targets. Currently we do this for ARM/Aarch64 only, but not for x86/x86-64. For all target architectures for which we do feature detection, we should have a mapping from target_feature to whatever symbol we use for dynamic feature detection, like we have for AAarch64.

Third, as noted in #1832, it isn't enough to condition on the target architecture and then target features, but also we need to know whether we are able to build the source files that enable the feature. So basically we need a mapping of cfg_directive to set of (set of target_architecture, set of target_feature) tuples), we need to use #[cfg(...)] to unconditionally use any implementations where the source files were compiled and where the target_arch/target_feature sets indicate that the feature is guaranteed to be available, and if the target_arch/target_os/target_feature combination supports dynamic feature detection (which requires that the relevant cfg_directive is set, and more) for that feature we need to do dynamic detection.

This is so complicated that my explanation above probably makes sense, at most, only to me. But, to summarize, we need to make the conditional compilation even more fancy, but the readability of the current compilation is already poor, so we need to switch to a solution that improves readability and then extend it to be even fancier without hurting readability too much.

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More ergonomic and more-clearly-correct CPU feature detection. #1833

More ergonomic and more-clearly-correct CPU feature detection. #1833

briansmith commented Nov 28, 2023

More ergonomic and more-clearly-correct CPU feature detection. #1833

More ergonomic and more-clearly-correct CPU feature detection. #1833

Comments

briansmith commented Nov 28, 2023