Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More ergonomic and more-clearly-correct CPU feature detection. #1833

Open
briansmith opened this issue Nov 28, 2023 · 0 comments
Open

More ergonomic and more-clearly-correct CPU feature detection. #1833

briansmith opened this issue Nov 28, 2023 · 0 comments

Comments

@briansmith
Copy link
Owner

[This proposal assumes that we implement the proposal in #1832.]

First, we should be open to using cfg_if to clarify the conditional compilation logic around target_arch. Now we currently have a lot of #[cfg(any(target_arch = "..."))] directives in the code. Often we define a function multiple times, first using #[cfg(any(target_arch = ""))] and then using cfg(not(any(target_arch = ""))) and sometimes even more times. It becomes confusing as to whether every case is covered, whether there are any overlapping cases, and if there are overlapping cases, are they somehow composed in the right way. This is roughly what the cfg_if crate helps with. Historically we've avoided using cfg_if to minimize external dependencies, but we're already indirectly depending on it because getrandom depends on it, and also cfg_if is now an "official" crate at https://github.com/rust-lang/cfg-if so it's in some sense not an "external" dependency in any way that matters.

Second, we should be using cfg(target_feature = "...") to short-circuit any dynamic feature detection, on all targets. Currently we do this for ARM/Aarch64 only, but not for x86/x86-64. For all target architectures for which we do feature detection, we should have a mapping from target_feature to whatever symbol we use for dynamic feature detection, like we have for AAarch64.

Third, as noted in #1832, it isn't enough to condition on the target architecture and then target features, but also we need to know whether we are able to build the source files that enable the feature. So basically we need a mapping of cfg_directive to set of (set of target_architecture, set of target_feature) tuples), we need to use #[cfg(...)] to unconditionally use any implementations where the source files were compiled and where the target_arch/target_feature sets indicate that the feature is guaranteed to be available, and if the target_arch/target_os/target_feature combination supports dynamic feature detection (which requires that the relevant cfg_directive is set, and more) for that feature we need to do dynamic detection.

This is so complicated that my explanation above probably makes sense, at most, only to me. But, to summarize, we need to make the conditional compilation even more fancy, but the readability of the current compilation is already poor, so we need to switch to a solution that improves readability and then extend it to be even fancier without hurting readability too much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant