This PR adds a `simd_extract_msbs` intrinsic, along with an `extract_msbs` re-export to `core:simd`. This intrinsic extracts the most significant bit of each element of a `#simd` vector and packs them into a `bit_set`. This behavior is similar to the SSE/AVX `movemask` intrinsic/`movmsk` instruction.
This intrinsic is defined similarly to:
For example, this code:
will print:
since elements 0, 1, 4, and 7 have their most significant bits set (due to being negative).
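As a rough illustration of the behavior described above, here is a minimal scalar model in Python. It is not the Odin intrinsic or its implementation: the function name `extract_msbs_model`, the 32-bit lane width, and the input values are all assumptions chosen so that lanes 0, 1, 4, and 7 are negative. A Python `set` of lane indices stands in for the `bit_set`.

```python
def extract_msbs_model(lanes, bits=32):
    """Scalar model: return the set of lane indices whose most significant bit is set."""
    msb = 1 << (bits - 1)
    lane_mask = (1 << bits) - 1
    # Mask to the lane width so negative Python ints behave like two's-complement lanes.
    return {i for i, x in enumerate(lanes) if (x & lane_mask) & msb}


# Hypothetical input: lanes 0, 1, 4, and 7 are negative.
v = [-1, -2, 3, 4, -5, 6, 7, -8]
print(sorted(extract_msbs_model(v)))  # [0, 1, 4, 7]
```

Only the sign bit of each lane matters here; the magnitude of the element does not affect the result.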
This intrinsic is particularly useful in conjunction with lane-wise comparison masks, in a few particular use cases that I've found so far.
- `masked_compress_store`, for instance, where `card(extract_msbs(v))` can be used to determine how many elements will be written. Examples:
  This use case can also be done via `-simd_reduce_add_ordered(mask)`, though I find using `extract_msbs` to be clearer on intent.
- `bit_set` itself can also be directly useful for doing boolean logic en masse. Condensing the original checks to bit-masks allows the bit-masks themselves to be used in SIMD vectors to do large amounts of boolean logic at once.
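
Both use cases can be sketched with a scalar Python model. This is an illustration under stated assumptions, not Odin code: it assumes a lane-wise comparison yields all-ones (read as -1 when signed) in true lanes and 0 in false lanes, and the helper names `compare_model` and `msbs_as_int` and the data are hypothetical.

```python
def compare_model(lanes, predicate):
    """Scalar model of a lane-wise comparison mask: -1 (all-ones) if true, else 0."""
    return [-1 if predicate(x) else 0 for x in lanes]


def msbs_as_int(mask):
    """Pack the sign bit of each lane into an integer, lane i -> bit i."""
    return sum(1 << i for i, m in enumerate(mask) if m < 0)


a = [5, 1, 9, 2, 7, 0, 3, 8]                    # hypothetical data
small = compare_model(a, lambda x: x < 4)       # true in lanes 1, 3, 5, 6
even = compare_model(a, lambda x: x % 2 == 0)   # true in lanes 3, 5, 7

# Counting selected lanes: popcount of the packed bits vs. negated lane sum.
small_bits = msbs_as_int(small)
count_via_bits = bin(small_bits).count("1")     # models card(extract_msbs(mask))
count_via_sum = -sum(small)                     # models -simd_reduce_add_ordered(mask)
print(count_via_bits, count_via_sum)            # 4 4

# Boolean logic on whole masks at once: one AND combines all eight lanes.
both_bits = small_bits & msbs_as_int(even)
print([i for i in range(8) if both_bits >> i & 1])  # [3, 5]
```

The two counts agree only because each true lane contributes exactly -1 to the sum; the packed form additionally keeps the lane positions, which is what makes the mask-level AND possible.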