iirc x86_64 without avx512 is the only major isa that has gather and uses full-masks instead of bit-masks, so we should match it.

https://www.felixcloutier.com/x86/vgatherdps:vgatherqps#vgatherqps--vex-128-version-

iirc `vgatherqps` gathers `f32` values using `i32` mask elements and `u64` indexes/addresses.

originally mentioned here: https://github.com/rust-lang/portable-simd/pull/322#discussion_r1048061282
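
to make the `vgatherqps` shape concrete, here's a rough scalar model of a full-mask gather with `f32` data, `i32` mask elements and `u64` indexes. `gather_f32_full_mask` is a made-up name for illustration, not anything in portable-simd, and the 2-lane width and the out-of-bounds panic are assumptions of the sketch, not what the hardware does:

```rust
/// Scalar model of a full-mask gather in the style of `vgatherqps`.
/// Hypothetical helper for illustration only, not a portable-simd API.
fn gather_f32_full_mask(
    base: &[f32],
    indices: [u64; 2],
    mask: [i32; 2],
    fallback: [f32; 2],
) -> [f32; 2] {
    // Lanes that are masked off keep the value from `fallback`.
    let mut out = fallback;
    for lane in 0..2 {
        // A full-mask lane counts as enabled when its most significant bit
        // is set, i.e. the `i32` mask element is negative (all-ones is -1).
        if mask[lane] < 0 {
            // Assumption of this sketch: indices are element offsets into
            // `base`, and an out-of-range index panics.
            out[lane] = base[indices[lane] as usize];
        }
    }
    out
}
```

e.g. with `base = [10.0, 20.0, 30.0, 40.0]`, indexes `[3, 1]`, mask `[-1, 0]` and a zero fallback, only lane 0 is loaded, so the result is `[40.0, 0.0]`. the point is just that the mask element is as wide as the data element, not as wide as the index.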