OrderedDemote2To() f64->f32 ? #1903

Pflugshaupt · 2023-12-15T21:37:14Z

I'm currently migrating my DSP codebase from my own attempt at a library to Highway. Things went mostly well, but I found one thing a bit puzzling: I have some algorithms that work on float lanes but have to do intermediate calculations at double precision. My own library allowed twice-as-wide f64 aggregates for that, but I see that Highway won't do Twice<d> on full-width tags.
That's fair enough, so I went with PromoteLowerTo() and PromoteUpperTo() to convert each float vector to two double vectors. However, to go back to float later, I found that OrderedDemote2To() is curiously missing for double to float. Is there a specific reason for that, or am I missing some other function? I just want to convert N double lanes to N float lanes using half as many registers. It seems like something that would come up quite often with algorithms requiring full float precision results.

I ended up writing this, but it seems a bit silly:

        auto dbl2float = [](auto d, auto a, auto b) HWY_ATTR {
            const Half<decltype(d)> hd;
            return Combine(d, DemoteTo(hd, b), DemoteTo(hd, a));
        };
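
For readers unfamiliar with the lane ordering involved, here is a minimal scalar sketch of the round trip described above. It is not Highway code: std::array stands in for a vector register, the width kLanes is a hypothetical choice, and the names PromoteLowerModel, PromoteUpperModel, OrderedDemote2Model and SquareViaDouble are made up for illustration. It only models which lanes end up where, assuming Combine(d, hi, lo) places its last argument in the lower half.

```cpp
#include <array>
#include <cstddef>

// Scalar model only: std::array stands in for a vector register.
constexpr std::size_t kLanes = 4;          // float lanes per "register" (hypothetical)
constexpr std::size_t kHalf = kLanes / 2;  // double lanes per "register"

using VF = std::array<float, kLanes>;
using VD = std::array<double, kHalf>;

// Models PromoteLowerTo: widen float lanes [0, kHalf) to double.
VD PromoteLowerModel(const VF& v) {
  VD r{};
  for (std::size_t i = 0; i < kHalf; ++i) r[i] = static_cast<double>(v[i]);
  return r;
}

// Models PromoteUpperTo: widen float lanes [kHalf, kLanes) to double.
VD PromoteUpperModel(const VF& v) {
  VD r{};
  for (std::size_t i = 0; i < kHalf; ++i) {
    r[i] = static_cast<double>(v[kHalf + i]);
  }
  return r;
}

// Models Combine(d, DemoteTo(hd, b), DemoteTo(hd, a)):
// lanes of `a` fill the lower half, lanes of `b` the upper half.
VF OrderedDemote2Model(const VD& a, const VD& b) {
  VF r{};
  for (std::size_t i = 0; i < kHalf; ++i) {
    r[i] = static_cast<float>(a[i]);
    r[kHalf + i] = static_cast<float>(b[i]);
  }
  return r;
}

// Example pipeline: widen, compute in double precision, narrow back.
VF SquareViaDouble(const VF& v) {
  VD lo = PromoteLowerModel(v);
  VD hi = PromoteUpperModel(v);
  for (double& x : lo) x *= x;
  for (double& x : hi) x *= x;
  return OrderedDemote2Model(lo, hi);
}
```

The point of the model is that the output keeps the original lane order: whatever was promoted from the lower half comes back in the lower half, which is what the lambda above achieves by passing `b` before `a` to Combine.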


jan-wassenberg · 2023-12-18T10:15:06Z

Hi, we don't have f64->f32 OrderedDemote2To because x86 and SVE can't do that very efficiently and we did not yet have a use-case.

However, RVV and NEON could do this a bit more efficiently. Would you be interested in having a go at adding support? That would involve: updating quick_reference.md to mention that f64->f32 is supported; adding ForShrinkableVectors<TestFloatOrderedDemote2To>()(float()); at demote_test.cc:678; copying your implementation to generic_ops-inl.h with the usual #if (defined(HWY_NATIVE_ 'include guard'; and adding implementations to rvv-inl.h and arm_neon-inl.h.

Pflugshaupt · 2023-12-18T15:30:34Z

Ok, I'll give it a try once I'm done migrating to Highway and have gained some more experience with it. That'll be in January. Thanks for letting me know I'm not missing a different way to do f64->f32. One issue might be that I have zero experience with RISC-V/RVV.

jan-wassenberg · 2023-12-18T16:36:42Z

Sounds good :)
No worries, RVV already has an existing function for that; it may be enough simply to enable f64->f32 in the template SFINAE. It would also be fine to write a TODO instead; in the meantime, that target would be covered by the generic code.
