Skip to content

Conversation

maribu
Copy link
Member

@maribu maribu commented Sep 17, 2025

Contribution description

  • Add float_math feature to indicate support of floating point arithmetic
  • Make every CPU but native provide this
  • Add FEATURES_REQUIRED += float_math to all drivers/modules that use floating point arithmetic
  • On native when running on x86_64: Disable use of FPU/SSE registers on native unless float_math is in used
    • With CONTINUE_ON_EXPECTED_ERRORS=1 it is possible to use float_math on native. But then we are back to having random crashes / incorrect computations / stack corruptions / floating point exceptions on context switching

Testing procedure

  • On boards other than native, nothing should change
  • A number of apps that previously have been blacklisted by hand on native no longer need that, either because they use floating point arithmetic (and now automatically won't be build) or should work now

Issues/PRs references

Fixes #495 by properly modeling this in the build system. But more importantly: It fixes issues where the compiler spilled values onto SSE/FPU registers without the use of floating point arithmetic (e.g. seen in the wild in nanopb).

@maribu maribu added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Sep 17, 2025
@github-actions github-actions bot added Platform: native Platform: This PR/issue effects the native platform Platform: MSP Platform: This PR/issue effects MSP-based platforms Platform: ARM Platform: This PR/issue effects ARM-based platforms Area: tests Area: tests and testing framework Area: build system Area: Build system Platform: AVR Platform: This PR/issue effects AVR-based platforms Area: drivers Area: Device drivers Area: LoRa Area: LoRa radio support Platform: ESP Platform: This PR/issue effects ESP-based platforms Platform: RISC-V Platform: This PR/issue effects RISC-V-based platforms Area: cpu Area: CPU/MCU ports Area: sys Area: System labels Sep 17, 2025
@riot-ci
Copy link

riot-ci commented Sep 17, 2025

Murdock results

FAILED

56be5ab cpu/native: fix FPU/SSE register related context switching issues

Success Failures Total Runtime
124 0 9335 32s

Artifacts

@crasbe crasbe added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation labels Sep 17, 2025
maribu and others added 8 commits September 17, 2025 20:44
This adds the float_math feature to express that math on floating point
numbers is supported. All CPUs but native provide this, to reflect that
apps using floats on native will randomly fail

See RIOT-OS#495

Co-authored-by: crasbe <[email protected]>
In some modules some functions are disabled when no floating point
arithmetics is available instead of blocking the whole module. The API
doc has been extended, so that users know why a function may fail to
link.

Co-authored-by: crasbe <[email protected]>
We just use a simple look-up table for the 10^x exponents that do not
overflow the range of int16_t. In addition, we saturate on overflows.
When the `float_math` feature is not in used, this adds CFLAGS to
instruct the compiler to not make use of FPU/SSE registers, fixing
issues (random crashes, floating point exceptions, stack corruptions,
incorrect computations) due to incorrect back and restore of those
registers on context switching.

If `float_math` is used, the buggy behavior as before is produced.
But since `float_math` is not actually provided by the native CPU,
one would have to compile `CONTINUE_ON_EXPECTED_ERRORS=1` to actually
get into that situation.

Co-authored-by: Mihai Renea <[email protected]>
@github-actions github-actions bot added Area: pkg Area: External package ports Area: examples Area: Example Applications labels Sep 18, 2025
@crasbe
Copy link
Contributor

crasbe commented Sep 18, 2025

Wouldn't it make more sense to add FEATURES_REQUIRED += float_math as a prerequisite when using the printf_float USEMODULE instead of adding it to all the modules?

IMO it's bad enough that printf just silently doesn't print floats when the printf_float module is not loaded 😅

Comment on lines +266 to +268
ifneq (,$(filter printf_float,$(USEMODULE)))
FEATURES_REQUIRED += float_math
endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is actually done here. But most apps that do print float/double also do math with float/double before printing it.

It would be possible to actually implement the %f format specifier in printf without using floating point arithmetic directly. But then those apps would still not work on systems that do not provide floating point arithmetic if the use float math themselves.

LINKFLAGS += $(shell pkg-config libucontext --libs)
endif

ifeq (x86_64,$(OS_ARCH))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we also need that on 32 bit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I wonder if someone actually still has an i585/i686 CPU in use these days.

It won't hurt to add the check, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more about testing the code on a 32 bit target which is what most of our targets are.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the output of uname -m would still be x86_64, even with BOARD=native32.

But I honestly don't really see a path forward here with including C++ headers resulting in compilation errors when SSE is disabled.

Copy link
Contributor

@benpicco benpicco Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I thought OS_ARCH was referring to RIOT 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: build system Area: Build system Area: cpu Area: CPU/MCU ports Area: drivers Area: Device drivers Area: examples Area: Example Applications Area: LoRa Area: LoRa radio support Area: pkg Area: External package ports Area: sys Area: System Area: tests Area: tests and testing framework CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: ARM Platform: This PR/issue effects ARM-based platforms Platform: AVR Platform: This PR/issue effects AVR-based platforms Platform: ESP Platform: This PR/issue effects ESP-based platforms Platform: MSP Platform: This PR/issue effects MSP-based platforms Platform: native Platform: This PR/issue effects the native platform Platform: RISC-V Platform: This PR/issue effects RISC-V-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

native not float safe

4 participants