Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] pow() nuttx libmath return wrong value #13286

Closed
1 task done
nicolas71640 opened this issue Sep 3, 2024 · 17 comments · Fixed by #13294 or #13307
Closed
1 task done

[BUG] pow() nuttx libmath return wrong value #13286

nicolas71640 opened this issue Sep 3, 2024 · 17 comments · Fixed by #13294 or #13307
Assignees
Labels
Arch: arm Issues related to ARM (32-bit) architecture OS: Linux Issues related to Linux (building system, etc) OS: Other Issues related to other OS (not Linux, Mac or Win) Type: Bug Something isn't working

Comments

@nicolas71640
Copy link
Contributor

Description / Steps to reproduce the issue

Here's a small example 1:

    for (int32_t i = 0; i < 3; i++)
    {
        printf("%f", std::pow(500.f, i));
    }

And the results on xmc4800-relax with the lib math package with nuttx

       1.000000
       485165486.398179
       235385549191982000.000000

I've tried to compile with the gcc libmath, and no problem.
If a make i goes from 0 to 2 (instead of 3), my problem disapears...

Another strange issue is that this example 2 works :

        printf("%f", std::pow(500.f, 0));
        printf("%f", std::pow(500.f, 1));
        printf("%f", std::pow(500.f, 2));

In the case of example 1, I see that the pow() of libm is called. In the example 2, this function is not called. I have no idea why.

Thank you for any help understanding this issue...

On which OS does this issue occur?

[Linux]

What is the version of your OS?

Ubuntu 22

NuttX Version

12.6.0

Issue Architecture

[arm]

Issue Area

[Other]

Verification

  • I have verified before submitting the report.
@nicolas71640 nicolas71640 added the Type: Bug Something isn't working label Sep 3, 2024
@github-actions github-actions bot added Arch: arm Issues related to ARM (32-bit) architecture OS: Linux Issues related to Linux (building system, etc) OS: Other Issues related to other OS (not Linux, Mac or Win) labels Sep 3, 2024
@acassis
Copy link
Contributor

acassis commented Sep 3, 2024

@nicolas71640 did you try: pow(500.0f, (float) i); ?

@pkarashchenko any idea?

@nicolas71640
Copy link
Contributor Author

Just did... and same result

@pkarashchenko
Copy link
Contributor

Will need some debugging. The idea would be to check if there is FPU and if it is enabled. Like to check soft float vs hard float case.

@pkarashchenko
Copy link
Contributor

I stepped into similar strange behavior a long time ago, but not with Nuttx and here are the steps to check that can explain the observation:

  1. std::pow(500.f, 0) is slightly different from std::pow(500.f, i) in terms of compiler optimization. I suppose that optimization is enabled for the code in both example 1 and 2. But in example 2 the compiler calculates std::pow(500.f, 0), std::pow(500.f, 1) and std::pow(500.f, 2) and emits code that that just prints the constants. I think you can check the generated assembly to confirm.
  2. First std::pow(500.f, i) in the loop also is calculated at compile time as i is equal to zero (known) while other two calls actually run buggy pow. Again, please check the assembly code to prove it.
  3. pow really contains a bug that should be investigated. But the test case should either be compiled with -O0 or volatile should be added to i declaration to have consistent behavior.

@pkarashchenko
Copy link
Contributor

pkarashchenko commented Sep 4, 2024

@nicolas71640 could you please update description and add some steps that use configure.sh and some test config? Also add a test config that you use.
I found that such behavior is possible if Nuttx is misconfigured and just want to reproduce your case locally.
Do you run make clean between two test cases?

@nicolas71640
Copy link
Contributor Author

nicolas71640 commented Sep 4, 2024

I stepped into similar strange behavior a long time ago, but not with Nuttx and here are the steps to check that can explain the observation:

  1. std::pow(500.f, 0) is slightly different from std::pow(500.f, i) in terms of compiler optimization. I suppose that optimization is enabled for the code in both example 1 and 2. But in example 2 the compiler calculates std::pow(500.f, 0), std::pow(500.f, 1) and std::pow(500.f, 2) and emits code that that just prints the constants. I think you can check the generated assembly to confirm.
  2. First std::pow(500.f, i) in the loop also is calculated at compile time as i is equal to zero (known) while other two calls actually run buggy pow. Again, please check the assembly code to prove it.
    Indeed, the assembly code that the pow() function is not called for the example 2. Here's a reduced code and the assembly :
  int32_t i = 1;
  printf("%f\n", pow(500.f, 1));
  printf("%f\n", pow(500.f, i));
0c01a050 <helloxx_main>:
 c01a050:       b500            push    {lr}
 c01a052:       b085            sub     sp, #20
 c01a054:       9001            str     r0, [sp, #4]
 c01a056:       9100            str     r1, [sp, #0]
 c01a058:       2301            movs    r3, #1
 c01a05a:       9303            str     r3, [sp, #12]
 c01a05c:       f04f 0200       mov.w   r2, #0
 c01a060:       4b0f            ldr     r3, [pc, #60]   ; (c01a0a0 <helloxx_main+0x50>)
 c01a062:       4810            ldr     r0, [pc, #64]   ; (c01a0a4 <helloxx_main+0x54>)
 c01a064:       f001 f924       bl      c01b2b0 <printf>
 c01a068:       9803            ldr     r0, [sp, #12]
 c01a06a:       f7fb fd67       bl      c015b3c <__aeabi_i2d>
 c01a06e:       4602            mov     r2, r0
 c01a070:       460b            mov     r3, r1
 c01a072:       ec43 2b11       vmov    d1, r2, r3
 c01a076:       ed9f 0b08       vldr    d0, [pc, #32]   ; c01a098 <helloxx_main+0x48>
 c01a07a:       f001 fc09       bl      c01b890 <pow>
 c01a07e:       ec53 2b10       vmov    r2, r3, d0
 c01a082:       4808            ldr     r0, [pc, #32]   ; (c01a0a4 <helloxx_main+0x54>)
 c01a084:       f001 f914       bl      c01b2b0 <printf>
 c01a088:       2300            movs    r3, #0
 c01a08a:       4618            mov     r0, r3
 c01a08c:       b005            add     sp, #20
 c01a08e:       f85d fb04       ldr.w   pc, [sp], #4
 c01a092:       bf00            nop
 c01a094:       f3af 8000       nop.w
 c01a098:       00000000        .word   0x00000000
 c01a09c:       407f4000        .word   0x407f4000
 c01a0a0:       407f4000        .word   0x407f4000
 c01a0a4:       0c01d91c        .word   0x0c01d91c

  1. pow really contains a bug that should be investigated. But the test case should either be compiled with -O0 or volatile should be added to i declaration to have consistent behavior.

With -00, the behavior is consistent and the above code displays :

500.000000
485165486.398179

IMPORTANT THING that I haven't mention before (my bad), is that I compile c++. The c code works perfectly, the c++ fails. I have no clue why...

@nicolas71640 could you please update description and add some steps that use configure.sh and some test config? Also add a test config that you use. I found that such behavior is possible if Nuttx is misconfigured and just want to reproduce your case locally. Do you run make clean between two test cases?

Here's my defconfig :

#
# This file is autogenerated: PLEASE DO NOT EDIT IT.
#
# You can use "make menuconfig" to make any modifications to the installed .config file.
# You can then do "make savedefconfig" to generate a new defconfig file that includes your
# modifications.
#
# CONFIG_ARCH_RAMFUNCS is not set
CONFIG_ARCH="arm"
CONFIG_ARCH_BOARD="xmc4800-relax"
CONFIG_ARCH_BOARD_XMC4800RELAX=y
CONFIG_ARCH_BUTTONS=y
CONFIG_ARCH_CHIP="xmc4"
CONFIG_ARCH_CHIP_XMC4800=y
CONFIG_ARCH_CHIP_XMC4=y
CONFIG_ARCH_INTERRUPTSTACK=2048
CONFIG_ARCH_IRQBUTTONS=y
CONFIG_ARCH_STACKDUMP=y
CONFIG_BOARD_LOOPSPERMSEC=8000
CONFIG_BUILTIN=y
CONFIG_DEBUG_NOOPT=y
CONFIG_EXAMPLES_HELLOXX=y
CONFIG_HAVE_CXX=y
CONFIG_INIT_ENTRYPOINT="nsh_main"
CONFIG_INTELHEX_BINARY=y
CONFIG_LIBCXXTOOLCHAIN=y
CONFIG_LIBM=y
CONFIG_NSH_ARCHINIT=y
CONFIG_NSH_BUILTIN_APPS=y
CONFIG_NSH_FILEIOSIZE=512
CONFIG_NSH_LINELEN=64
CONFIG_NSH_READLINE=y
CONFIG_PREALLOC_TIMERS=4
CONFIG_RAM_SIZE=65536
CONFIG_RAM_START=0x20000000
CONFIG_RAW_BINARY=y
CONFIG_RR_INTERVAL=200
CONFIG_SCHED_HPWORK=y
CONFIG_SCHED_WAITPID=y
CONFIG_START_DAY=10
CONFIG_START_MONTH=3
CONFIG_START_YEAR=2014
CONFIG_SYSTEM_NSH=y
CONFIG_TESTING_RAMTEST=y
CONFIG_UART0_SERIAL_CONSOLE=y
CONFIG_XMC4_USIC0=y
CONFIG_XMC4_USIC0_CHAN1_NONE=y

@nicolas71640
Copy link
Contributor Author

Here's the two assembly code in c++ and c. The c works, the c++ doesn't.

C++

0c01a038 <helloxx_main>:
 c01a038:       b500            push    {lr}
 c01a03a:       b085            sub     sp, #20
 c01a03c:       9001            str     r0, [sp, #4]
 c01a03e:       9100            str     r1, [sp, #0]
 c01a040:       2301            movs    r3, #1
 c01a042:       9303            str     r3, [sp, #12]
 c01a044:       9803            ldr     r0, [sp, #12]
 c01a046:       f7fb fd79       bl      c015b3c <__aeabi_i2d>
 c01a04a:       4602            mov     r2, r0
 c01a04c:       460b            mov     r3, r1
 c01a04e:       ec43 2b11       vmov    d1, r2, r3
 c01a052:       ed9f 0b07       vldr    d0, [pc, #28]   ; c01a070 <helloxx_main+0x38>
 c01a056:       f001 fc03       bl      c01b860 <pow>
 c01a05a:       ec53 2b10       vmov    r2, r3, d0
 c01a05e:       4806            ldr     r0, [pc, #24]   ; (c01a078 <helloxx_main+0x40>)
 c01a060:       f001 f910       bl      c01b284 <printf>
 c01a064:       2300            movs    r3, #0
 c01a066:       4618            mov     r0, r3
 c01a068:       b005            add     sp, #20
 c01a06a:       f85d fb04       ldr.w   pc, [sp], #4
 c01a06e:       bf00            nop
 c01a070:       00000000        .word   0x00000000
 c01a074:       407f4000        .word   0x407f4000
 c01a078:       0c01d8c4        .word   0x0c01d8c4

C

0c01a9a0 <hello_main>:
 c01a9a0:       b500            push    {lr}
 c01a9a2:       b085            sub     sp, #20
 c01a9a4:       9001            str     r0, [sp, #4]
 c01a9a6:       9100            str     r1, [sp, #0]
 c01a9a8:       2301            movs    r3, #1
 c01a9aa:       9303            str     r3, [sp, #12]
 c01a9ac:       9803            ldr     r0, [sp, #12]
 c01a9ae:       f7fb f897       bl      c015ae0 <__aeabi_i2d>
 c01a9b2:       4602            mov     r2, r0
 c01a9b4:       460b            mov     r3, r1
 c01a9b6:       ec43 2b11       vmov    d1, r2, r3
 c01a9ba:       ed9f 0b07       vldr    d0, [pc, #28]   ; c01a9d8 <hello_main+0x38>
 c01a9be:       f000 f84f       bl      c01aa60 <pow>
 c01a9c2:       ec53 2b10       vmov    r2, r3, d0
 c01a9c6:       4806            ldr     r0, [pc, #24]   ; (c01a9e0 <hello_main+0x40>)
 c01a9c8:       f001 fccc       bl      c01c364 <printf>
 c01a9cc:       2300            movs    r3, #0
 c01a9ce:       4618            mov     r0, r3
 c01a9d0:       b005            add     sp, #20
 c01a9d2:       f85d fb04       ldr.w   pc, [sp], #4
 c01a9d6:       bf00            nop
 c01a9d8:       00000000        .word   0x00000000
 c01a9dc:       407f4000        .word   0x407f4000
 c01a9e0:       0c01e8b8        .word   0x0c01e8b8

@nicolas71640
Copy link
Contributor Author

Updates : (correction).
I'm still investigating. And actually, that's not link to c and c++.
The difference between c and c++ that I have seen was actually due to the used libmath. For c it was the toolchain libmath, and c++ the libmath of nuttx.
So I have the same result in c...

@nicolas71640
Copy link
Contributor Author

Updates :
I have narrowed down the issue. The culprit is the log function.
the libm of nuttx gives : log(500) = 20.000

To reproduct :

  float i = 495;
  printf("log(%f) = %f\n", i, log(i));
  i = 496;
  printf("log(%f) = %f\n", i, log(i));

Gives :

log(495.000000) = 6.272894     // GOOD
log(496.000000) = 16.000032  // WRONG !!

@pkarashchenko pkarashchenko linked a pull request Sep 4, 2024 that will close this issue
@pkarashchenko
Copy link
Contributor

@nicolas71640 please try the linked PR. I hope it should help

@pkarashchenko
Copy link
Contributor

Here's the two assembly code in c++ and c. The c works, the c++ doesn't.

C++

0c01a038 <helloxx_main>:
 c01a038:       b500            push    {lr}
 c01a03a:       b085            sub     sp, #20
 c01a03c:       9001            str     r0, [sp, #4]
 c01a03e:       9100            str     r1, [sp, #0]
 c01a040:       2301            movs    r3, #1
 c01a042:       9303            str     r3, [sp, #12]
 c01a044:       9803            ldr     r0, [sp, #12]
 c01a046:       f7fb fd79       bl      c015b3c <__aeabi_i2d>
 c01a04a:       4602            mov     r2, r0
 c01a04c:       460b            mov     r3, r1
 c01a04e:       ec43 2b11       vmov    d1, r2, r3
 c01a052:       ed9f 0b07       vldr    d0, [pc, #28]   ; c01a070 <helloxx_main+0x38>
 c01a056:       f001 fc03       bl      c01b860 <pow>
 c01a05a:       ec53 2b10       vmov    r2, r3, d0
 c01a05e:       4806            ldr     r0, [pc, #24]   ; (c01a078 <helloxx_main+0x40>)
 c01a060:       f001 f910       bl      c01b284 <printf>
 c01a064:       2300            movs    r3, #0
 c01a066:       4618            mov     r0, r3
 c01a068:       b005            add     sp, #20
 c01a06a:       f85d fb04       ldr.w   pc, [sp], #4
 c01a06e:       bf00            nop
 c01a070:       00000000        .word   0x00000000
 c01a074:       407f4000        .word   0x407f4000
 c01a078:       0c01d8c4        .word   0x0c01d8c4

C

0c01a9a0 <hello_main>:
 c01a9a0:       b500            push    {lr}
 c01a9a2:       b085            sub     sp, #20
 c01a9a4:       9001            str     r0, [sp, #4]
 c01a9a6:       9100            str     r1, [sp, #0]
 c01a9a8:       2301            movs    r3, #1
 c01a9aa:       9303            str     r3, [sp, #12]
 c01a9ac:       9803            ldr     r0, [sp, #12]
 c01a9ae:       f7fb f897       bl      c015ae0 <__aeabi_i2d>
 c01a9b2:       4602            mov     r2, r0
 c01a9b4:       460b            mov     r3, r1
 c01a9b6:       ec43 2b11       vmov    d1, r2, r3
 c01a9ba:       ed9f 0b07       vldr    d0, [pc, #28]   ; c01a9d8 <hello_main+0x38>
 c01a9be:       f000 f84f       bl      c01aa60 <pow>
 c01a9c2:       ec53 2b10       vmov    r2, r3, d0
 c01a9c6:       4806            ldr     r0, [pc, #24]   ; (c01a9e0 <hello_main+0x40>)
 c01a9c8:       f001 fccc       bl      c01c364 <printf>
 c01a9cc:       2300            movs    r3, #0
 c01a9ce:       4618            mov     r0, r3
 c01a9d0:       b005            add     sp, #20
 c01a9d2:       f85d fb04       ldr.w   pc, [sp], #4
 c01a9d6:       bf00            nop
 c01a9d8:       00000000        .word   0x00000000
 c01a9dc:       407f4000        .word   0x407f4000
 c01a9e0:       0c01e8b8        .word   0x0c01e8b8

I do not see a difference in assembly here. However in assembly from example

  int32_t i = 1;
  printf("%f\n", pow(500.f, 1));
  printf("%f\n", pow(500.f, i));

You can see that two calls of printf are emitted, but there is only one call of pow. Compiler injects precalculated value instead of pow(500.f, 1).

@nicolas71640
Copy link
Contributor Author

@pkarashchenko Thank you for your help, but it doesn't help. Still have the log(496.0) = 16.0000....

@pkarashchenko pkarashchenko reopened this Sep 5, 2024
@pkarashchenko
Copy link
Contributor

@pkarashchenko Thank you for your help, but it doesn't help. Still have the log(496.0) = 16.0000....

Strange, as I had the repro rate as you described and it was fixed by this change. Maybe there is something else. I will take a closer look.

Also your message that C works and C++ does not work is something I can't fully understand. Could you please triple check this?

@pkarashchenko
Copy link
Contributor

@nicolas71640 can you increase LOG_MAX_ITER value to 50 (or 100) and check if that gives any positive effect?

@pkarashchenko
Copy link
Contributor

I'm having other change under test. Seems my previous fix is wrong

@pkarashchenko
Copy link
Contributor

@nicolas71640 please check it new change works for you. At least my experiments show positive dynamic in particular cases including your example

@nicolas71640
Copy link
Contributor Author

@pkarashchenko Just checked your PR, it works indeed ! Thanks !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch: arm Issues related to ARM (32-bit) architecture OS: Linux Issues related to Linux (building system, etc) OS: Other Issues related to other OS (not Linux, Mac or Win) Type: Bug Something isn't working
Projects
None yet
3 participants