Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AP_MultiHeap: initialize only if heap allocation succeeded #29005

Merged
merged 3 commits into from
Jan 4, 2025

Conversation

srmainwaring
Copy link
Contributor

Ensure new_heap is not null before accessing, and initialise current heap usage.

Motivation

  • On ESP32 S3 the heap allocator is failing when attempting to create a new Lua state.
  • The variable current_heap_usage is not initialised to zero.

@tpwrules
Copy link
Contributor

tpwrules commented Jan 4, 2025

What exactly is failing? That malloc call to initialize the heap?

Moving the magic init is correct (poor review on my part, whoops) and probably needs a backport.

But Theoretically™ malloc returns zero-initialized memory on ArduPilot so the clear of the used size is not necessary. Have you seen that be non-zero?

@srmainwaring
Copy link
Contributor Author

srmainwaring commented Jan 4, 2025

But Theoretically™ malloc returns zero-initialized memory on ArduPilot so the clear of the used size is not necessary. Have you seen that be non-zero?

Yes. The non-zero is the reason the allocator is failing on ESP32. The first pass through the allocator it fails at

if (heapp->current_heap_usage + size > heapp->max_heap_size) {

Can post a full stack trace if evidence required - but I've seen this fail on ESP32 classic and the S3.

@tpwrules
Copy link
Contributor

tpwrules commented Jan 4, 2025

The failure matches my experience, and it makes sense that the heap malloc itself is not failing because then it should crash with a null pointer dereference.

But that's bad, it means the wrapping is not working. Let me see if there's an obvious cause.

@srmainwaring
Copy link
Contributor Author

I don't see anything in https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_Common/c%2B%2B.cpp that would wrap malloc for ESP32 to ensure memory is initialised to zero. Where is that meant to occur for ESP32?

@tpwrules
Copy link
Contributor

tpwrules commented Jan 4, 2025

It is supposed to be done here but I suspect that doesn't get passed through to the linker because that is done by ESP-IDF:

env.LINKFLAGS += ['-Wl,--wrap,malloc']

Adding target_link_options(${elf_file} PRIVATE "-Wl,--wrap,malloc") after the idf_build_executable in CMakeLists.txt fixes it, or at least I can see __wrap_malloc in the binary now.

We really need to convince the team to switch to a different function name, this has been repeatedly shown to not be safe.

@tpwrules tpwrules force-pushed the prs/pr-multiheap-init branch from 1f5cfc1 to 729333e Compare January 4, 2025 22:02
@tpwrules
Copy link
Contributor

tpwrules commented Jan 4, 2025

I have adjusted to include this fix, please test that it works properly.

@tpwrules tpwrules changed the title AP_MultiHeap: ensure new heap initialised AP_MultiHeap: initialize only if heap allocation succeeded Jan 4, 2025
The relevant linker flag needed to be put in the CMake script.
The relevant linker flag needed to be put in the CMake script.
@tpwrules tpwrules force-pushed the prs/pr-multiheap-init branch from 729333e to d9e5f2d Compare January 4, 2025 22:11
@srmainwaring
Copy link
Contributor Author

I have adjusted to include this fix, please test that it works properly.

Thanks - looks good. Re-tested a hello world script on ESP32 S3.

esp32_scripting_v2

@peterbarker peterbarker merged commit 2d98657 into ArduPilot:master Jan 4, 2025
99 checks passed
@srmainwaring srmainwaring deleted the prs/pr-multiheap-init branch January 4, 2025 23:31
@davidbuzz davidbuzz added the esp32 label Jan 6, 2025
@davidbuzz
Copy link
Collaborator

davidbuzz commented Jan 6, 2025

this PR should perhaps have also had the prefix AP_HAL_ESP32 now that its been determined to be a missing --wrap on malloc in the CMake linker that was exposing the issue on esp32, but its a great find, and brings the esp32 builds closer to how chibios does its pre-zero-ing so its a good-thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Pending
Development

Successfully merging this pull request may close these issues.

4 participants