Software crashes with a User exception #9029
Comments
After building another project with similar code and crashes I found out that this was the problem:
Changing this to
did the trick. But it didn't fully help on this project. There I still have a crash after a
It doesn't reach the
|
Memory is corrupted, allocator is configured to panic() when that happens |
Tx for the quick response. Is there a way to figure out the location in the trace? I currently have no clue where to look. I already removed almost everything except the WiFi configuration and the mqtt stuff (over TLS using BearSSL on top of PubSubClient). This is really a pain ... |
Adding |
...or something to do with SDK3? I don't see this one in the default |
Thanks @d-a-v . It crashes exactly at the malloc function (added the flush after the malloc printf / DEBUG_P). Thanks for the gdb hint. Wasn't aware that this is possible with ESP8266 ... but if memory is corrupted I doubt gdb will be of much help. :-( |
I use the Ticker, but only once_scheduled() or attach_ms_scheduled(), which should execute the callback in the loop context. I will remove all that code and see if anything changes. BTW: The crash happens in the PubSubClient MQTT callback (message/topic received). But I use this code in other projects in the same way and do not see a crash there. Edit: And for sure PubSubClient isn't bug-free, and it has been unsupported for a long time. I already fixed a bug in connection with PROGMEM data
to (as I have seen crashes at strnlen())
|
Removing all the Ticker code didn't help at all. Dump output looks almost the same.
garage-arduino.cpp _Pri_3_Stack lines mentioned above:
|
...I meant that this shows up b/c of enabled SDK3, apparently. SDK2 190703 does not have this symbol in .a, SDK305 does. platformio.ini options weren't mentioned above |
@mcspr You should have access to Yes, I use SDK305. I'll see if it compiles with SDK2 and report. Edit: I used [env:upstream_develop_flash_serial_debug] target here ... |
I must be doing something wrong, it is still 404 for me 🤷 |
Oh sorry. I added additional access for you. Hope it works now. In the meantime I tried SDK 2.x and the code crashed even earlier. Maybe I have to remove more code to narrow down the issue. It is really strange that some projects of mine work and others don't, even though the code is almost the same now.
(and of course I erased flash before downgrading of the SDK ...) |
Watch out for unchecked (s)printf:

diff --git a/src/garage-arduino.cpp b/src/garage-arduino.cpp
index 85602ae..3088409 100644
--- a/src/garage-arduino.cpp
+++ b/src/garage-arduino.cpp
@@ -419,7 +420,7 @@ void mqttSetup() {
INFOF_P("%s: MQTT client id: %s" LF, __func__, host_name);
SAFE_FREE(m_homa_id);
- m_homa_id = (char *)malloc(strlen_P(PSTR(HOMA_SYSTEM_ID) + 1));
+ m_homa_id = (char *)malloc(strlen_P(PSTR(HOMA_SYSTEM_ID)) + 1);
sprintf_P(m_homa_id, PSTR(HOMA_SYSTEM_ID)); //, ESP.getChipId());
INFOF_P("%s: HomA system id: %s" LF, __func__, m_homa_id);
For example,
(gdb) x/16xb 0x3fff2ba6 - 8
0x3fff2b9e: 0x00 0x00 0xa5 0xa5 0xa5 0xa5 0x31 0x32
0x3fff2ba6: 0x33 0x00 0xa5 0xa5 0x00 0x00 0x00 0x00
Looks like
(gdb) p m_homa_id
$1 = 0x3fff2ba4 "123"
(gdb) x/16xb (uint8_t*)m_homa_id - 4
0x3fff2ba0: 0xa5 0xa5 0xa5 0xa5 0x31 0x32 0x33 0x00
0x3fff2ba8: 0xa5 0xa5 0x00 0x00 0x00 0x00 0x00 0x00 |
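The effect of the misplaced `+ 1` in the diff above can be reproduced on a host machine. The following is a minimal sketch, assuming plain `strlen` in place of `strlen_P` and a made-up string; `buggy_alloc_size` and `fixed_alloc_size` are hypothetical helper names used only for illustration, not part of the project:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Buggy form: the "+ 1" is applied to the pointer, so strlen() starts
// counting at the second character and no room is left for the NUL.
// (Assumption: s is non-empty, otherwise s + 1 points past the terminator.)
std::size_t buggy_alloc_size(const char* s) {
    assert(s != nullptr && s[0] != '\0');
    return std::strlen(s + 1);
}

// Fixed form: full string length plus one byte for the terminating NUL.
std::size_t fixed_alloc_size(const char* s) {
    assert(s != nullptr);
    return std::strlen(s) + 1;
}

// Number of bytes an unchecked sprintf("%s")-style copy would write.
std::size_t bytes_written(const char* s) {
    return std::strlen(s) + 1;
}
```

For a string such as "123" the buggy form requests 2 bytes while the copy writes 4, so the last two bytes land outside the allocation.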
@mcspr you are incredible! Thank you sooo much! What a stupid mistake of mine. Shame on me. I owe you something.
to
What a bad idea to write the +1 at the end in the slightly wrong position. But good to know how to interpret the 'poison' bytes. Is there documentation somewhere (I tried to find something without success)? And good to know how to use the debugger here. Thanks. Thanks. Thanks. A last question is still there. Is it possible to set another reset reason? This 'panic' resets the ESP and after reboot |
Only through reading the umm code and its comments. I think the example above could be a good addition to the gdb (https://github.com/esp8266/Arduino/blob/master/doc/gdb.rst) or faq (https://github.com/esp8266/Arduino/blob/master/doc/faq/a02-my-esp-crashes.rst) docs, where it would corrupt some memory on purpose with an expected result
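A rough host-side model of the idea behind the 0xa5 bytes might look like the sketch below. It is not the umm_malloc implementation: the `GuardedBlock` type, the guard width of 4 bytes and the function names are assumptions made for illustration; only the 0xa5 value is taken from the dump above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint8_t kPoison = 0xa5;  // value visible in the gdb dump
constexpr std::size_t kGuard = 4;       // assumed guard width for this sketch

// A user buffer surrounded by poison bytes on both sides.
struct GuardedBlock {
    std::vector<std::uint8_t> raw;
    std::size_t size;

    explicit GuardedBlock(std::size_t n) : raw(n + 2 * kGuard, kPoison), size(n) {
        std::memset(data(), 0, n);  // user area starts zeroed
    }

    char* data() { return reinterpret_cast<char*>(raw.data() + kGuard); }

    // True while every guard byte still holds the poison value.
    bool intact() const {
        for (std::size_t i = 0; i < kGuard; ++i) {
            if (raw[i] != kPoison) return false;
            if (raw[kGuard + size + i] != kPoison) return false;
        }
        return true;
    }
};

// Copies text (including its NUL) into a block of alloc_size bytes without
// checking alloc_size, similar to an unchecked sprintf. The copy is clamped
// to the trailing guard so the simulation itself stays within its storage.
bool unchecked_write_keeps_guards(std::size_t alloc_size, const char* text) {
    assert(text != nullptr);
    GuardedBlock block(alloc_size);
    std::size_t n = std::min(std::strlen(text) + 1, alloc_size + kGuard);
    std::memcpy(block.data(), text, n);
    return block.intact();
}
```

With a 4-byte block, writing "123" leaves the guards untouched; with a 2-byte block the last character and the NUL overwrite the trailing guard, which is the kind of damage a poison check can detect later.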
Something overlooked. IIRC, the reset reason is passed through the 'custom callback' correctly but is not preserved during reboot, as we read it from the SDK and the SDK does not know of this special case. 'Exception' as the term is used here would mean Xtensa hw 'General Exception Causes'. The value is also not preserved correctly during the reboot for this case, as it is only for display and not really an 'Exception' at all |
Basic Infos
Platform
Settings in IDE
Problem Description
Software crashes with a User exception after connecting to WiFi and MQTT TLS
But it is not always at the same point in the software. Sometimes it also starts the HTTP server and transmits several packets before the crash, like this:
I have no idea where to search for the problem. Can someone give me a hint (looking at the dump)? There seems to be enough heap available and fragmentation changes between 3 and 19 in the example above but does not constantly grow.
Another thing that's strange to me is that ESP.getResetReason() outputs Software/System restart after that. For me this looks like an Exception and not like a "normal" Software/System restart ...
MCVE Sketch
Will follow, if needed.
Debug Messages