-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect when the firmware responds with the wrong response #241
Detect when the firmware responds with the wrong response #241
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## dev #241 +/- ##
==========================================
+ Coverage 98.52% 98.68% +0.15%
==========================================
Files 7 7
Lines 884 910 +26
==========================================
+ Hits 871 898 +27
+ Misses 13 12 -1 ☔ View full report in Codecov by Sentry. |
It looks like this problem isn't limited to just 2023-12-17 22:04:06.794 DEBUG (MainThread) [zigpy_deconz.api] Command Command.aps_data_confirm (0,)
2023-12-17 22:04:06.795 DEBUG (MainThread) [zigpy_deconz.uart] Send: 0x04c80007000000
2023-12-17 22:04:06.799 DEBUG (MainThread) [zigpy_deconz.uart] Frame received: 0x04c8050800010022
2023-12-17 22:04:06.801 DEBUG (MainThread) [zigpy_deconz.api] Command Command.aps_data_indication (1, <DataIndicationFlags.Always_Use_NWK_Source_Addr: 1>) A seq of To fix this, I'm globally enabling "invalid response" retries for all commands. |
Hi @puddly some background information, this is due the nature of a few async commands. While most commands act in a sync way there is the exception of ZGP and device state "notifications" which can be send by the firmware at any time. This is done so that the host application is notified immediately that new APS indications, APS confirm or other changes are available to poll. In the deCONZ implementation all commands which are send to the firmware are handled in a queue which tracks the sequence number + command id. All incoming commands are checked against this queue. If above mentioned notifications like device state arrive, new respective commands to query APS indications/confirm are scheduled, to speed up everything compared to plain polling. This part is not open source yet but I can provide you the sources if that is of any help. |
Thank you! I think I have this implemented in zigpy-deconz by triggering an event to instantly poll every time From the logs above, it seems like the sequence number desynchronizes, as the response sequence number matches the request sequence number from an unrelated command sent nearly 200ms in the past. Currently, I just endlessly retry if this is the case, as the chances of this occurring are very low. In the future, it would be nice to reserve a sequence number for unsolicited notifications (e.g. 0x00 or 0xFF) and avoid using it during normal communications. Or maybe pick a random sequence number during a notification so that it does not collide with a pending one? But it is good to know that |
Detected with a Raspbee II (0x26780700) (thanks @Citizen-2CB8A24A):
From the log, you can see that the
write_parameter
request doesn't receive the correct response. It's almost as if the enqueueddevice_state_changed
notification overwrites the expectedwrite_parameter
response.A separate exception type is used to detect this rare problem and the watchdog TTL will be re-written until it succeeds.