doc: Add known issue for TX processor block #20282
CI information: sdk-nrf PR head: e67979c7397929c9fff5ac4b381078b10f977733
e986b5a to 5b29fad
5b29fad to 5d688cc
.. rst-class:: v2-9-0-nRF54H20-1 v2-9-0 v2-8-0

NCSDK-31528: Deadlock on sysworkq with ``tx_notify`` in Host
  If a blocking callback is run on the system workqueue when the HCI buffer pool is full, the TX processor will be blocked.
  This prevents TX buffers from being freed, deadlocking the application.
  An example of a blocking callback is calling the :c:func:`bt_conn_get_tx_power_level` function in a receive callback, which is blocking since it uses the :c:func:`bt_hci_cmd_send_sync` function.

  **Workaround:** Do not use blocking calls in the system workqueue.
  Alternatively, increase the value of the :kconfig:option:`CONFIG_BT_BUF_CMD_TX_COUNT` Kconfig option to increase the HCI buffer.
  This does not guarantee that the problem is solved, as multiple blocking calls may exhaust the buffer pool.
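For illustration only, the alternative workaround in the entry above could look like the following project configuration fragment. The file name ``prj.conf`` and the value ``8`` are assumptions chosen for the sketch, not values taken from this PR; as the entry itself notes, a larger pool only makes exhaustion less likely.

```kconfig
# prj.conf (assumed file name)
# Enlarge the HCI command buffer pool. The value 8 is an arbitrary
# illustration; it does not guarantee that the deadlock cannot occur.
CONFIG_BT_BUF_CMD_TX_COUNT=8
```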
My two cents to improve clarity:
- .. rst-class:: v2-9-0-nRF54H20-1 v2-9-0 v2-8-0
- NCSDK-31528: Deadlock on sysworkq with ``tx_notify`` in Host
- If a blocking callback is run on the system workqueue when the HCI buffer pool is full, the TX processor will be blocked.
- This prevents TX buffers from being freed, deadlocking the application.
- An example of a blocking callback is calling the :c:func:`bt_conn_get_tx_power_level` function in a receive callback, which is blocking since it uses the :c:func:`bt_hci_cmd_send_sync` function.
- **Workaround:** Do not use blocking calls in the system workqueue.
- Alternatively, increase the value of the :kconfig:option:`CONFIG_BT_BUF_CMD_TX_COUNT` Kconfig option to increase the HCI buffer.
- This does not guarantee that the problem is solved, as multiple blocking calls may exhaust the buffer pool.
+ NCSDK-31528: Deadlock on sysworkq with ``tx_notify`` in Host
+ If a blocking callback (which in-turn needs to trigger an HCI command to complete its task) is run on the system workqueue when the HCI command buffer pool is full, the TX processing code will get blocked as well.
+ This situation prevents TX command buffers from being freed, deadlocking the application.
+ An example of a blocking callback is calling the :c:func:`bt_conn_get_tx_power_level` function in a receive callback. Calling such function can result in deadlock, since it uses the :c:func:`bt_hci_cmd_send_sync` function to complete its operation. The operation won't complete if HCI command buffer is full resulting in deadlock.
+ **Workaround:** Do not use blocking calls in tasks triggered on the system workqueue.
+ Alternatively, increase the value of the :kconfig:option:`CONFIG_BT_BUF_CMD_TX_COUNT` Kconfig option to increase the HCI buffer.
+ This does not guarantee that the problem is solved, as multiple blocking calls may exhaust the buffer pool.
Remember, all sentences on their own lines.
And one more thing: change "won't" to "will not" or, even better, "does not".
Updated per your suggestion. I think the last two sentences (before the workaround) might be repeating themselves a bit, so suggestions for improvement there are welcome :)
Would it be any better if we leave out "resulting in deadlock"? I think that is the redundant part.
Think so, thanks!
I would retain that as an explainer for the previous paragraph. The example explains why the deadlock happens. Maybe it's OK to leave it out.
The last two sentences explain how the deadlock occurs in the context of the chosen example. It is easy for us to see it through, but it may not be easy for a typical reader of this text. For the sake of clarity, I suggest not removing too much.
5d688cc to 9fb6030
9fb6030 to 424f8e9
NCSDK-31528: Deadlock on sysworkq with ``tx_notify`` in Host
  If a blocking callback (which in turn needs to trigger an HCI command to complete its task) is run on the system workqueue when the HCI command buffer pool is full, the TX processing code will get blocked.
  This prevents TX command buffers from being freed, deadlocking the application.
- This prevents TX command buffers from being freed, deadlocking the application.
+ This situation prevents TX command buffers from being freed, deadlocking the application.

Otherwise it is not clear what "this" refers to.
IMO it's obviously referring to the statement/line above, and adding "situation" doesn't really contribute (to the situation :)).
424f8e9 to 6995f76
84135b0 to 5cf4f64
After the rewrite, this is much, much clearer. Thanks!
@@ -189,6 +189,26 @@ KRKNWK-14299: NRPA MAC address cannot be set in Zephyr
Bluetooth LE
============

.. rst-class:: v2-9-0-nRF54H20-rc1 v2-9-0 v2-8-0

NCSDK-31528: Blocking the system workqueue if the :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option is disabled can cause a deadlock in the Bluetooth Host when running out of buffers in the HCI commands pool.
Sentence parsing error. Something went wrong in the rephrasing, and now two sentences with two verbs have been combined into a single one.
Perhaps?
- NCSDK-31528: Blocking the system workqueue if the :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option is disabled can cause a deadlock in the Bluetooth Host when running out of buffers in the HCI commands pool.
+ NCSDK-31528: Potential for deadlock. If the :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option is disabled, blocking of the system workqueue can cause a deadlock in the Bluetooth Host when running out of buffers in the HCI commands pool.
The first line with the Jira issue ID is usually the same as the title in the issue, without a full stop. Therefore, suggesting:
NCSDK-31528: Deadlock on system workqueue with ``tx_notify`` in host
  If the :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option is disabled, blocking of the system workqueue can cause a deadlock in the Bluetooth Host when running out of buffers in the HCI commands pool.
The second line must be indented a couple of spaces (it does not seem to show properly in my comment).
Added suggestion!
5cf4f64 to 3aaa324
Approved with a nit.
* The :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option is disabled.
* The system workqueue is blocked.
* The HCI commands pool is empty.
* A blocking Bluetooth Host API that uses the :c:func:`bt_hci_cmd_send_sync` function is called from any thread (including the system workqueue).
- * The :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option is disabled.
- * The system workqueue is blocked.
- * The HCI commands pool is empty.
- * A blocking Bluetooth Host API that uses the :c:func:`bt_hci_cmd_send_sync` function is called from any thread (including the system workqueue).
+ * The :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option is disabled.
+ * The system workqueue is blocked.
+ * The HCI commands pool is empty.
+ * A blocking Bluetooth Host API that uses the :c:func:`bt_hci_cmd_send_sync` function is called from any thread (including the system workqueue).
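As a reading aid, the four listed conditions can be modeled as a simple conjunction. The sketch below is a plain C model and not Zephyr code: the `host_state` structure and the `deadlock_possible` function are hypothetical names introduced only for illustration, and it assumes the conditions are meant to hold together.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state described by the four conditions. */
struct host_state {
    bool acl_flow_control_enabled; /* models CONFIG_BT_HCI_ACL_FLOW_CONTROL */
    bool sysworkq_blocked;         /* the system workqueue is blocked */
    unsigned int free_cmd_buffers; /* free buffers left in the HCI commands pool */
    bool sync_cmd_called;          /* an API based on bt_hci_cmd_send_sync was called */
};

/* Returns true only when all four listed conditions hold at once. */
static bool deadlock_possible(const struct host_state *s)
{
    return !s->acl_flow_control_enabled &&
           s->sysworkq_blocked &&
           s->free_cmd_buffers == 0U &&
           s->sync_cmd_called;
}
```

In this model, changing any single field (for example, a nonzero `free_cmd_buffers`) makes the function return false, which mirrors why enlarging the command buffer pool reduces the risk without removing it.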
8e84506 to b227dda
Adds a known issue for the TX processor being blocked if a callback is blocking when run in the system workqueue, and the HCI command buffer pool is empty.

Signed-off-by: Håvard Reierstad <[email protected]>
b227dda to e67979c