usb/cdc: fix RP2 USB CDC TX race with cores scheduler#5391
Conversation
ce5c04a to
c059707
Compare
|
@rdon-key thank you for looking into this. Your solution makes sense to me, I just suggested a small correction in form. |
|
@deadprogram |
|
Anyone else have any feedback or comments before merge? |
|
@soypat ? |
| for { | ||
| d1, _ := usbcdc.tx.Peek() | ||
| if len(d1) == 0 { | ||
| usbcdc.txActive.Store(0) | ||
|
|
||
| // Avoid a missed wakeup: Write may append data while txActive is | ||
| // still set, causing kickTx to return without starting a transfer. | ||
| switch { | ||
| case usbcdc.tx.Used() == 0: | ||
| return | ||
| case !usbcdc.txActive.CompareAndSwap(0, 1): | ||
| return | ||
| } | ||
| continue |
There was a problem hiding this comment.
This for loop is confusing to me. Could we make it more explicit?
d1, _ := usbcdc.tx.Peek()
for len(d1)==0 {
// store...
d1, _ = usbcdc.tx.Peek()
}Also, it seems weird to me that we are looping while len of data is zero. If length of data is zero does that not mean data is sent already and you can return?
There was a problem hiding this comment.
Thanks for taking a look.
The outer for is not intended to loop while the ring is empty. It is the main TX pump loop: each iteration peeks the ring and, if data is available, sends one packet.
The len(d1) == 0 branch is only the idle/exit path. It handles the missed-wakeup case: after the peek sees an empty ring, Write may still append data while txActive is set, causing kickTx to return without starting a new transfer. So we clear txActive, then re-check whether data appeared. If it did, we try to re-acquire txActive with CAS and continue the main TX loop so it can re-peek and send normally.
Rewriting this as for len(d1) == 0 would make it look like the code is waiting for data while the ring is empty, which is not the intent.
I've pushed a comment above the loop to make the pump/idle-exit structure explicit — let me know if it's clearer now.
fb52e10 to
c690c2c
Compare
Fixes #5377.
This PR fixes a USB CDC TX race on RP2 targets when using
-scheduler=cores.The issue was reproduced on Raspberry Pi Pico / RP2040 with TinyGo 0.41.1.
RP2350 may be affected as well because it shares the RP2 USB backend and the
same USB CDC TX state machine.
Cause
USB CDC TX is driven from two places:
Write()viakickTx()txhandler()The previous code used
inflightboth as:That is not sufficient on multicore targets.
With
-scheduler=cores,Write()may run on one core while the USB TXcompletion handler runs on another core. In that case,
kickTx()can observe theTX path as idle while
txhandler()is still processing a completed packet andchaining the next one.
That can allow concurrent or re-entrant calls into
sendFromRing()/machine.SendUSBInPacket()for the same USB IN endpoint, which can corrupt USBCDC output.
Fix
This PR adds a separate atomic
txActiveflag.txActiverepresents ownership of the USB CDC TX pump.inflightremains onlythe number of bytes currently submitted to the endpoint.
kickTx()now starts the TX pump only if it can acquiretxActivewith CAS.The TX completion handler keeps TX pump ownership while chaining packets, and
ownership is released only when the TX ring is empty.
A final ring re-check is used when releasing ownership to avoid a missed wakeup
if
Write()appends data whiletxActiveis still set.Manual test
Test program:
Commands:
Results:
The
-scheduler=corestest was repeated multiple times without observingdropped or corrupted USB CDC output.