tests: handle spurious EWOULDBLOCK in io_async_fd #6776

hawkw · 2024-08-14T19:15:17Z

Motivation

The io_async_fd.rs tests contain a drain() function, which currently performs synchronous reads from a UDS socket until it returns io::ErrorKind::WouldBlock (i.e., errno EWOULDBLOCK/EAGAIN). The intent behind this function is to ensure that all data has been drained from the UDS socket's buffer...which is what it appears to do...on Linux. On other systems, it appears that an EWOULDBLOCK or EAGAIN may be returned before enough data has been read from the UDS socket to result in the other end being notified that the socket is now writable. In particular, this appears to be the case on illumos, where the tests using this function hang forever (see this comment on PR #6769).

To my knowledge, this behavior is still POSIX-compliant --- the reader will still be notified that the socket is readable, and if it were actually doing non-blocking IO, it would continue reading upon receipt of that notification. So, relying on EWOULDBLOCK to indicate that the socket has been sufficiently drained appears to rely on Linux/FreeBSD behavior that isn't necessarily portable to other Unices.

Solution

This commit changes the drain() function to take an argument for the number of bytes written to the socket previously, and continue looping until it has read that many bytes, regardless of whether EWOULDBLOCK is returned. This should ensure that the socket is drained on all POSIX-compliant systems, and indeed, the io_async_fd::reset_writable and io_async_fd::poll_fns tests no longer hang forever on illumos.

I think making this change is an appropriate solution to the test failure here, as the drain() function is part of the test, rather than the code in Tokio being tested, and (as I mentioned above) the use of blocking reads on a non-blocking socket without a mechanism to continue reading when the socket becomes readable again is not really something a real life program seems likely to do. Ensuring that all the written bytes have been read by passing in a byte count seems more faithful to what the test is actually trying to do here, anyway.

Thanks to @jclulow for debugging what was going on here!

This change was cherry-picked from commit
f18d6ed from PR #6769, so that the fix can be merged separately.

@jclulow

## Motivation The `io_async_fd.rs` tests contain a `drain()` function, which currently performs synchronous reads from a UDS socket until it returns `io::ErrorKind::WouldBlock` (i.e., errno `EWOULDBLOCK`/`EAGAIN`). The *intent* behind this function is to ensure that all data has been drained from the UDS socket's buffer...which is what it appears to do...on Linux. On other systems, it appears that an `EWOULDBLOCK` or `EAGAIN` may be returned before enough data has been read from the UDS socket to result in the other end being notified that the socket is now writable. In particular, this appears to be the case on illumos, where the tests using this function hang forever (see [this comment][1] on PR #6769). To my knowledge, this behavior is still POSIX-compliant --- the reader will still be notified that the socket is readable, and if it were actually doing non-blocking IO, it would continue reading upon receipt of that notification. So, relying on `EWOULDBLOCK` to indicate that the socket has been sufficiently drained appears to rely on Linux/FreeBSD behavior that isn't necessarily portable to other Unices. ## Solution This commit changes the `drain()` function to take an argument for the number of bytes *written* to the socket previously, and continue looping until it has read that many bytes, regardless of whether `EWOULDBLOCK` is returned. This should ensure that the socket is drained on all POSIX-compliant systems, and indeed, the `io_async_fd::reset_writable` and `io_async_fd::poll_fns` tests no longer hang forever on illumos. I think making this change is an appropriate solution to the test failure here, as the `drain()` function is part of the test, rather than the code in Tokio *being* tested, and (as I mentioned above) the use of blocking reads on a non-blocking socket without a mechanism to continue reading when the socket becomes readable again is not really something a real life program seems likely to do. Ensuring that all the written bytes have been read by passing in a byte count seems more faithful to what the test is actually *trying* to do here, anyway. Thanks to @jclulow for debugging what was going on here! This change was cherry-picked from commit f18d6ed from PR #6769, so that the fix can be merged separately. [1]: #6769 (comment)

Darksonn

Overall LGTM. One nit.

tokio/tests/io_async_fd.rs

Co-authored-by: Alice Ryhl <[email protected]>

hawkw mentioned this pull request Aug 14, 2024

add a basic illumos test job #6769

Draft

hawkw self-assigned this Aug 14, 2024

hawkw added the A-tokio Area: The main tokio crate label Aug 14, 2024

hawkw requested a review from Darksonn August 14, 2024 19:24

Darksonn added the M-io Module: tokio/io label Aug 15, 2024

Darksonn approved these changes Aug 15, 2024

View reviewed changes

tokio/tests/io_async_fd.rs Outdated Show resolved Hide resolved

Update tokio/tests/io_async_fd.rs

d26ff7c

Co-authored-by: Alice Ryhl <[email protected]>

hawkw enabled auto-merge (squash) August 15, 2024 15:33

hawkw merged commit 2d697fc into master Aug 15, 2024
81 checks passed

hawkw deleted the eliza/actually-drain branch August 15, 2024 15:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

tests: handle spurious EWOULDBLOCK in io_async_fd #6776

tests: handle spurious EWOULDBLOCK in io_async_fd #6776

Uh oh!

hawkw commented Aug 14, 2024

Uh oh!

Darksonn left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tests: handle spurious EWOULDBLOCK in io_async_fd #6776

tests: handle spurious EWOULDBLOCK in io_async_fd #6776

Uh oh!

Conversation

hawkw commented Aug 14, 2024

Motivation

Solution

Uh oh!

Darksonn left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!