Skip to content

Fix use-after-free when Windows monitor fails with pending I/O#340

Open
broken-circle wants to merge 2 commits into
swiftlang:mainfrom
broken-circle:asyncio-fail-windows
Open

Fix use-after-free when Windows monitor fails with pending I/O#340
broken-circle wants to merge 2 commits into
swiftlang:mainfrom
broken-circle:asyncio-fail-windows

Conversation

@broken-circle

Copy link
Copy Markdown
Member

Follow-up to #325 and #336, which fixed the same class of lifetime bug on the Windows read and write cancellation paths. This PR fixes the same bug on the branch where the stream throws an error.

On Windows, issueAndAwaitRead() and write(_:to:for:) await an overlapped ReadFile()/WriteFile() completion through the signal stream. When the stream finishes without delivering that completion, both reconcile the pending operation before its buffer is freed. When the stream throws an error, the catch branch rethrows with the operation still pending against the caller's buffer: a pending ReadFile() is the kernel writing into resultBuffer, and a pending WriteFile() is the kernel reading from the caller's borrowed span and poised to write into the stack overlapped. Both are released as the frame unwinds, so the kernel can touch freed storage.

This PR routes both catch branches through settlePendingOverlapped() before rethrowing, so a pending operation is cancelled with CancelIoEx() and waited on with GetOverlappedResult() before its buffer goes out of scope. On the thrown-error path the reconciliation provides only lifetime safety; the returned bytes are discarded. The previous reconcile* methods are folded into the new method.

The stream throws only when the IOCP monitor finishes every continuation with an error, which it does on a fatal GetQueuedCompletionStatus() failure that's neither ERROR_BROKEN_PIPE nor ERROR_OPERATION_ABORTED. That same failure kills the monitor, so it's already gone by the time the reconciliation runs. The kernel still sets the file handle's signal on completion independently of any IOCP dequeue, so the wait resolves without it; the completion port stays open (only shutdown() closes it), so when the operation is still pending the packet the abort enqueues is never dequeued, producing a one-packet leak on an already-failed subsystem.

On non-Windows platforms, the read and write paths issue a synchronous read()/write() and retain no reference to the buffer across the await. The stream can throw, but no operation is pending against a buffer when it does, so there's nothing to reconcile.

Testing and documentation

As with #336, this bug produces no observable behavioral difference to assert against: the returned count and captured output are identical with or without the fix, and the defect would surface only as memory corruption under a checking allocator whose heap reuse lands in the freed window. Forcing the interleaving requires injecting a GetQueuedCompletionStatus() failure while an overlapped operation is parked, which has no entry point without a hook in the monitor thread (which isn't suitable to ship, as discussed in #325).

Related documentation:

The application must not free or reuse the OVERLAPPED structure associated with the canceled I/O operations until they have completed. The thread can use the GetOverlappedResult function to determine when the I/O operations themselves have been completed.

For asynchronous operations still pending, the cancel operation will queue an I/O completion packet.

If the hEvent member of the OVERLAPPED structure is NULL, the system uses the state of the hFile handle to signal when the operation has been completed. Use of file, named pipe, or communications-device handles for this purpose is discouraged. It is safer to use an event object because of the confusion that can occur when multiple simultaneous overlapped operations are performed on the same file, named pipe, or communications device. In this situation, there is no way to know which operation caused the object's state to be signaled.

Each time an I/O operation is started, the operating system sets the file handle to the nonsignaled state. Each time an I/O operation is completed, the operating system sets the file handle to the signaled state.

The `reconcile*` methods coded the same general sequence. Replace both
with `settlePendingOverlapped()`. The read `nil`-branch surfaces a
nonzero return as `.read` and otherwise falls through to the drain,
matching every arm of the former read reconcile; the write `nil`-branch
discards the count, matching the former write reconcile. No behavior
change.
The read and write cancellation paths reconciled a pending
ReadFile/WriteFile before its buffer was freed, but only on the
signal stream's finished (`nil`) branch. When the stream instead threw
an error, the catch branch rethrew it with a ReadFile/WriteFile still
pending against the caller's buffer, releasing it while the kernel
could still read or write it.

Route both catch branches through `settlePendingOverlapped()` before
rethrowing, so the operation is cancelled and waited on before the
buffer goes out of scope.
@broken-circle broken-circle marked this pull request as ready for review July 2, 2026 21:59
@broken-circle broken-circle requested a review from iCharlesHu as a code owner July 2, 2026 21:59
@jakepetroules

Copy link
Copy Markdown
Contributor

Great to see! Are there any additional tests you can think of adding which would have exposed some of these last few Windows issues you fixed? I understand it might be difficult due to the non-deterministic nature but thought I'd ask anyways.

@broken-circle

Copy link
Copy Markdown
Member Author

Thanks! I'll do some digging and follow up likely sometime tomorrow. The read (#325) and write (#336) bugs are races between an operation completing and a child terminating. I want to check whether that interleaving can be forced by controlling the child's lifecycle and the pipe's state without instrumenting the monitor.

The catch-path bug in this PR triggers on an internal completion-port failure that nothing in the public API could induce. I think that one can't be structurally forced without a fault-injection hook in the monitor's loop, which we probably wouldn't want to ship.

@broken-circle

broken-circle commented Jul 3, 2026

Copy link
Copy Markdown
Member Author

I thought about this and ran some tests, and reached the same conclusion as before: there are no meaningful regression tests we can add for these three fixes, on reachability and observability grounds. Overall, all three are the same shape: a branch that should call settlePendingOverlapped() but didn't. The branches are reached only under a race (#325/#336) or through a non-inducible internal failure (this PR).

The read branch (#325) is reachable but not observable. The failure is a completed read dropped before its yielded count is consumed, and observing that requires catching the narrow window between the kernel completing the read and the monitor yielding the count. The competing event is: the subprocess exits and propagates through waitForProcessTermination()cancelAll() → cancellation reaching the parked next(). The monitor's yield almost always wins against that long async chain, so the bug lives in the rare inversion. Nothing outside the monitor can lengthen that window; we'd have to use a hook to delay the monitor. Saturating the monitor thread with many concurrent subprocesses is one non-instrumenting idea, but it can't bind a specific completion's delay to a specific subprocess's exit, so it lowers the flake rate without ever being deterministic.

The write branch (#336) is also reachable but not observable. The failure is a subprocess terminating mid-write; the bytes were destined for a pipe whose reader is already gone, and the returned count is identical with or without the fix, so there's nothing to assert. The only trace is the lifetime violation, and no sanitizer in this CI sees it: ASan/TSan instrument userland accesses, whereas this is the kernel completing the write into freed storage (the borrowed span and the stack overlapped). Native heap verification could target this, but that reaches only heap-backed cases like #325's read buffer, not a stack OVERLAPPED; even then it surfaces as a process crash, not an assertion. An exit test using #expect(processExitsWith:) can capture such a crash, but it needs a deterministic crash on which to assert. A post-fix exit test would only observe normal termination, and couldn't distinguish a non-use-after-free run from one that just missed the window.

The catch branch (this PR) is reachable only through a failure that nothing in the public API can induce. I confirmed by test that a terminated subprocess with a parked write completes with ERROR_BROKEN_PIPE; the monitor special-cases that error and continues past it, so signalStream.next() returns nil rather than throwing (the #336 nil-branch). Reaching the catch requires the monitor to throw an error other than ERROR_BROKEN_PIPE or ERROR_OPERATION_ABORTED. For an exit test to reach this branch, we'd need to inject a GetQueuedCompletionStatus() fault through a monitor hook.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants