[local server] Race Condition NIOAsyncWriter Deinited Without Calling finish() #635

@sebsto

Description

Intermittent crash in LambdaHTTPServer.handleConnection() with fatal error: Deinited NIOAsyncWriter without calling finish(). This is a race condition caused by manually closing the channel in a cancellation handler while executeThenClose is still executing.

Environment

  • Platform: Linux (Red Hat Enterprise Linux 10.0)
  • Swift Version: 6.x
  • Test: testLocalServerCustomPort() in LambdaLocalServerTests.swift
  • Frequency: Intermittent, more likely on fast multi-core machines under rapid test execution

Stack Trace

NIOCore/NIOAsyncWriter.swift:177: Fatal error: Deinited NIOAsyncWriter without calling finish()

*** Signal 4: Backtracing from 0x7fc8e5d98718... done ***
*** Program crashed: Illegal instruction at 0x00007fc8e5d98718 ***

The full backtrace shows the crash occurring while the test "Local server respects LOCAL_LAMBDA_PORT environment variable" is executing.

Root Cause

The issue is in Sources/AWSLambdaRuntime/HTTPServer/Lambda+LocalServer.swift, in the handleConnection method (lines 234-268):

private func handleConnection(
    channel: NIOAsyncChannel<HTTPServerRequestPart, HTTPServerResponsePart>,
    logger: Logger
) async {
    // ...
    await withTaskCancellationHandler {
        do {
            try await channel.executeThenClose { inbound, outbound in
                // Process requests...
            }
        } catch let error as CancellationError {
            logger.trace("The task was cancelled", metadata: ["error": "\(error)"])
        } catch {
            logger.error("Hit error: \(error)")
        }
    } onCancel: {
        channel.channel.close(promise: nil)  // ⚠️ RACE CONDITION
    }
}

The Race Condition

  1. When task cancellation occurs, the onCancel handler fires and calls channel.channel.close(promise: nil)
  2. Simultaneously, executeThenClose may still be executing and managing the channel lifecycle
  3. The manual close races with executeThenClose's automatic cleanup
  4. The channel gets closed before executeThenClose can properly call finish() on the internal NIOAsyncWriter
  5. The NIOAsyncChannel is deallocated with an unfinished writer, triggering the fatal error
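
The ordering problem in the steps above can be reproduced in miniature without SwiftNIO. The following is a minimal sketch using only Swift concurrency; `ToyChannel` and `handleToyConnection` are hypothetical stand-ins invented for illustration, not types or functions from the runtime or from NIO. It shows that the `onCancel` closure of `withTaskCancellationHandler` runs at cancellation time, while the operation body is still in flight and has not yet reached its own cleanup.

```swift
import Foundation

// Hypothetical stand-in for a channel; NOT a SwiftNIO type.
// It only records the order in which events happen.
final class ToyChannel: @unchecked Sendable {
    private let lock = NSLock()
    private var events: [String] = []

    func record(_ event: String) {
        lock.lock()
        defer { lock.unlock() }
        events.append(event)
    }

    var log: [String] {
        lock.lock()
        defer { lock.unlock() }
        return events
    }
}

// Same shape as handleConnection: a body that owns cleanup,
// plus an onCancel closure that also touches the channel.
func handleToyConnection(_ channel: ToyChannel) async {
    await withTaskCancellationHandler {
        channel.record("body: started")
        // Simulated in-flight request processing; the sleep ends early on cancellation.
        try? await Task.sleep(nanoseconds: 5_000_000_000)
        channel.record("body: cleanup")
    } onCancel: {
        // Invoked by the cancelling side, concurrently with the body above.
        channel.record("onCancel: channel closed")
    }
}

let channel = ToyChannel()
let task = Task { await handleToyConnection(channel) }

// Give the body time to start, then cancel while it is still suspended.
try? await Task.sleep(nanoseconds: 100_000_000)
task.cancel()
await task.value

print(channel.log)
```

The log always ends up with three entries, but the relative order of "onCancel: channel closed" and "body: cleanup" is not guaranteed, which is the same kind of window the steps above describe.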

Why It's Intermittent

The race window is timing-dependent:

  • Only manifests when cancellation happens while executeThenClose is still processing
  • More likely on fast machines with many cores where concurrent operations overlap
  • Depends on exact timing of task cancellation relative to channel operations

Impact

  • Causes test failures with fatal errors
  • Potential production issue if server shutdown occurs during active request processing
  • Violates the NIOAsyncWriter requirement, enforced by the fatal error above, that finish() be called before the writer is deallocated
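
One direction that might be worth trying, given the analysis above, is to drop the manual close and let executeThenClose be the only code that tears the channel down. The sketch below is untested and is not a confirmed fix. It assumes that iterating the inbound stream ends (by throwing or finishing) when the surrounding task is cancelled, so that the body returns and executeThenClose performs its own cleanup. The request loop body is a placeholder.

```swift
import Logging
import NIOCore
import NIOHTTP1

// Sketch only: the same shape as handleConnection, minus withTaskCancellationHandler.
// Assumption: task cancellation is observed by the inbound stream iteration,
// so no manual channel.channel.close(promise:) is needed to unblock the body.
func handleConnection(
    channel: NIOAsyncChannel<HTTPServerRequestPart, HTTPServerResponsePart>,
    logger: Logger
) async {
    do {
        try await channel.executeThenClose { inbound, outbound in
            for try await part in inbound {
                // Process requests... (placeholder; the real handler writes to `outbound`)
                _ = part
            }
        }
    } catch let error as CancellationError {
        logger.trace("The task was cancelled", metadata: ["error": "\(error)"])
    } catch {
        logger.error("Hit error: \(error)")
    }
}
```

With a single owner for the channel lifecycle there is no second close to race against. Whether cancellation is still handled promptly enough for server shutdown would need to be verified against the NIO version in use.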
