Reconnect REPLAY actions are not fanned out via host events() #125

@colbylwilliams

Both the Rust ahp::hosts (#121) and Swift MultiHostClient (#124) multi-host runtimes have a gap on reconnect: when the server's reconnect response uses the replay arm (returning the missed ActionEnvelope[]), the replayed envelopes are not fanned out through the per-host events() stream.

Reproduction

  1. Subscribe to multi.events() (the call has the same name in the Rust and Swift runtimes).
  2. Add a host. Observe live actions stream through the events tap with proper host_id / resource tags.
  3. Force a transport drop (close the socket on the server side).
  4. While disconnected, the server appends actions to its log.
  5. Allow the runtime to reconnect. The server returns a reconnect result with type: 'replay' and an actions: ActionEnvelope[] carrying the missed envelopes.
  6. The runtime applies the replay to its internal cursor (serverSeq is bumped) but does not push the envelopes through the same fan-out used for live actions.
  7. Any consumer relying on events() to drive UI state, derive aggregated views, or feed reducers silently misses every action that happened during the disconnect window.

Why it bites

  • aggregatedSessions() / aggregatedAgents() only reflect what the per-host root-state mirror and session-summary cache see. Root actions in the replay (e.g. root/agentsChanged, root/activeSessionsChanged) update the mirror only when the runtime's internal event handler, handle_event, passes them to the reducer (apply_action_to_root). If the replayed envelopes never reach handle_event, the reducer is never called, and the aggregates stay stale until the next live action triggers a re-read.
  • Per-resource consumers (anyone calling client.subscribe(uri) or client.attachSubscription(uri)) miss replayed actions for their URI even more visibly: the replay arm is the one path designed to deliver them, and it's currently swallowed.

Fix sketch

In each runtime's connect_once / equivalent, after the reconnect handshake succeeds with type: 'replay':

  1. For each replayed ActionEnvelope, route it through the same code path that handle_event uses for live envelopes — both the per-host state mirror update and the fan-out to events() / per-URI subscriptions.
  2. Make sure ordering is preserved (replayed envelopes must be delivered in serverSeq order, before any live envelope arriving on the new connection).
  3. The snapshot arm (when the server gives up on replay and returns a fresh snapshot) also needs to deliver any per-URI snapshots through the events stream so consumers can reset their reducers.
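Steps 1 and 2 can be sketched as below. This is a minimal model, not the actual runtime code: the `ActionEnvelope`, `HostEvent`, `ReconnectResult`, and `HostRuntime` shapes are hypothetical stand-ins, the "mirror" is reduced to a counter, and the fan-out is a plain queue. Sorting the replay by `server_seq` is a defensive assumption in case the server does not already return the envelopes in order. The point is only that replayed envelopes go through the same `handle_event` path as live ones, and are fully drained before any live envelope on the new connection is handled.

```rust
use std::collections::VecDeque;

/// Hypothetical, trimmed-down envelope; the real ActionEnvelope carries more.
#[derive(Debug, Clone, PartialEq)]
pub struct ActionEnvelope {
    pub server_seq: u64,
    pub resource: String,
    pub action: String,
}

/// Hypothetical event shape, standing in for what `events()` yields.
#[derive(Debug, Clone, PartialEq)]
pub struct HostEvent {
    pub host_id: String,
    pub envelope: ActionEnvelope,
}

/// Hypothetical model of the two reconnect arms.
pub enum ReconnectResult {
    Replay { actions: Vec<ActionEnvelope> },
    Snapshot, // snapshot delivery (step 3) is not modelled here
}

pub struct HostRuntime {
    host_id: String,
    server_seq: u64,
    root_actions_applied: usize,  // stand-in for the root-state mirror
    events: VecDeque<HostEvent>,  // stand-in for the events() fan-out
}

impl HostRuntime {
    pub fn new(host_id: &str) -> Self {
        Self {
            host_id: host_id.to_string(),
            server_seq: 0,
            root_actions_applied: 0,
            events: VecDeque::new(),
        }
    }

    /// Single path for live AND replayed envelopes: cursor, mirror, fan-out.
    pub fn handle_event(&mut self, envelope: ActionEnvelope) {
        self.server_seq = self.server_seq.max(envelope.server_seq);
        if envelope.resource == "root" {
            self.root_actions_applied += 1; // the reducer call would go here
        }
        self.events.push_back(HostEvent {
            host_id: self.host_id.clone(),
            envelope,
        });
    }

    /// Called after the reconnect handshake, before reading live traffic.
    pub fn apply_reconnect(&mut self, result: ReconnectResult) {
        match result {
            ReconnectResult::Replay { mut actions } => {
                // Assumption: sort defensively so delivery is in serverSeq order.
                actions.sort_by_key(|a| a.server_seq);
                for envelope in actions {
                    self.handle_event(envelope);
                }
            }
            ReconnectResult::Snapshot => {}
        }
    }

    pub fn server_seq(&self) -> u64 {
        self.server_seq
    }

    pub fn root_actions_applied(&self) -> usize {
        self.root_actions_applied
    }

    pub fn drain_events(&mut self) -> Vec<HostEvent> {
        self.events.drain(..).collect()
    }
}
```

Because `apply_reconnect` runs synchronously to completion in this sketch, ordering falls out for free. A real async runtime would presumably need to hold back the new connection's live envelopes until the replay has been fanned out.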

Scope

  • Rust: clients/rust/crates/ahp/src/hosts/runtime.rs — the branch in connect_once that calls client.reconnect(...).
  • Swift: clients/swift/AgentHostProtocol/Sources/AgentHostProtocolClient/Hosts/HostRuntime.swift — the equivalent branch in the supervisor.

Both SDKs should grow the same regression test: an in-memory server that drops the connection mid-flight, answers the reconnect with the replay arm and a non-empty actions array, and a consumer that asserts its events() stream sees those envelopes (in order, with the right host_id / resource tags) before any subsequent live envelope.
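The shape of that regression test might look roughly like the following. Everything here is a hypothetical, synchronous stand-in (`FakeServer`, `FakeRuntime`, `TaggedEvent`, the `"session:1"` resource name); the real test would drive the actual runtime and an async in-memory transport. It only illustrates the scenario and the assertions: one live envelope, two appended while disconnected, a replay on reconnect, then one more live envelope.

```rust
/// Hypothetical, trimmed-down envelope used only in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope {
    pub server_seq: u64,
    pub resource: String,
}

/// Hypothetical tagged event, standing in for what `events()` yields.
#[derive(Debug, Clone, PartialEq)]
pub struct TaggedEvent {
    pub host_id: String,
    pub resource: String,
    pub server_seq: u64,
}

/// In-memory server: an append-only action log plus a connection flag.
#[derive(Default)]
pub struct FakeServer {
    log: Vec<Envelope>,
    connected: bool,
}

impl FakeServer {
    pub fn connect(&mut self) {
        self.connected = true;
    }

    pub fn drop_connection(&mut self) {
        self.connected = false;
    }

    /// Appends to the log; yields the envelope live only while connected.
    pub fn append(&mut self, resource: &str) -> Option<Envelope> {
        let envelope = Envelope {
            server_seq: self.log.len() as u64 + 1,
            resource: resource.to_string(),
        };
        self.log.push(envelope.clone());
        if self.connected { Some(envelope) } else { None }
    }

    /// Replay arm: every envelope after `last_seen`, in log order.
    pub fn reconnect(&mut self, last_seen: u64) -> Vec<Envelope> {
        self.connected = true;
        self.log
            .iter()
            .filter(|e| e.server_seq > last_seen)
            .cloned()
            .collect()
    }
}

pub struct FakeRuntime {
    host_id: String,
    last_seen: u64,
    pub events: Vec<TaggedEvent>,
}

impl FakeRuntime {
    pub fn new(host_id: &str) -> Self {
        Self { host_id: host_id.to_string(), last_seen: 0, events: Vec::new() }
    }

    /// Shared delivery path for live and replayed envelopes.
    pub fn deliver(&mut self, envelope: Envelope) {
        self.last_seen = envelope.server_seq;
        self.events.push(TaggedEvent {
            host_id: self.host_id.clone(),
            resource: envelope.resource,
            server_seq: envelope.server_seq,
        });
    }

    /// Reconnects and fans the replay out before returning.
    pub fn reconnect(&mut self, server: &mut FakeServer) {
        for envelope in server.reconnect(self.last_seen) {
            self.deliver(envelope);
        }
    }
}

/// Runs the drop / replay / live scenario and returns what the consumer saw.
pub fn run_reconnect_scenario() -> Vec<TaggedEvent> {
    let mut server = FakeServer::default();
    let mut runtime = FakeRuntime::new("host-a");

    server.connect();
    if let Some(e) = server.append("root") {
        runtime.deliver(e); // seq 1, live
    }

    server.drop_connection();
    let _ = server.append("root");      // seq 2, missed while disconnected
    let _ = server.append("session:1"); // seq 3, missed while disconnected

    runtime.reconnect(&mut server);     // replay carries seq 2 and 3
    if let Some(e) = server.append("root") {
        runtime.deliver(e); // seq 4, live on the new connection
    }
    runtime.events
}
```

The test body then checks that the observed sequence numbers are contiguous and ordered, and that every event carries the expected host and resource tags. With the current behaviour described in this issue, the equivalent test against the real runtimes would be expected to see only the live envelopes.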

Discovered during

Swift MultiHostClient review (#124).
