Skip to content

fix: clear stale turn state and unblock stuck composer sends#1715

Open
justsomelegs wants to merge 4 commits intopingdotgg:mainfrom
justsomelegs:fix-1048-stuck-waiting-composer
Open

fix: clear stale turn state and unblock stuck composer sends#1715
justsomelegs wants to merge 4 commits intopingdotgg:mainfrom
justsomelegs:fix-1048-stuck-waiting-composer

Conversation

@justsomelegs
Copy link
Copy Markdown

@justsomelegs justsomelegs commented Apr 3, 2026

What Changed

Fixes #1048, #593.

This PR fixes the stuck waiting/Sending state in two small steps:

  1. It fixes stale orchestration session state on the server.
    When provider runtime lifecycle events transition a session back out of running, we now clear any stale activeTurnId instead of carrying it forward. The lifecycle guard also only treats activeTurnId as authoritative while the session is actually running.

  2. It makes the web composer acknowledge sends by committed orchestration sequence instead of inferring acknowledgement from thread/session timestamps.
    The local send latch now clears when the client has applied the dispatchCommand(...).sequence for the send, and snapshot sync keeps that applied-sequence value monotonic.

There is also a small reconnect safeguard:

  • the welcome payload now includes a per-process serverInstanceId
  • if the web app receives a later welcome from a different server instance, it reloads to avoid carrying stale in-memory orchestration cursor state across a server restart

Why

The issue had two contributing pieces:

  • On the server, runtime lifecycle state could end up inconsistent: a session could return to ready while still retaining a stale activeTurnId. Later turn lifecycle events could then be treated as conflicting/stale.
  • On the web side, the composer’s local “send in progress” latch relied on inferred thread/session field changes. If the thread advanced without those exact fields changing in the expected shape, the UI could stay stuck in Sending and block another message until refresh or thread switch.

This approach keeps the fix focused:

  • server side: restore a coherent session invariant
  • client side: acknowledge sends using the explicit committed orchestration sequence
  • reconnect safety: reload if the server instance changes underneath the same tab

That is more robust than the old timestamp-based inference, while still keeping the PR scoped to a bug fix.

Checklist

  • This PR is small and focused
  • I explained what changed and why
  • I included before/after screenshots for any UI changes
  • I included a video for animation/interaction changes

For the last two items: there are no meaningful UI visuals to screenshot here. This is a state-management/reliability fix rather than a presentational UI change.

Note

Fix stuck composer by unblocking sends via orchestration sequence acknowledgment

  • Replaces turn/session field comparison in hasServerAcknowledgedLocalDispatch with an orchestration sequence check: the composer unblocks when latestAppliedOrchestrationSequence meets or exceeds the ackSequence returned from the server after dispatch.
  • Clears stale activeTurnId in ProviderRuntimeIngestion whenever the session transitions to a non-running state (ready, error, stopped), not just on turn.completed or session.exited.
  • Adds serverInstanceId (random UUID) to welcome payloads and triggers a full page reload in the client when a welcome arrives from a different server instance after initial connection.
  • Tracks latestAppliedOrchestrationSequence in AppState, updated on every applied orchestration event and read model sync.
  • Behavioral Change: composer busy state now gates on sequence acknowledgment rather than turn/session diffs, so sends unblock even without a clean running lifecycle.

Macroscope summarized 927191a.

@github-actions github-actions bot added size:L 100-499 changed lines (additions + deletions). vouch:unvouched PR author is not yet trusted in the VOUCHED list. labels Apr 3, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 3, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d3f6d693-f2bd-4dcf-938a-fc7e26f1b429

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@macroscopeapp
Copy link
Copy Markdown
Contributor

macroscopeapp bot commented Apr 3, 2026

Approvability

Verdict: Needs human review

This bug fix changes fundamental client-server state synchronization patterns, replacing timestamp-based acknowledgment with sequence-based tracking, and adds server restart detection that triggers page reloads. While well-tested, the scope of runtime behavior changes to core state management warrants human review.

You can customize Macroscope's approvability policy. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L 100-499 changed lines (additions + deletions). vouch:unvouched PR author is not yet trusted in the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Threads get stuck on "waiting for 0s"

1 participant