Skip to content

fix: drop orphaned tool results on session resume#3001

Merged
dgageot merged 1 commit into
mainfrom
fix/session-resume-orphaned-tool-result
Jun 5, 2026
Merged

fix: drop orphaned tool results on session resume#3001
dgageot merged 1 commit into
mainfrom
fix/session-resume-orphaned-tool-result

Conversation

@Sayt-0
Copy link
Copy Markdown
Member

@Sayt-0 Sayt-0 commented Jun 4, 2026

Problem

Resuming a session via --session / /sessions immediately fails on AWS Bedrock with ValidationException: The number of toolResult blocks ... exceeds the number of toolUse blocks of previous turn.

Root cause

When a session is compacted, the kept-tail boundary (FirstKeptEntry) can land between an assistant tool_use and its tool_result. On resume, buildSessionSummaryMessages emits the conversation from that boundary, so the reconstructed history starts with an orphaned tool_result (no matching tool_use). sanitizeToolCalls — the last guard before messages reach the provider — only injected synthetic results for dangling tool calls and always forwarded tool messages, so orphans passed straight through. OpenAI-compatible providers (litellm) forward the invalid request; Bedrock rejects it.

Fix

sanitizeToolCalls now also drops tool results whose ToolCallID has no matching tool_use in the preceding assistant message, keeping the toolUse/toolResult invariant balanced in both directions.

Tests

  • TestSanitizeToolCalls_DropsOrphanedToolResult — unit guard.
  • TestGetMessages_ResumeAfterCompaction_NoOrphanedToolResult — full GetMessages resume path with a summary boundary splitting a tool pair.
  • TestSanitizeToolCalls_KeepsMissingResultBalanced — pins existing dangling-tool-call behaviour.

Both new tests fail without the fix and pass with it.

Fixes #1676

Compaction's kept-tail boundary (FirstKeptEntry) could fall between an assistant tool_use and its tool_result. On resume, the reconstructed history then started with an orphaned tool_result whose tool_use had been folded into the summary. sanitizeToolCalls only injected synthetic results for dangling tool_use blocks; it never dropped orphaned tool_results, so the invalid history reached the provider and AWS Bedrock rejected it (ValidationException: toolResult blocks exceed toolUse blocks).

sanitizeToolCalls now also drops tool results with no matching tool_use in the preceding assistant message, keeping the invariant balanced in both directions.

Fixes #1676
@Sayt-0 Sayt-0 requested a review from a team as a code owner June 4, 2026 08:50
Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

The fix correctly addresses the Bedrock ValidationException by dropping orphaned tool results in sanitizeToolCalls. The logic is sound: pendingIDs is populated when an assistant message with tool calls is processed and reset on flushPending(), so each turn's scope is correctly isolated. The three new tests provide good coverage of the primary fix path and the existing dangling-call direction.

One minor behavioral change is noted inline.

Comment thread pkg/session/session.go
// Drop orphaned tool results: a tool_result with no matching
// tool_use in the preceding assistant message violates the
// provider contract.
if msg.ToolCallID == "" || !pendingIDs[msg.ToolCallID] {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Behavioral change: tool results with empty ToolCallID are now silently dropped

Under the old logic the condition was if msg.ToolCallID != "" which only gated the resultIDs book-keeping — a tool-result with an empty ID still fell through to out = append(out, msg) and was forwarded to the provider. Under the new condition:

if msg.ToolCallID == "" || !pendingIDs[msg.ToolCallID] {
    continue   // drops the message
}

an empty-ID tool result is now silently dropped. In practice this is likely fine — a tool result with no ID cannot be matched to any tool_use and most providers mandate non-empty IDs — but it is a silent behavioral regression from the previous behavior. If any in-flight session persisted with empty-ID tool results they would be discarded on the next resume.

Consider documenting the intentional drop (e.g., // empty ToolCallID is malformed; drop it) so the behavior is explicit rather than a side-effect of the orphan check.

@aheritier aheritier added area/providers/bedrock AWS Bedrock provider support area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/fix PR fixes a bug (maps to fix: commit prefix) labels Jun 4, 2026
@dgageot dgageot merged commit 03386b3 into main Jun 5, 2026
13 checks passed
@Sayt-0 Sayt-0 deleted the fix/session-resume-orphaned-tool-result branch June 5, 2026 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/providers/bedrock AWS Bedrock provider support area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/fix PR fixes a bug (maps to fix: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Session resume via /sessions immediately fails with Bedrock ValidationException (toolResult/toolUse mismatch)

4 participants