Skip to content

fix: keep OpenAI reasoning responses stateful#4557

Merged
balegas merged 2 commits into
mainfrom
fix/openai-responses-store-true
Jun 11, 2026
Merged

fix: keep OpenAI reasoning responses stateful#4557
balegas merged 2 commits into
mainfrom
fix/openai-responses-store-true

Conversation

@KyleAMathews

@KyleAMathews KyleAMathews commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Executive Summary

Built-in OpenAI/OpenAI Codex reasoning model payloads now force store: true so OpenAI Responses continuations can replay reasoning/tool-call state reliably. This prevents follow-up agent steps from failing with missing rs_* reasoning item errors.

Root Cause

OpenAI Responses reasoning/tool-call continuations can include prior rs_* reasoning items in subsequent requests. The upstream Responses payload defaults to store: false, which means OpenAI does not persist those items server-side. When a later request references one of those non-persisted item ids, OpenAI can return:

Item with id ... not found

Approach

When applying built-in provider payload defaults for OpenAI reasoning models, keep the existing reasoning-effort normalization and also force the Responses payload to be stateful:

return {
  ...body,
  store: true,
  reasoning: {
    ...existingReasoning,
    effort,
  },
}

The test coverage verifies that:

  • OpenAI reasoning payloads get store: true.
  • An incoming store: false is intentionally overridden.
  • Non-OpenAI providers, such as DeepSeek, do not receive this payload override.

Key Invariants

  • Only built-in OpenAI/OpenAI Codex reasoning model payload defaults should force store: true.
  • Existing payload fields should be preserved unless intentionally overridden.
  • Existing reasoning effort normalization should continue to apply.

Non-goals

  • This does not change non-OpenAI provider behavior.
  • This does not implement a stateless/ZDR-compatible encrypted reasoning replay path.
  • This does not patch @mariozechner/pi-ai; it applies the built-in agent payload default at our integration layer.

Trade-offs

The alternative is to keep store: false and preserve/replay full reasoning.encrypted_content for stateless operation. That is more complex and unnecessary for the built-in agent default path. Using store: true matches OpenAI's stateful Responses flow and directly fixes the observed continuation failure.

Verification

GITHUB_BASE_REF=main node scripts/check-changeset.mjs
pnpm --filter @electric-ax/agents exec vitest run test/model-catalog.test.ts
pnpm --filter @electric-ax/agents typecheck

Also attempted the prep command's suggested Vitest thread flag:

pnpm --filter @electric-ax/agents exec vitest run test/model-catalog.test.ts --pool-options.threads.maxThreads=2

That failed because this Vitest version rejects the flag as an unknown option (--poolOptions). The same targeted test passes without that unsupported flag.

Files Changed

  • .changeset/openai-reasoning-store-true.md — patch changeset for @electric-ax/agents.
  • packages/agents/src/model-catalog.ts — forces store: true for built-in OpenAI reasoning payload defaults.
  • packages/agents/test/model-catalog.test.ts — updates expectations and adds regression coverage for OpenAI-only store: true behavior.

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Electric Agents Desktop Builds

Build artifacts for commit b0bf1af.

Platform Status Artifact
macOS Apple Silicon Passed DMG
macOS Intel Passed DMG
Windows x64 Passed Installer
Linux x64 Passed AppImage / deb

Workflow run

@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.13%. Comparing base (671a38f) to head (b0bf1af).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4557      +/-   ##
==========================================
- Coverage   74.35%   70.13%   -4.22%     
==========================================
  Files          52       79      +27     
  Lines        6897     9427    +2530     
  Branches     2190     2958     +768     
==========================================
+ Hits         5128     6612    +1484     
- Misses       1753     2798    +1045     
- Partials       16       17       +1     
Flag Coverage Δ
packages/agents 70.53% <ø> (?)
packages/agents-server 74.10% <ø> (-0.25%) ⬇️
packages/electric-ax 46.42% <ø> (?)
typescript 70.13% <ø> (-4.22%) ⬇️
unit-tests 70.13% <ø> (-4.22%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@KyleAMathews KyleAMathews force-pushed the fix/openai-responses-store-true branch from 903470f to 34bd59d Compare June 10, 2026 20:47
@KyleAMathews KyleAMathews force-pushed the fix/openai-responses-store-true branch from 1fa914b to b0bf1af Compare June 10, 2026 20:54
@balegas balegas merged commit 1219cb4 into main Jun 11, 2026
30 checks passed
@balegas balegas deleted the fix/openai-responses-store-true branch June 11, 2026 11:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants