fix(realtime): coalesce response.create across parallel tool calls by adityasingh2400 · Pull Request #3405 · openai/openai-agents-python

adityasingh2400 · 2026-05-14T03:48:03Z

Summary

Fixes the "no voice output" failure reported in #1168, where a Realtime turn that contains more than one function_call item completes without the model ever speaking. The root cause is a race in RealtimeSession: when _async_tool_calls=True (the default since #1984), each tool runs in its own background task, and each task finishes by sending RealtimeModelSendToolOutput(start_response=True). The SDK therefore forwards two response.create events to the OpenAI Realtime API for the same turn, and they collide: the second one comes back with conversation_already_has_active_response, and because that error carries event_id=None, the existing recovery path in openai_realtime.py does not clear it. The user hears nothing until the next user turn, which matches the reporter's trace and the follow-up report on #1912.
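To make the collision concrete, here is a minimal, self-contained simulation. It does not use the SDK: FakeRealtimeServer, run_tool, and simulate are hypothetical names, and the "one active response at a time" rule is a simplified assumption about the server. It models only the rejected second response.create, not the recovery path in openai_realtime.py.

```python
import asyncio


class FakeRealtimeServer:
    """Hypothetical stand-in for the API: allows one active response at a time."""

    def __init__(self) -> None:
        self.active = False
        self.accepted = 0
        self.errors: list = []

    def response_create(self) -> None:
        if self.active:
            # Simplified version of the rejection described above.
            self.errors.append("conversation_already_has_active_response")
            return
        self.active = True
        self.accepted += 1


async def run_tool(server: FakeRealtimeServer, delay: float) -> None:
    # Each background tool task finishes by asking for a new response,
    # mirroring start_response=True on every tool output.
    await asyncio.sleep(delay)
    server.response_create()


async def simulate() -> FakeRealtimeServer:
    server = FakeRealtimeServer()
    await asyncio.gather(run_tool(server, 0.01), run_tool(server, 0.02))
    return server


server = asyncio.run(simulate())
print(server.accepted, server.errors)
```

In this sketch the first request is accepted and the second is rejected, which is the collision the PR removes by sending only one response.create per response.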

This change makes the session coalesce response.create per model response_id. RealtimeModelToolCallEvent now carries the response_id it was emitted under (propagated from response.output_item.added/done), and RealtimeModelTurnEndedEvent carries the response_id from response.done. The session tracks pending tool call ids per response and handles three cases. While other outputs for the same response are still in flight, each tool output is sent with start_response=False. Once turn_ended has been observed, the last completing tool sends its output with start_response=True. If every tool finishes before turn_ended arrives, the session itself sends a single raw response.create message. Sessions whose models still omit response_id keep the historical "always start a response" behavior.
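The bookkeeping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: ResponseToolTracker and its method names are hypothetical, and the real session wires this logic into its event handlers rather than into a standalone class.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ResponseToolTracker:
    """Hypothetical helper: decides who triggers the single response.create."""

    pending: Dict[str, Set[str]] = field(default_factory=dict)
    ended: Set[str] = field(default_factory=set)

    def on_tool_call(self, response_id: Optional[str], call_id: str) -> None:
        # Register a tool call under the response that emitted it.
        if response_id is not None:
            self.pending.setdefault(response_id, set()).add(call_id)

    def on_tool_done(self, response_id: Optional[str], call_id: str) -> bool:
        """Return the start_response flag to use for this tool output."""
        if response_id is None:
            return True  # model omits response_id: keep historical behavior
        calls = self.pending.get(response_id, set())
        calls.discard(call_id)
        if calls:
            return False  # other outputs for this response are still in flight
        if response_id in self.ended:
            self._forget(response_id)
            return True  # last tool to finish, and the turn has already ended
        return False  # all tools done early: wait for turn_ended

    def on_turn_ended(self, response_id: Optional[str]) -> bool:
        """Return True if the session should send response.create itself."""
        if response_id is None or response_id not in self.pending:
            return False  # nothing tracked for this response
        if self.pending[response_id]:
            self.ended.add(response_id)  # remaining tools will trigger it
            return False
        self._forget(response_id)
        return True  # every tool finished before the response was done

    def _forget(self, response_id: str) -> None:
        self.pending.pop(response_id, None)
        self.ended.discard(response_id)
```

For two parallel calls under one response, exactly one of the three return values comes back True, whichever order the tools and turn_ended arrive in, so only one response.create is issued for that response.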

Test plan

make format
make lint
make typecheck (no new findings in changed files)
uv run pytest tests/realtime/ -x (248 passed, including the new TestParallelToolCallCoalescing regression suite)

When the Realtime model emits multiple function_call items in a single response, each completing tool task previously fired its own RealtimeModelSendToolOutput(start_response=True). The two response.create events race the API and the second one is rejected with conversation_already_has_active_response, so the model never speaks for the rest of the turn.

Track tool calls per response_id and drive a single response.create from the last completing call (or directly from turn_ended if all tools finish before the response is done).

Refs openai#1168

seratch added the feature:realtime label May 14, 2026

adityasingh2400 wants to merge 1 commit into openai:main from adityasingh2400:fix-realtime-parallel-tool-response
