fix(realtime): coalesce response.create across parallel tool calls #3405
Open
adityasingh2400 wants to merge 1 commit into
When the Realtime model emits multiple function_call items in a single response, each completing tool task previously fired its own RealtimeModelSendToolOutput(start_response=True). The two response.create events race the API and the second one is rejected with conversation_already_has_active_response, so the model never speaks for the rest of the turn. Track tool calls per response_id and drive a single response.create from the last completing call (or directly from turn_ended if all tools finish before the response is done). Refs openai#1168
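The collision described in the commit message can be illustrated with a toy model. This is only a sketch: `FakeRealtimeAPI` and `run_tool` are hypothetical names invented for the illustration, and the only rule modeled is "one active response per conversation". It does not reproduce the SDK or the real server.

```python
import asyncio


class FakeRealtimeAPI:
    """Hypothetical stand-in for the server: at most one active response."""

    def __init__(self) -> None:
        self.active = False
        self.started = 0
        self.errors: list = []

    def response_create(self) -> None:
        # A second response.create while one is active is rejected.
        if self.active:
            self.errors.append("conversation_already_has_active_response")
            return
        self.active = True
        self.started += 1


async def run_tool(api: FakeRealtimeAPI, delay: float) -> None:
    # Each tool task finishes on its own schedule and, as in the old
    # behavior, asks for a new response as soon as its output is sent.
    await asyncio.sleep(delay)
    api.response_create()


async def main() -> FakeRealtimeAPI:
    api = FakeRealtimeAPI()
    # Two function_call items from the same model response, run in parallel.
    await asyncio.gather(run_tool(api, 0.01), run_tool(api, 0.02))
    return api


api = asyncio.run(main())
print(api.started, api.errors)
```

In this toy, one request is accepted and the other is rejected, which is the race the change removes by issuing a single response.create per response_id.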
Summary
Fixes the "no voice output" failure reported in #1168 where a Realtime turn that contains more than one function_call item completes without the model ever speaking. The root cause is a race in RealtimeSession: when _async_tool_calls=True (the default since #1984), each tool runs in its own background task and each one finishes with RealtimeModelSendToolOutput(start_response=True). The two response.create events the SDK forwards to the OpenAI Realtime API for the same turn collide, the second one comes back with conversation_already_has_active_response, and because that error carries event_id=None the existing recovery path in openai_realtime.py does not clear it. The user hears nothing until the next user turn, which matches the reporter's trace and the follow-up report on #1912.

This change makes the session coalesce response.create per model response_id. RealtimeModelToolCallEvent now carries the response_id it was emitted under (propagated from response.output_item.added/done), and RealtimeModelTurnEndedEvent carries the response_id from response.done. The session tracks pending tool call ids per response: each tool output is sent with start_response=False while other outputs are still in flight, the last completing tool flips start_response=True once turn_ended has been observed, and if every tool happened to finish before turn_ended, the session sends a single response.create raw message itself. Sessions whose models still omit response_id keep the historical "always start a response" behavior.

Test plan

- make format
- make lint
- make typecheck (no new findings in changed files)
- uv run pytest tests/realtime/ -x (248 passed, including the new TestParallelToolCallCoalescing regression suite)
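
For reviewers, the coalescing rule from the Summary can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the code in this PR: `ResponseCreateCoalescer` and its method names are hypothetical, and only the decision logic (when start_response is true, and when the session itself sends response.create) is modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ResponseCreateCoalescer:
    """Hypothetical helper: decides who triggers the single response.create."""

    # response_id -> tool call ids whose outputs have not been sent yet
    pending: Dict[str, Set[str]] = field(default_factory=dict)
    # response_ids for which turn_ended arrived while tools were still running
    ended: Set[str] = field(default_factory=set)

    def on_tool_call(self, response_id: Optional[str], call_id: str) -> None:
        if response_id is not None:
            self.pending.setdefault(response_id, set()).add(call_id)

    def on_tool_output(self, response_id: Optional[str], call_id: str) -> bool:
        """Return the start_response flag to send with this tool output."""
        if response_id is None or response_id not in self.pending:
            return True  # no response_id: historical "always start" behavior
        remaining = self.pending[response_id]
        remaining.discard(call_id)
        if remaining:
            return False  # other outputs are still in flight
        if response_id in self.ended:
            # Last completing tool after turn_ended: it starts the response.
            self.ended.discard(response_id)
            del self.pending[response_id]
            return True
        return False  # turn has not ended yet; on_turn_ended will fire

    def on_turn_ended(self, response_id: Optional[str]) -> bool:
        """Return True if the session should send response.create itself."""
        if response_id is None or response_id not in self.pending:
            return False  # no tracked tool calls for this response
        if self.pending[response_id]:
            self.ended.add(response_id)
            return False  # the last completing tool will start the response
        del self.pending[response_id]
        return True  # every tool finished before turn_ended


# Case 1: turn_ended arrives first, then the tools finish one by one.
c = ResponseCreateCoalescer()
c.on_tool_call("resp_1", "call_a")
c.on_tool_call("resp_1", "call_b")
case1 = [
    c.on_turn_ended("resp_1"),
    c.on_tool_output("resp_1", "call_a"),
    c.on_tool_output("resp_1", "call_b"),
]

# Case 2: both tools finish before turn_ended.
c.on_tool_call("resp_2", "call_c")
c.on_tool_call("resp_2", "call_d")
case2 = [
    c.on_tool_output("resp_2", "call_c"),
    c.on_tool_output("resp_2", "call_d"),
    c.on_turn_ended("resp_2"),
]
```

In both cases exactly one of the three decisions is true, so exactly one response.create is issued per response_id; a tool output with no response_id always gets start_response=True, matching the fallback described above.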