feat: structured error handling in Responses API streaming#4942
Open

iamemilio wants to merge 1 commit into llamastack:main
Conversation
This pull request has merge conflicts that must be resolved before it can be merged. @iamemilio please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
- Add `_VALID_RESPONSE_ERROR_CODES` allowlist and validate error codes in `extract_openai_error()`, falling back to `server_error` for unmapped codes
- Map provider Chat Completions error codes to Responses API codes (e.g. `invalid_base64` -> `invalid_base64_image`)
- Use `server_error` instead of `invalid_request_error` for unsupported truncation mode
- Enhance `StreamingValidator` to assert error codes are spec-compliant
- Add integration tests with gpt and ollama recordings for streaming failures (truncation=auto, invalid base64 image)
- Add unit tests for error code extraction, mapping, and validation

Co-authored-by: Cursor <cursoragent@cursor.com>
What this does
When a streaming Responses API call fails mid-stream (e.g., the provider rejects an image or hits a rate limit), Llama Stack now returns spec-compliant error codes in the `response.failed` event instead of generic ones.

Before this PR
All streaming errors produced one of two hard-coded codes:

- `internal_error` with `str(exception)` as the message, which leaked Python tracebacks and internal details to the client
- `invalid_request_error` for unsupported truncation, a code that doesn't exist in the Responses API spec

This meant OpenAI-compatible clients couldn't programmatically distinguish between different failure modes (bad image, rate limit, server issue), and raw exception strings could leak implementation details.
After this PR
- When the provider error carries a recognizable code (e.g. `invalid_base64`), we extract it and map it to the correct Responses API code (`invalid_base64_image`). Only codes defined in the spec are emitted; anything unrecognized falls back to `server_error`.
- Unrecognized errors produce `server_error` with a safe message instead of `str(exc)`.
- Unsupported truncation now reports `server_error` (which is an actual spec code) instead of `invalid_request_error`.

User-facing impact
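A minimal sketch of the allowlist-and-map approach described above. The names `_VALID_RESPONSE_ERROR_CODES` and `extract_openai_error()` come from this PR, but the exact contents of the sets and the mapping shown here are illustrative, not the PR's actual tables:

```python
# Sketch of spec-compliant error-code extraction. The allowlist and
# mapping contents below are illustrative examples, not the PR's
# actual tables.
_VALID_RESPONSE_ERROR_CODES = {
    "server_error",
    "rate_limit_exceeded",
    "invalid_base64_image",
    "invalid_prompt",
}

# Provider Chat Completions error codes -> Responses API error codes
_CHAT_TO_RESPONSES_CODE = {
    "invalid_base64": "invalid_base64_image",
}

def extract_openai_error(exc: Exception) -> tuple[str, str]:
    """Return a (code, message) pair safe to emit in a response.failed event."""
    code = getattr(exc, "code", None)
    code = _CHAT_TO_RESPONSES_CODE.get(code, code)
    if code not in _VALID_RESPONSE_ERROR_CODES:
        # Unrecognized codes fall back to server_error with a safe
        # message instead of leaking str(exc) to the client.
        return "server_error", "An internal error occurred."
    return code, getattr(exc, "message", "Request failed.")
```

The key design point is that the allowlist check runs *after* the mapping step, so both unmapped provider codes and entirely unknown codes collapse to `server_error`.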
Clients using the OpenAI SDK or any spec-compliant streaming consumer will now receive meaningful, actionable error codes in `response.failed` events, e.g. `invalid_base64_image` instead of `internal_error`. This lets applications handle different failure modes appropriately (retry on `server_error`, show a user message on `invalid_base64_image`, back off on `rate_limit_exceeded`) without parsing error message strings.

Depends on
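The client-side handling described under user-facing impact can be sketched as a dispatch on the error code. The event attribute path follows the Responses API spec's `response.failed` shape, but `handle_failed` itself is a hypothetical helper, not part of any SDK:

```python
# Illustrative client-side dispatch on spec-compliant error codes from
# a response.failed event. handle_failed is a hypothetical helper; the
# event.response.error.code path assumes the Responses API event shape.
import time

def handle_failed(event) -> str:
    """Decide what to do based on the error code, not the message string."""
    code = event.response.error.code
    if code == "server_error":
        return "retry"              # transient; safe to retry
    if code == "invalid_base64_image":
        return "fix-input"          # surface a message to the user
    if code == "rate_limit_exceeded":
        time.sleep(1.0)             # back off before the next attempt
        return "backoff"
    return "fail"                   # unknown code: give up
```

Before this PR, a client like this would have seen `internal_error` for every failure mode and been forced to parse message strings instead.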
Test plan
- Unit tests for `extract_openai_error()`: unknown codes fall back to `server_error`, valid codes pass through, all spec codes are recognized
- Integration test with `truncation="auto"`: validates the `response.failed` event has a valid error code
- Integration test with an invalid base64 image: validates the provider's `BadRequestError` is mapped to a spec-compliant code in the `response.failed` event
- `StreamingValidator` enhancement: all integration tests now assert that `response.failed` error codes are within the spec-defined set
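The `StreamingValidator` check described in the last bullet could look roughly like this. This is a sketch under assumptions: the function name and the dict-shaped event are illustrative, not Llama Stack's actual test code:

```python
# Sketch of a spec-compliance assertion on response.failed events.
# The function name and event shape are assumptions for illustration,
# not Llama Stack's actual StreamingValidator implementation.
_VALID_RESPONSE_ERROR_CODES = {
    "server_error",
    "rate_limit_exceeded",
    "invalid_base64_image",
    "invalid_prompt",
}

def assert_error_code_is_spec_compliant(event: dict) -> None:
    """Fail loudly if a response.failed event carries a non-spec error code."""
    if event.get("type") != "response.failed":
        return  # only failed events carry an error code to validate
    code = event["response"]["error"]["code"]
    assert code in _VALID_RESPONSE_ERROR_CODES, (
        f"response.failed emitted non-spec error code: {code!r}"
    )
```

Running this over every event in every streaming integration test is what turns the allowlist from a convention into an enforced invariant.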