Skip to content

fix(channels): classify empty-response, 4xx and vision chat errors with actionable copy (#3119)#3199

Merged
graycyrus merged 2 commits into
tinyhumansai:mainfrom
oxoxDev:fix/3119-classify-inference-errors
Jun 2, 2026
Merged

fix(channels): classify empty-response, 4xx and vision chat errors with actionable copy (#3119)#3199
graycyrus merged 2 commits into
tinyhumansai:mainfrom
oxoxDev:fix/3119-classify-inference-errors

Conversation

@oxoxDev
Copy link
Copy Markdown
Contributor

@oxoxDev oxoxDev commented Jun 2, 2026

Summary

  • Chat replies were dead-ending on the generic "Something went wrong. Please try again. This error has been reported." message for a whole family of real, individually-actionable failures.
  • Root cause: classify_inference_error (src/openhuman/channels/providers/web_errors.rs) is a first-match-wins substring chain whose final else returns that generic copy — any error matching no arm falls through with zero guidance.
  • Adds three classifier arms (empty-response, generic provider 4xx, vision-capability) so each cause gets an actionable, correctly-typed, correctly-retryable message instead of the blanket apology.

Problem

Self-hosted Sentry (tauri-rust, error_type:inference, live on the current release and staging) shows the top causes funnelling into the generic catch-all:

Cause Why it fell through
The model returned an empty response (dominant) no empty-response arm — string contains "model" but none of the model_unavailable keywords
Provider 4xx (generic 400 Bad Request, 404, 422) no generic 4xx arm
Vision-capability mismatch (image sent to a text-only model) no capability arm

All collapsed to one dead-end message, so users couldn't tell whether to retry, switch model, drop an attachment, or top up — and support absorbed the difference. Confirmed independently from a teammate report of the same string.

Solution

Three arms added to classify_inference_error, ordering chosen so none shadows (or is shadowed by) an existing arm:

  • empty_response — placed next to max_iterations (both are deterministic agent-state outcomes with a specific anchor). Matches the canonical "model returned an empty response" phrase (AgentError::EmptyProviderResponse, flattened to a String at the native-bus boundary). Retryable; message points at switching model / checking the local provider in Settings → AI → LLM.
  • provider_request_rejected — un-claimed provider 400/404/422. Ordered after the provider-config-rejection and model-unavailable arms so their more specific 4xx verdicts (invalid temperature, stale model pin, model-not-found, DeepSeek reasoning_content) still win. Non-retryable; the real upstream reason is surfaced via the existing secret-scrubbed, length-capped with_provider_detail.
  • capability_unsupported — image markers sent to a text-only model. Non-retryable; points at a vision-capable model.

Two predicates (is_empty_provider_response_text, is_provider_request_rejected_text) mirror the canonical observability/provider checks, matching the file's existing mirror convention (is_non_retryable_rate_limit_text).

This fixes the surface for #3092 / #3119 and most of #3104. The deeper DeepSeek thinking-mode reasoning_content round-trip bug (so those turns actually succeed, not just classify nicely) is tracked separately in #3197.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) — 7 new classify_inference_error_* tests incl. three ordering-locks (invalid-temperature 400, model-not-found 404, DeepSeek reasoning_content 400 all stay on their specific arms, not the new 4xx arm).
  • Diff coverage ≥ 80% — every new branch/arm has a dedicated assertion; changed lines are test-covered.
  • Coverage matrix updated — N/A: behaviour-only change (error-message classification), no feature row added/removed/renamed.
  • All affected feature IDs from the matrix are listed in ## Related — N/A: no matrix feature IDs affected.
  • No new external network dependencies introduced — pure in-process string classification; tests are offline.
  • Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release-cut surface; the classifier is a pure function over an error string.
  • Linked issue closed via Closes #NNN in the ## Related section.

Impact

  • User-visible: chat failures now show actionable, cause-specific copy (retry / switch model / drop attachment / check provider) instead of the blanket apology. No change to the success path or the wire shape — error_type tokens are additive and the frontend keys its recovery UI off retryable/source, not a hardcoded error_type switch.
  • Runtime/security: none. No new code path executes on success; the new arms only re-route already-failing turns. The quoted provider detail reuses the existing sanitize_api_error + 300-char cap, so no new PII/secret surface.
  • Out of scope (documented, not silently capped): "error decoding response body" (transport/parse) and "unknown embedding provider: local" (embeddings misconfig) still route to the generic catch-all — left for a follow-up.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/3119-classify-inference-errors
  • Commit SHAs: 2cb095e6 (classifier arms + tests), cc14c9b3 (review fixes: envelope-scoped 4xx matcher + classify breadcrumb). Rebased onto upstream/main @ cac09cbc.
  • Authored by Claude; committed + GPG-signed under oxoxDev.

Validation Run

  • cargo fmt --all -- --check → clean.
  • Focused tests: cargo test --lib openhuman::channels::providers::web::tests::classify_inference_error29 passed, 0 failed.
  • Rust clippy on changed files → 0 warnings originating from web_errors.rs / web_tests.rs (pre-existing crate-wide clippy debt under blanket -D warnings is unrelated and untouched).
  • N/A: pnpm typecheck — no app/src change.
  • N/A: Tauri fmt/check — no app/src-tauri change (pre-push rust:check still run).

Validation Blocked

  • command: none
  • error: none
  • impact: none — pure deterministic string-classification logic; unit tests are the full proof, no app build/smoke required.

Behavior Changes

  • Intended behavior change: failed chat turns surface a specific, actionable error message + correct retryable/source/error_type instead of the generic apology.
  • User-visible effect: clearer chat errors; no change to successful turns.

Parity Contract

  • Legacy behavior preserved: all existing classify_inference_error_* verdicts unchanged (locked by the three ordering-lock tests + the full pre-existing suite staying green).
  • Guard/fallback/dispatch parity checks: the generic catch-all remains the final fallback; new arms only intercept errors that previously reached it.

Duplicate / Superseded PR Handling

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced error diagnostics with structured logging for better troubleshooting
    • Improved error classification for provider failures with actionable user guidance
    • Added specific handling for empty responses, unsupported capabilities, and request rejections with tailored recovery suggestions
  • Tests

    • Expanded test coverage for error classification scenarios and edge cases

@oxoxDev oxoxDev requested a review from a team June 2, 2026 09:00
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 82811c9e-d8c4-4bde-ad62-db6084ba85ae

📥 Commits

Reviewing files that changed from the base of the PR and between 9bf0611 and cc14c9b.

📒 Files selected for processing (2)
  • src/openhuman/channels/providers/web_errors.rs
  • src/openhuman/channels/providers/web_tests.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/channels/providers/web_tests.rs
  • src/openhuman/channels/providers/web_errors.rs

📝 Walkthrough

Walkthrough

Adds three typed inference error classifications (empty_response, capability_unsupported, provider_request_rejected) to classify_inference_error, introduces two helper predicates, captures and logs classification results, and adds tests including ordering-regression cases to preserve prior model_unavailable behavior.

Changes

Provider Error Classification Expansion

Layer / File(s) Summary
Classification core, logging, and empty-response test
src/openhuman/channels/providers/web_errors.rs, src/openhuman/channels/providers/web_tests.rs
Refactors classifier to assign result to classified and emit structured debug logs. Adds is_empty_provider_response_text and an early empty_response branch returning retryable empty_response with agent_loop source and actionable message. Test validates classification, retryability, source, and message content.
Capability unsupported & provider request rejected
src/openhuman/channels/providers/web_errors.rs, src/openhuman/channels/providers/web_tests.rs
Adds capability_unsupported for vision/capability-mismatch errors (non-retryable, source: "config") and provider_request_rejected for generic provider 4xx-style rejections (non-retryable, source: "provider"). Introduces is_provider_request_rejected_text predicate. Tests cover both classifications and message details.
Provider config rejection ordering locks
src/openhuman/channels/providers/web_tests.rs
Three regression tests ensuring DeepSeek reasoning-mode 400, invalid-temperature 400, and model-not-found 404 remain model_unavailable with Settings → LLM remediation and are not absorbed by the generic provider 4xx arm.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • graycyrus
  • senamakel

Poem

🐰 I sniffed the logs and found a clue,
Empty echoes now wear labels true.
Vision mismatches flagged with care,
Rejections named, no longer spare.
Hop, skip, classify — errors tamed anew.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main changes: adding classification for empty-response, 4xx, and vision chat errors with actionable user-facing messages.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. working A PR that is being worked on by the team. bug labels Jun 2, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/openhuman/channels/providers/web_errors.rs (1)

344-367: ⚡ Quick win

Add structured debug/trace breadcrumbs on the new classification paths.

These new branches change user-facing error routing but currently emit no diagnostics. A lightweight, redacted tracing::debug!/trace! event with stable fields (error_type, source, provider, correlation id) would materially improve triage.

As per coding guidelines "In Rust, default to verbose diagnostics on new/changed flows using log/tracing at debug/trace levels with stable grep-friendly prefixes and correlation fields".

Also applies to: 527-568

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/channels/providers/web_errors.rs` around lines 344 - 367, The
new classification arm that returns a ClassifiedError for
is_empty_provider_response_text lacks any diagnostic breadcrumb; add a concise
tracing::debug! (or trace!) call adjacent to where the ClassifiedError is
constructed (the branch using is_empty_provider_response_text and the
ClassifiedError with error_type "empty_response") emitting stable fields
error_type="empty_response", source="agent_loop", provider=(Option<String>
redacted or None), and the request correlation id (pull from the available
context variable used for correlation), ensuring values are
redacted/stable-friendly and do not include full provider payloads; apply the
same pattern to the other new classification branches around the code that
constructs ClassifiedError (e.g., the arm near lines 527-568) so each
classification emits a debug-level breadcrumb with the same field names.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/channels/providers/web_errors.rs`:
- Around line 344-367: The new classification arm that returns a ClassifiedError
for is_empty_provider_response_text lacks any diagnostic breadcrumb; add a
concise tracing::debug! (or trace!) call adjacent to where the ClassifiedError
is constructed (the branch using is_empty_provider_response_text and the
ClassifiedError with error_type "empty_response") emitting stable fields
error_type="empty_response", source="agent_loop", provider=(Option<String>
redacted or None), and the request correlation id (pull from the available
context variable used for correlation), ensuring values are
redacted/stable-friendly and do not include full provider payloads; apply the
same pattern to the other new classification branches around the code that
constructs ClassifiedError (e.g., the arm near lines 527-568) so each
classification emits a debug-level breadcrumb with the same field names.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bf1df762-e9b7-4493-af18-c333bf0c8d23

📥 Commits

Reviewing files that changed from the base of the PR and between 1e03dd2 and 34bf92d.

📒 Files selected for processing (2)
  • src/openhuman/channels/providers/web_errors.rs
  • src/openhuman/channels/providers/web_tests.rs

Comment thread src/openhuman/channels/providers/web_errors.rs
oxoxDev added a commit to oxoxDev/openhuman that referenced this pull request Jun 2, 2026
…y breadcrumb (tinyhumansai#3199 review)

Address CodeRabbit review on PR tinyhumansai#3199:
- is_provider_request_rejected_text now matches only inside the provider
  error envelope ("<provider> API error (4xx …)") instead of bare 400/404/
  422 substrings, so unrelated errors carrying those digits (token counts,
  byte offsets, timestamps) can't be misclassified as request rejections.
- Add a single redacted log::debug! breadcrumb at the end of
  classify_inference_error covering every arm (error_type/source/retryable/
  provider) per the verbose-diagnostics guideline. Raw error text is left to
  the caller's existing warn-level log to avoid duplicating provider payload.
@oxoxDev
Copy link
Copy Markdown
Contributor Author

oxoxDev commented Jun 2, 2026

Re: the diagnostics nitpick (new classification arms had no breadcrumb) — addressed in 9bf0611. Added a single redacted log::debug! at the end of classify_inference_error covering every arm (error_type / source / retryable / provider) rather than scattering per-arm logs. Raw err is intentionally not logged there — the caller (web.rs::run_chat_task) already records it at warn level + routes it through report_error_or_expected, so this avoids duplicating provider payload.

@oxoxDev oxoxDev requested a review from graycyrus June 2, 2026 09:21
oxoxDev added 2 commits June 2, 2026 16:49
…tionable copy (tinyhumansai#3119)

The chat error classifier dead-ended every unrecognized provider/agent
failure on the generic 'Something went wrong' message, leaving users
unable to self-remediate. Add three arms to classify_inference_error:

- empty_response: the dominant cause (model returns an empty completion;
  AgentError::EmptyProviderResponse flattened to a String). Retryable,
  with a switch-model remedy. Placed next to max_iterations.
- provider_request_rejected: un-claimed provider 4xx (400/404/422), with
  the scrubbed upstream reason quoted. Ordered after the config-rejection
  and model-unavailable arms so their specific verdicts win first.
- capability_unsupported: image sent to a text-only model; non-retryable,
  points at a vision-capable model.

Two predicates mirror the canonical observability/provider checks. Adds
7 tests including three ordering-locks proving the new 4xx arm does not
steal invalid-temperature, model-not-found, or DeepSeek reasoning_content
cases already claimed by earlier arms.
…y breadcrumb (tinyhumansai#3199 review)

Address CodeRabbit review on PR tinyhumansai#3199:
- is_provider_request_rejected_text now matches only inside the provider
  error envelope ("<provider> API error (4xx …)") instead of bare 400/404/
  422 substrings, so unrelated errors carrying those digits (token counts,
  byte offsets, timestamps) can't be misclassified as request rejections.
- Add a single redacted log::debug! breadcrumb at the end of
  classify_inference_error covering every arm (error_type/source/retryable/
  provider) per the verbose-diagnostics guideline. Raw error text is left to
  the caller's existing warn-level log to avoid duplicating provider payload.
@oxoxDev oxoxDev force-pushed the fix/3119-classify-inference-errors branch from 9bf0611 to cc14c9b Compare June 2, 2026 11:20
@graycyrus graycyrus merged commit 89380db into tinyhumansai:main Jun 2, 2026
23 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. bug working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Some users receive "Something went wrong" error message during normal chat interactions

2 participants