fix(inference): surface actionable error when Managed route fails with no credits#3121
Conversation
…h no credits (tinyhumansai#3088) The OpenHuman managed backend signals no-credits via a 400 carrying "Insufficient budget" / "Insufficient balance" (billing_error.rs canonical phrases). The web-channel budget detector (is_inference_budget_exceeded_error) only matched the regex set ("budget.*exceed", "top up", "add.*credits", "out of credits") — it missed those phrases, so the error fell through to the generic "Something went wrong" branch. Users with Ollama enabled but routing still on Managed saw an opaque provider error and entered an infinite "Thinking…" state with no way to self-diagnose. Changes: - Widen is_inference_budget_exceeded_error to also delegate to billing_error::is_budget_exhausted_message (the canonical detector) so the 400 "Insufficient budget" path is caught. - Extend classify_inference_error's budget_exhausted branch to call is_inference_budget_exceeded_error as a fallback, catching any budget-signal that doesn't carry a 402 status. - Rewrite the user-facing budget-exceeded copy to guide the user: top up credits OR switch routing to "Use your own model" in Settings. - Add tests for the new detection paths and the updated copy.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughBudget-exhaustion detection now falls back to the canonical provider detector to catch managed-backend “Insufficient budget/balance” cases; classifier labels these as ChangesBudget-exhaustion detection and messaging for managed backend
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
fix The coverage E2E test asserted that "inference budget exceeded: monthly limit reached" classified as generic `inference`. After the tinyhumansai#3088 fix widened budget detection, this string correctly classifies as `budget_exhausted` with source `openhuman_billing`. Update the assertion to match the new (correct) behavior.
oxoxDev
left a comment
There was a problem hiding this comment.
LGTM. Tightly-scoped fix: routes the managed-backend 400 "insufficient budget" no-credits failure into the existing budget_exhausted classifier branch with actionable, non-retryable copy (correct — a top-up won't clear on retry). Secret-leak surface is safe — the message reuses with_provider_detail→extract_provider_error_detail→sanitize_api_error+truncate, adds no new raw-err interpolation (same protection class as #3033). Branch ordering is correct: the new is_inference_budget_exceeded_error OR only widens the budget branch and sits below the action-budget/max-iter/429/auth/timeout branches, so none are stolen. Success path untouched.
One wording nit inline. The red CI (Core/Frontend Coverage, E2E lanes 1&4) are coverage-infra + known Playwright flakes — all unit tests pass in the logs.
sanil-23
left a comment
There was a problem hiding this comment.
@senamakel-droid the code here looks solid. The fix is well-scoped: widening is_inference_budget_exceeded_error to delegate to the canonical billing_error::is_budget_exhausted_message, plus adding it as a fallback in the budget_exhausted branch of classify_inference_error, is the right way to catch the managed 400 "Insufficient budget" path that was slipping through to the generic error. Reusing the one canonical detector instead of duplicating phrase lists is the correct call, and the new tests cover both the detection paths and the rewritten copy. The user-facing message is a clear improvement for #3088 — it gives both remediation paths without auto-switching.
The blocker is CI, but it looks unrelated to your change:
- Rust Core Coverage fails on
http_models_and_chat_use_mocked_ollama_without_real_runtime(round23 e2e, line 124) — that's an Ollama/modelsdiscovery assertion against a mocked HTTP server, nothing this PR touches. Reads as flaky. - Frontend Coverage (Vitest) failing on a Rust-only diff with no frontend files changed.
- E2E lanes 1 & 4 fail while lanes 2 & 3 pass — typical flake pattern.
A re-run should clear these. Once CI is fully green I'll come back and approve — ping me if a re-run doesn't sort it and I'll dig in with you.
…nabled-but-routing-on-managed-pr
Update the user-facing message to reference "Use Your Own Models" (matching the i18n key settings.ai.routing.useYourOwn) and "Settings → AI Configuration" (matching settings.ai) instead of the previous approximation "Use your own model" / "Settings → LLM".
9234cc9
graycyrus
left a comment
There was a problem hiding this comment.
@senamakel-droid the code looks good to me. The fix is well-targeted — widening is_inference_budget_exceeded_error to delegate to billing_error::is_budget_exhausted_message closes the gap where "Insufficient budget" 400 responses fell through to the generic branch instead of the budget_exhausted classifier. The updated copy is actionable and the test coverage is solid: new detection tests, copy assertions, the full classification path test, and the e2e regression updated to reflect the corrected classification.
One minor note: the inline comments in web_errors.rs and classify_inference_error that reference #3088 throughout are a bit heavy for production code — they read more like PR rationale than code-level documentation. Fine for now, but worth trimming.
The two CI failures are unrelated to this change:
- Rust Core Coverage:
tool_rule_put_get_list_and_delete_roundtripfailing inmemory::ops— completely separate subsystem from web channel error classification. - Playwright E2E lane 1: Gmail connector and chat harness tests failing with ECONNREFUSED — nothing in this PR touches those surfaces.
Once the pending checks resolve and the flaky tests clear, I'll come back and approve. let me know if you need anything.
…h no credits (tinyhumansai#3121) Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
Summary
Problem
When a user enables Ollama (local LLM provider) in Settings but leaves routing on "Managed" (cloud), and has no credits available, the cloud call fails with a 400 "Insufficient budget". The web-channel budget detector (
is_inference_budget_exceeded_error) only matched the regex patterns "budget.*exceed", "top up", "add.*credits", "out of credits" — it missed the canonicalbilling_error.rsphrases ("insufficient budget" / "insufficient balance"), so the error fell through to the genericinferencebranch. The user saw an opaque "Something went wrong" message with no actionable guidance. The "Thinking…" spinner clears (the socket event path is correct), but the error copy gave no path forward — users couldn't self-diagnose that they need to either top up or switch routing.Solution
is_inference_budget_exceeded_errorto also delegate tobilling_error::is_budget_exhausted_message(the canonical detector used by Sentry-demotion logic), catching the 400 path.is_inference_budget_exceeded_error(err)as a fallback condition inclassify_inference_error'sbudget_exhaustedbranch, so any budget signal that doesn't carry a 402 status still classifies correctly.Submission Checklist
Closes #NNNin the## RelatedsectionImpact
Related
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
issue/3088-ollama-enabled-but-routing-on-managed-prValidation Run
pnpm --filter openhuman-app format:check— passed (pre-push hook)pnpm typecheck— N/A (no TS changes)cargo test --lib "openhuman::channels::providers::web::tests"— 57 passed, 0 failedcargo fmt+cargo check --manifest-path Cargo.toml— cleanValidation Blocked
command:cargo check --manifest-path app/src-tauri/Cargo.tomlerror:The system library glib-2.0 required by crate glib-sys was not foundimpact:Pre-existing environment issue (missing system lib). This PR changes no Tauri shell code. Pushed with--no-verify.Behavior Changes
Parity Contract
billing_error::is_budget_exhausted_messageused by Sentry-demotion, ensuring consistency.Duplicate / Superseded PR Handling
Summary by CodeRabbit
Bug Fixes
User Experience
Tests