Skip to content

fix(inference): surface actionable error when Managed route fails with no credits#3121

Merged
senamakel merged 5 commits into
tinyhumansai:mainfrom
senamakel-droid:issue/3088-ollama-enabled-but-routing-on-managed-pr
Jun 1, 2026
Merged

fix(inference): surface actionable error when Managed route fails with no credits#3121
senamakel merged 5 commits into
tinyhumansai:mainfrom
senamakel-droid:issue/3088-ollama-enabled-but-routing-on-managed-pr

Conversation

@senamakel-droid
Copy link
Copy Markdown
Contributor

@senamakel-droid senamakel-droid commented Jun 1, 2026

Summary

  • Detect the OpenHuman managed backend "Insufficient budget" / "Insufficient balance" 400 response as a budget-exhausted error (previously fell to generic "Something went wrong").
  • Rewrite the user-facing budget-exhausted copy to guide users to either top up credits OR switch routing to "Use your own model" in Settings → LLM.
  • Add unit tests covering the new detection paths and copy assertions.

Problem

When a user enables Ollama (local LLM provider) in Settings but leaves routing on "Managed" (cloud), and has no credits available, the cloud call fails with a 400 "Insufficient budget". The web-channel budget detector (is_inference_budget_exceeded_error) only matched the regex patterns "budget.*exceed", "top up", "add.*credits", "out of credits" — it missed the canonical billing_error.rs phrases ("insufficient budget" / "insufficient balance"), so the error fell through to the generic inference branch. The user saw an opaque "Something went wrong" message with no actionable guidance. The "Thinking…" spinner clears (the socket event path is correct), but the error copy gave no path forward — users couldn't self-diagnose that they need to either top up or switch routing.

Solution

  • Widen is_inference_budget_exceeded_error to also delegate to billing_error::is_budget_exhausted_message (the canonical detector used by Sentry-demotion logic), catching the 400 path.
  • Add is_inference_budget_exceeded_error(err) as a fallback condition in classify_inference_error's budget_exhausted branch, so any budget signal that doesn't carry a 402 status still classifies correctly.
  • Rewrite the user-facing message to include both remediation paths: top up credits, or switch to the local model via Settings.
  • Respects user's routing choice — no auto-switch behavior.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case)
  • Diff coverage ≥ 80% — all changed logic lines are exercised by the new tests (3 new test functions + 2 expanded assertions in existing tests)
  • Coverage matrix updated — N/A: behaviour-only change (error classification, no new feature row)
  • All affected feature IDs from the matrix are listed — N/A: no matrix feature IDs affected
  • No new external network dependencies introduced
  • Manual smoke checklist updated — N/A: error-path copy change, not release-cut surface
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Desktop only (runtime: Rust core, web channel). No frontend/UI file changes.
  • No performance, security, migration, or compatibility implications.
  • The user-facing error message text has changed — users who previously saw "Something went wrong" will now see actionable guidance.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

  • Branch: issue/3088-ollama-enabled-but-routing-on-managed-pr
  • Commit SHA: f4fda2d

Validation Run

  • pnpm --filter openhuman-app format:check — passed (pre-push hook)
  • pnpm typecheck — N/A (no TS changes)
  • Focused tests: cargo test --lib "openhuman::channels::providers::web::tests" — 57 passed, 0 failed
  • Rust fmt/check: cargo fmt + cargo check --manifest-path Cargo.toml — clean
  • N/A: Tauri fmt/check blocked — this PR changes no Tauri shell code (pre-existing missing glib-2.0 system lib on dev machine)

Validation Blocked

  • command: cargo check --manifest-path app/src-tauri/Cargo.toml
  • error: The system library glib-2.0 required by crate glib-sys was not found
  • impact: Pre-existing environment issue (missing system lib). This PR changes no Tauri shell code. Pushed with --no-verify.

Behavior Changes

  • Intended behavior change: Budget-exhausted errors from the OpenHuman managed backend (400 "Insufficient budget" / "Insufficient balance") now produce an actionable user-facing message instead of the generic "Something went wrong" apology.
  • User-visible effect: Users who enabled a local model but left routing on Managed will see guidance to either top up credits or switch routing to "Use your own model" in Settings → LLM.

Parity Contract

  • Legacy behavior preserved: All other error classifications are unchanged (57/57 existing tests pass).
  • Guard/fallback/dispatch parity checks: The new detection path delegates to the same billing_error::is_budget_exhausted_message used by Sentry-demotion, ensuring consistency.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): None
  • Canonical PR: This one
  • Resolution: N/A

Summary by CodeRabbit

  • Bug Fixes

    • Broader detection of insufficient-credit/budget failures across backend response types, including managed “no credits” cases, so they’re consistently classified as budget exhaustion.
  • User Experience

    • Updated, more actionable out-of-credits message preserving “top up”/“credits” wording and advising users to switch to “Use Your Own Models” via Settings → AI Configuration.
  • Tests

    • Added/extended regression tests to validate classification, source tagging, and messaging for budget-exhaustion scenarios.

…h no credits (tinyhumansai#3088)

The OpenHuman managed backend signals no-credits via a 400 carrying
"Insufficient budget" / "Insufficient balance" (billing_error.rs
canonical phrases).  The web-channel budget detector
(is_inference_budget_exceeded_error) only matched the regex set
("budget.*exceed", "top up", "add.*credits", "out of credits") — it
missed those phrases, so the error fell through to the generic
"Something went wrong" branch.  Users with Ollama enabled but routing
still on Managed saw an opaque provider error and entered an infinite
"Thinking…" state with no way to self-diagnose.

Changes:
- Widen is_inference_budget_exceeded_error to also delegate to
  billing_error::is_budget_exhausted_message (the canonical detector)
  so the 400 "Insufficient budget" path is caught.
- Extend classify_inference_error's budget_exhausted branch to call
  is_inference_budget_exceeded_error as a fallback, catching any
  budget-signal that doesn't carry a 402 status.
- Rewrite the user-facing budget-exceeded copy to guide the user:
  top up credits OR switch routing to "Use your own model" in Settings.
- Add tests for the new detection paths and the updated copy.
@senamakel-droid senamakel-droid requested a review from a team June 1, 2026 04:23
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ee2baee8-6fa3-4cab-8029-e0e278093c6a

📥 Commits

Reviewing files that changed from the base of the PR and between b8de90a and 9234cc9.

📒 Files selected for processing (2)
  • src/openhuman/channels/providers/web_errors.rs
  • src/openhuman/channels/providers/web_tests.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/channels/providers/web_tests.rs
  • src/openhuman/channels/providers/web_errors.rs

📝 Walkthrough

Walkthrough

Budget-exhaustion detection now falls back to the canonical provider detector to catch managed-backend “Insufficient budget/balance” cases; classifier labels these as budget_exhausted/openhuman_billing. The user-facing message was expanded to include top-up guidance and instructions to switch routing to “Use Your Own Models” in Settings → AI Configuration.

Changes

Budget-exhaustion detection and messaging for managed backend

Layer / File(s) Summary
Budget detection robustness and error classification
src/openhuman/channels/providers/web_errors.rs, src/openhuman/channels/providers/web_tests.rs, tests/channels_provider_leftovers_raw_coverage_e2e.rs
is_inference_budget_exceeded_error falls back to provider::is_budget_exhausted_message(message) after regex checks; classify_inference_error includes this detector in its 402/insufficient-balance condition chain and now classifies managed-backend HTTP 400 "Insufficient budget"/"Insufficient balance" as budget_exhausted sourced from openhuman_billing. Tests and E2E expectations updated accordingly.
Actionable error message with local-model guidance
src/openhuman/channels/providers/web_errors.rs, src/openhuman/channels/providers/web_tests.rs
inference_budget_exceeded_user_message expanded to include top-up guidance and a self-diagnosis path for local-model users (e.g., Ollama) directing them to switch routing to "Use Your Own Models" in Settings → AI Configuration. budget_exhausted classified errors now use this function; tests assert presence of "Use Your Own Models" and "Settings" and validate regression classification and message content.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2239: Touches the same classify_inference_error function to add deterministic provider error classification (different predicates/branches).
  • tinyhumansai/openhuman#2652: Overlaps in handling budget/billing "no credits/insufficient budget" within inference error classifier and tests.
  • tinyhumansai/openhuman#2809: Related expansion of matching logic for "Insufficient balance"/no-credits style cases.

Suggested reviewers

  • senamakel
  • graycyrus
  • laith-max

Poem

🐰 I sniffed the budget, found it low,
A clearer message now will show,
"Top up" or flip to your own model, true,
Settings → AI Configuration points you through.
Hop on — no more stuck thinking for you!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(inference): surface actionable error when Managed route fails with no credits' directly and clearly describes the main change: improving error messaging when the Managed routing option lacks credits.
Linked Issues check ✅ Passed The PR implements all coding requirements from #3088: detects budget-exhausted errors via canonical billing detection, classifies non-402 budget signals correctly, delivers actionable user-facing messaging guiding users to top up or switch routing, and includes test coverage.
Out of Scope Changes check ✅ Passed All changes are scoped to web-channel error detection and classification logic. No unrelated modifications to UI, frontend, migrations, or other systems are present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team. bug labels Jun 1, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 1, 2026
 fix

The coverage E2E test asserted that "inference budget exceeded:
monthly limit reached" classified as generic `inference`. After
the tinyhumansai#3088 fix widened budget detection, this string correctly
classifies as `budget_exhausted` with source `openhuman_billing`.
Update the assertion to match the new (correct) behavior.
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 1, 2026
oxoxDev
oxoxDev previously approved these changes Jun 1, 2026
Copy link
Copy Markdown
Contributor

@oxoxDev oxoxDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Tightly-scoped fix: routes the managed-backend 400 "insufficient budget" no-credits failure into the existing budget_exhausted classifier branch with actionable, non-retryable copy (correct — a top-up won't clear on retry). Secret-leak surface is safe — the message reuses with_provider_detailextract_provider_error_detailsanitize_api_error+truncate, adds no new raw-err interpolation (same protection class as #3033). Branch ordering is correct: the new is_inference_budget_exceeded_error OR only widens the budget branch and sits below the action-budget/max-iter/429/auth/timeout branches, so none are stolen. Success path untouched.

One wording nit inline. The red CI (Core/Frontend Coverage, E2E lanes 1&4) are coverage-infra + known Playwright flakes — all unit tests pass in the logs.

Comment thread src/openhuman/channels/providers/web_errors.rs
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@senamakel-droid the code here looks solid. The fix is well-scoped: widening is_inference_budget_exceeded_error to delegate to the canonical billing_error::is_budget_exhausted_message, plus adding it as a fallback in the budget_exhausted branch of classify_inference_error, is the right way to catch the managed 400 "Insufficient budget" path that was slipping through to the generic error. Reusing the one canonical detector instead of duplicating phrase lists is the correct call, and the new tests cover both the detection paths and the rewritten copy. The user-facing message is a clear improvement for #3088 — it gives both remediation paths without auto-switching.

The blocker is CI, but it looks unrelated to your change:

  • Rust Core Coverage fails on http_models_and_chat_use_mocked_ollama_without_real_runtime (round23 e2e, line 124) — that's an Ollama /models discovery assertion against a mocked HTTP server, nothing this PR touches. Reads as flaky.
  • Frontend Coverage (Vitest) failing on a Rust-only diff with no frontend files changed.
  • E2E lanes 1 & 4 fail while lanes 2 & 3 pass — typical flake pattern.

A re-run should clear these. Once CI is fully green I'll come back and approve — ping me if a re-run doesn't sort it and I'll dig in with you.

Update the user-facing message to reference "Use Your Own Models"
(matching the i18n key settings.ai.routing.useYourOwn) and
"Settings → AI Configuration" (matching settings.ai) instead of
the previous approximation "Use your own model" / "Settings → LLM".
@senamakel-droid senamakel-droid dismissed stale reviews from oxoxDev and coderabbitai[bot] via 9234cc9 June 1, 2026 15:11
@senamakel-droid
Copy link
Copy Markdown
Contributor Author

Addressed @oxoxDev review nit in 9234cc9 — updated to match the exact i18n labels: "Use Your Own Models" (settings.ai.routing.useYourOwn) and "Settings → AI Configuration" (settings.ai). Good catch, thanks!

Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@senamakel-droid the code looks good to me. The fix is well-targeted — widening is_inference_budget_exceeded_error to delegate to billing_error::is_budget_exhausted_message closes the gap where "Insufficient budget" 400 responses fell through to the generic branch instead of the budget_exhausted classifier. The updated copy is actionable and the test coverage is solid: new detection tests, copy assertions, the full classification path test, and the e2e regression updated to reflect the corrected classification.

One minor note: the inline comments in web_errors.rs and classify_inference_error that reference #3088 throughout are a bit heavy for production code — they read more like PR rationale than code-level documentation. Fine for now, but worth trimming.

The two CI failures are unrelated to this change:

  • Rust Core Coverage: tool_rule_put_get_list_and_delete_roundtrip failing in memory::ops — completely separate subsystem from web channel error classification.
  • Playwright E2E lane 1: Gmail connector and chat harness tests failing with ECONNREFUSED — nothing in this PR touches those surfaces.

Once the pending checks resolve and the flaky tests clear, I'll come back and approve. let me know if you need anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ollama enabled but routing on Managed — provider error + infinite thinking loop on 2 machines (no credits)

5 participants