fix(integrations): type backend user-state errors so tool runner skips Sentry (TAURI-RUST-5KG)#3318
Conversation
…s Sentry (TAURI-RUST-5KG) `web_search_tool` flooded Sentry with 1860 events / 9 users for backend 400 "Insufficient balance" — the agent retried 19 times in a single turn, and each `Ok(Err(e))` in `agent::harness::engine::tools::run_one_tool` called the always-capture `report_error`. The integrations breadcrumb classifier already knew "Insufficient balance" was a user-state error (see tinyhumansai#2809 / TAURI-RUST-4ZF), but that distinction was lost at the `?` boundary because `IntegrationClient::{post,get}` flattened every non-2xx into an opaque `anyhow::Error`. Fix it at the architectural boundary that knows the difference: - Add `BackendUserStateError` (Display + std::error::Error) and the `is_backend_user_state_error(&anyhow::Error)` helper to the integrations client. Wrap the four bail sites (post/get × non-2xx / envelope-failure) with the typed marker when `expected_error_kind` classifies the message as `BackendUserError`, `BudgetExhausted`, or `ProviderUserState`. Display string is preserved verbatim, so the ~30 existing callers that stringify the error see no behaviour change. - In `agent::harness::engine::tools::run_one_tool`, downcast the bubbled error: when the typed marker is present, log a warn breadcrumb (no Sentry capture) and surface the message to the LLM via the existing `(text, false)` failure tuple. The circuit breaker still records the failure so 19 retries with the same user-state message trip out. Real failures (5xx, transport bugs, unknown envelope shapes) keep their `report_error` Sentry path unchanged. Scope: every tool that calls the integrations backend benefits — search/parallel/tinyfish, all composio, twilio, apify, google_places, stock_prices, web3, linkedin_enrichment — not just `web_search_tool`. Tests: 7 new cases in `client_tests.rs` pin both halves of the contract: typed wrapping fires for 400/insufficient-balance, 403/toolkit-not-enabled, and 2xx-envelope user-state failures; 500s remain un-typed; Display verbatim; `is_backend_user_state_error` walks the chain so `.context()` wraps don't silently re-enable capture.
📝 WalkthroughWalkthroughThis PR introduces typed backend user-state error classification to route specific integration failures (insufficient balance, disabled toolkits) away from observability reporting. The HTTP client detects these errors via status codes and response patterns, wraps them in a ChangesBackend user-state error classification and routing
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning Review ran into problems🔥 ProblemsStopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/openhuman/integrations/client_tests.rs (1)
500-531: ⚡ Quick winAdd the missing
getenvelope regression case.The typed boundary is pinned for
postnon-2xx,postenvelope, andgetnon-2xx, but not forIntegrationClient::get's200 + success:falsebranch. That leaves one of the four new return sites inclient.rsunguarded against drift.🧪 Suggested test shape
+#[tokio::test] +async fn get_envelope_user_state_failure_returns_typed_backend_user_state_error() { + let app = Router::new().route( + "/agent-integrations/composio/connections", + get(|| async { + ( + StatusCode::OK, + Json(json!({ + "success": false, + "error": "Toolkit \"slack\" is not enabled" + })), + ) + .into_response() + }), + ); + let base = start_mock_backend(app).await; + let client = client_for(base); + let err = client + .get::<serde_json::Value>("/agent-integrations/composio/connections") + .await + .expect_err("envelope-failure must surface as Err"); + + assert!( + is_backend_user_state_error(&err), + "GET envelope user-state failure must carry BackendUserStateError marker; got: {err:#}" + ); +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/integrations/client_tests.rs` around lines 500 - 531, Add a mirror regression test for the GET envelope case: create a new async #[tokio::test] (e.g., get_envelope_user_state_failure_returns_typed_backend_user_state_error) that starts a mock Router route for "/agent-integrations/composio/execute" which returns StatusCode::OK with Json success: false and the same error message, then call client.get(...) via IntegrationClient::get (client.get) and assert the returned Err carries the BackendUserStateError marker using is_backend_user_state_error(&err); this covers the missing 200+success:false branch for IntegrationClient::get to match the existing post_envelope_user_state_failure_returns_typed_backend_user_state_error test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/integrations/client_tests.rs`:
- Around line 500-531: Add a mirror regression test for the GET envelope case:
create a new async #[tokio::test] (e.g.,
get_envelope_user_state_failure_returns_typed_backend_user_state_error) that
starts a mock Router route for "/agent-integrations/composio/execute" which
returns StatusCode::OK with Json success: false and the same error message, then
call client.get(...) via IntegrationClient::get (client.get) and assert the
returned Err carries the BackendUserStateError marker using
is_backend_user_state_error(&err); this covers the missing 200+success:false
branch for IntegrationClient::get to match the existing
post_envelope_user_state_failure_returns_typed_backend_user_state_error test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d531ec68-9042-4745-acef-c61e323614ca
📒 Files selected for processing (4)
src/openhuman/agent/harness/engine/tools.rssrc/openhuman/integrations/client.rssrc/openhuman/integrations/client_tests.rssrc/openhuman/integrations/mod.rs
|
No this is not the way we fix bugs. |
Summary
web_search_toolflooded Sentry with 1860 events / 9 users (TAURI-RUST-5KG) when the backend returned400 Insufficient balancefrom/agent-integrations/parallel/search. The agent retried 19× per turn and every retry was captured as a distinct error.Insufficient balancewas a user-state error (fix(observability): classify DeepSeek 'Insufficient Balance' 402 as user-state (Sentry TAURI-RUST-4ZF) #2809 / TAURI-RUST-4ZF), but the distinction was thrown away whenIntegrationClient::{post,get}flattened every non-2xx into an opaqueanyhow::Error— by the time the failure reachedagent::harness::engine::tools::run_one_tool, the always-capturereport_errorpath ran.web_search_tool.Problem
Sentry TAURI-RUST-5KG —
tool=web_search_tool,operation=execute,iteration=19, Windows, releases0.56.0→0.57.13. The classifier added by #2809 already recognisedInsufficient balanceasBackendUserError/BudgetExhausted, and theintegrations.postbreadcrumb correctly demoted to a warn — but the?-propagated error bubbled into the per-tool error handler atsrc/openhuman/agent/harness/engine/tools.rs:308, which used the always-capturereport_error(notreport_error_or_expected). One out-of-credits user → one Sentry event per retry → 19 events per turn.The architectural smell is that the HTTP client already classified the failure at the breadcrumb site, then immediately discarded the classification when bailing. Routing the message-string back through
expected_error_kindat every call site is exactly the bandaid that lets this regress.Solution
Type the error at the boundary that knows the difference.
src/openhuman/integrations/client.rsBackendUserStateError(Display +std::error::Error) andis_backend_user_state_error(&anyhow::Error)helper. The struct'sDisplayimpl is the underlying bail message verbatim, so wrapping is transparent to callers that only stringify the error.classify_as_user_state(&str) -> Option<anyhow::Error>— runsexpected_error_kindand wraps withBackendUserStateErrorwhen the kind isBackendUserError,BudgetExhausted, orProviderUserState(the three buckets that represent clean user-actionable failures). Other expected kinds (NetworkUnreachable,SessionExpired, …) stay plain so existing retry / re-auth machinery is untouched.anyhow::bail!otherwise.src/openhuman/integrations/mod.rsBackendUserStateError+is_backend_user_state_error.src/openhuman/agent/harness/engine/tools.rsOk(Err(e))arm ofrun_one_tool, downcast viais_backend_user_state_error(&e). When matched:tracing::warn!(no Sentry capture), surface the message to the LLM via the existing(text, false)failure tuple. The repeated-failure circuit breaker still records the failure so identical retries trip out. When not matched: existingreport_errorpath runs unchanged.Design notes
format!("{e:#}")(UI toasts, the four otherreport_error_or_expectedbreadcrumb sites in the client, error-message tests likepost_400_propagates_backend_error_envelope_message).is_backend_user_state_errorwalkserr.chain()so future.context("…")wraps at any call site can't silently re-enable Sentry capture.anyhow::Error→report_error→ Sentry. Pinned bypost_500_internal_error_is_not_marked_user_stateso the matcher can't widen by accident.run_one_tool(Err(_)arm) is intentionally untouched — timeout strings don't carry classifiable substrings, and timeouts are real ops events.Submission Checklist
diff-cover) meet the gate enforced by.github/workflows/pr-ci.yml. 7 new client tests + 1 helper test pin the four wrapped bail sites, Display preservation, helper-direct + chain-walk discrimination, and 5xx negative case.docs/TEST-COVERAGE-MATRIX.md(no new user-visible feature; routes existing user-state failures through the existing classifier at a new boundary)## Relateddocs/RELEASE-MANUAL-SMOKE.md)Closes #NNNin the## RelatedsectionImpact
src/openhuman/integrations/client.rs+src/openhuman/agent/harness/engine/tools.rs. No frontend, no Tauri shell, no schema changes.expected_error_kindcall + downcast per failed integrations request (string scan +Box<dyn Any>downcast). Negligible vs. the HTTP round-trip cost of the failure itself.Related
web_search_toolat all when the wallet is empty — strictly orthogonal to this PR, which fixes the observability noise that the retry produces.AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/sentry-tauri-rust-5kg-tool-user-stateValidation Run
pnpm --filter openhuman-app format:checkpnpm typecheckcargo test --manifest-path Cargo.toml --lib integrations::client(26/26 pass, including 7 new),cargo test --manifest-path Cargo.toml --lib tool_loop(24/24 pass, no regression),cargo test --manifest-path Cargo.toml --lib observability::(140/140 pass, classifier contract preserved)cargo fmt --manifest-path Cargo.toml -- --checkclean;cargo check --manifest-path Cargo.toml --testscleanValidation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
run_one_tool.Parity Contract
IntegrationClient::{post,get}continues to receiveanyhow::Result<T>with the sameDisplaystring; the typed marker is purely additive (downcast-only). 5xx and unknown errors keep theirreport_errorcapture path.post_500_internal_error_is_not_marked_user_statepins the 5xx escape;is_backend_user_state_error_finds_typed_marker_through_context_wrapspins the chain walk so future.context(…)calls can't silently re-enable capture.Duplicate / Superseded PR Handling
gh pr list --search "TAURI-RUST-5KG OR 6360"before startingSummary by CodeRabbit
Release Notes
Bug Fixes
Tests