Skip to content

fix(integrations): type backend user-state errors so tool runner skips Sentry (TAURI-RUST-5KG)#3318

Closed
CodeGhost21 wants to merge 1 commit into
tinyhumansai:mainfrom
CodeGhost21:fix/sentry-tauri-rust-5kg-tool-user-state
Closed

fix(integrations): type backend user-state errors so tool runner skips Sentry (TAURI-RUST-5KG)#3318
CodeGhost21 wants to merge 1 commit into
tinyhumansai:mainfrom
CodeGhost21:fix/sentry-tauri-rust-5kg-tool-user-state

Conversation

@CodeGhost21
Copy link
Copy Markdown
Contributor

@CodeGhost21 CodeGhost21 commented Jun 3, 2026

Summary

  • web_search_tool flooded Sentry with 1860 events / 9 users (TAURI-RUST-5KG) when the backend returned 400 Insufficient balance from /agent-integrations/parallel/search. The agent retried 19× per turn and every retry was captured as a distinct error.
  • The integrations breadcrumb classifier already knew Insufficient balance was a user-state error (fix(observability): classify DeepSeek 'Insufficient Balance' 402 as user-state (Sentry TAURI-RUST-4ZF) #2809 / TAURI-RUST-4ZF), but the distinction was thrown away when IntegrationClient::{post,get} flattened every non-2xx into an opaque anyhow::Error — by the time the failure reached agent::harness::engine::tools::run_one_tool, the always-capture report_error path ran.
  • Type the error at the integrations boundary so the tool runner can downcast and route user-state failures to the warn-only path. Fix scales to every tool that calls the integrations backend (search/parallel, tinyfish, composio, twilio, apify, google_places, stock_prices, web3, linkedin_enrichment) — not just web_search_tool.

Problem

Sentry TAURI-RUST-5KGtool=web_search_tool, operation=execute, iteration=19, Windows, releases 0.56.00.57.13. The classifier added by #2809 already recognised Insufficient balance as BackendUserError / BudgetExhausted, and the integrations.post breadcrumb correctly demoted to a warn — but the ?-propagated error bubbled into the per-tool error handler at src/openhuman/agent/harness/engine/tools.rs:308, which used the always-capture report_error (not report_error_or_expected). One out-of-credits user → one Sentry event per retry → 19 events per turn.

The architectural smell is that the HTTP client already classified the failure at the breadcrumb site, then immediately discarded the classification when bailing. Routing the message-string back through expected_error_kind at every call site is exactly the bandaid that lets this regress.

Solution

Type the error at the boundary that knows the difference.

src/openhuman/integrations/client.rs

  • Add BackendUserStateError (Display + std::error::Error) and is_backend_user_state_error(&anyhow::Error) helper. The struct's Display impl is the underlying bail message verbatim, so wrapping is transparent to callers that only stringify the error.
  • Add internal classify_as_user_state(&str) -> Option<anyhow::Error> — runs expected_error_kind and wraps with BackendUserStateError when the kind is BackendUserError, BudgetExhausted, or ProviderUserState (the three buckets that represent clean user-actionable failures). Other expected kinds (NetworkUnreachable, SessionExpired, …) stay plain so existing retry / re-auth machinery is untouched.
  • Wrap all four bail sites (post non-2xx, post envelope-failure, get non-2xx, get envelope-failure) with the typed marker when classification matches; fall through to anyhow::bail! otherwise.

src/openhuman/integrations/mod.rs

  • Re-export BackendUserStateError + is_backend_user_state_error.

src/openhuman/agent/harness/engine/tools.rs

  • In the Ok(Err(e)) arm of run_one_tool, downcast via is_backend_user_state_error(&e). When matched: tracing::warn! (no Sentry capture), surface the message to the LLM via the existing (text, false) failure tuple. The repeated-failure circuit breaker still records the failure so identical retries trip out. When not matched: existing report_error path runs unchanged.

Design notes

  • Display string preserved verbatim → zero behaviour change for callers that just format!("{e:#}") (UI toasts, the four other report_error_or_expected breadcrumb sites in the client, error-message tests like post_400_propagates_backend_error_envelope_message).
  • is_backend_user_state_error walks err.chain() so future .context("…") wraps at any call site can't silently re-enable Sentry capture.
  • 5xx, transport bugs, unknown envelope shapes all stay plain anyhow::Errorreport_error → Sentry. Pinned by post_500_internal_error_is_not_marked_user_state so the matcher can't widen by accident.
  • Timeout branch in run_one_tool (Err(_) arm) is intentionally untouched — timeout strings don't carry classifiable substrings, and timeouts are real ops events.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/pr-ci.yml. 7 new client tests + 1 helper test pin the four wrapped bail sites, Display preservation, helper-direct + chain-walk discrimination, and 5xx negative case.
  • N/A: behaviour-only change — Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md (no new user-visible feature; routes existing user-state failures through the existing classifier at a new boundary)
  • N/A: behaviour-only change — All affected feature IDs from the matrix are listed in the PR description under ## Related
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • N/A: not a release-cut surface — Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md)
  • N/A: Sentry-tracked issue (TAURI-RUST-5KG), no GitHub issue — Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime: desktop core only — src/openhuman/integrations/client.rs + src/openhuman/agent/harness/engine/tools.rs. No frontend, no Tauri shell, no schema changes.
  • User-visible: identical error messages still flow to the LLM and to error toasts — only the Sentry-side behaviour changes.
  • Performance: one extra expected_error_kind call + downcast per failed integrations request (string scan + Box<dyn Any> downcast). Negligible vs. the HTTP round-trip cost of the failure itself.
  • Security/migration/compat: none. Existing callers are stringify-only; the typed marker is purely additive.

Related

  • Closes: N/A (Sentry TAURI-RUST-5KG, no linked GitHub issue)
  • Follow-up PR(s)/TODOs: a complementary backend-driven pre-flight balance check would stop the agent from picking web_search_tool at all when the wallet is empty — strictly orthogonal to this PR, which fixes the observability noise that the retry produces.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A (Sentry TAURI-RUST-5KG)

Commit & Branch

  • Branch: fix/sentry-tauri-rust-5kg-tool-user-state
  • Commit SHA: see PR head

Validation Run

  • N/A: Rust-only change — pnpm --filter openhuman-app format:check
  • N/A: Rust-only change — pnpm typecheck
  • Focused tests: cargo test --manifest-path Cargo.toml --lib integrations::client (26/26 pass, including 7 new), cargo test --manifest-path Cargo.toml --lib tool_loop (24/24 pass, no regression), cargo test --manifest-path Cargo.toml --lib observability:: (140/140 pass, classifier contract preserved)
  • Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml -- --check clean; cargo check --manifest-path Cargo.toml --tests clean
  • N/A: shell not touched — Tauri fmt/check (if changed)

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: user-state failures from the integrations backend (insufficient balance, missing required field, toolkit not enabled) no longer create Sentry events when bubbled through run_one_tool.
  • User-visible effect: none — error messages still flow to the LLM and to UI toasts verbatim. Only Sentry-side behaviour changes.

Parity Contract

  • Legacy behavior preserved: every existing caller of IntegrationClient::{post,get} continues to receive anyhow::Result<T> with the same Display string; the typed marker is purely additive (downcast-only). 5xx and unknown errors keep their report_error capture path.
  • Guard/fallback/dispatch parity checks: post_500_internal_error_is_not_marked_user_state pins the 5xx escape; is_backend_user_state_error_finds_typed_marker_through_context_wraps pins the chain walk so future .context(…) calls can't silently re-enable capture.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none — verified via gh pr list --search "TAURI-RUST-5KG OR 6360" before starting
  • Canonical PR: this PR
  • Resolution: N/A

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved error handling to better distinguish between user-state errors (e.g., insufficient balance, disabled features) and system errors. User-state errors are now logged as warnings rather than critical failures, enabling more accurate error tracking and reporting.
  • Tests

    • Added regression tests for backend error classification and handling.

…s Sentry (TAURI-RUST-5KG)

`web_search_tool` flooded Sentry with 1860 events / 9 users for backend
400 "Insufficient balance" — the agent retried 19 times in a single
turn, and each `Ok(Err(e))` in `agent::harness::engine::tools::run_one_tool`
called the always-capture `report_error`. The integrations breadcrumb
classifier already knew "Insufficient balance" was a user-state error
(see tinyhumansai#2809 / TAURI-RUST-4ZF), but that distinction was lost at the
`?` boundary because `IntegrationClient::{post,get}` flattened every
non-2xx into an opaque `anyhow::Error`.

Fix it at the architectural boundary that knows the difference:

- Add `BackendUserStateError` (Display + std::error::Error) and the
  `is_backend_user_state_error(&anyhow::Error)` helper to the
  integrations client. Wrap the four bail sites (post/get × non-2xx /
  envelope-failure) with the typed marker when
  `expected_error_kind` classifies the message as `BackendUserError`,
  `BudgetExhausted`, or `ProviderUserState`. Display string is
  preserved verbatim, so the ~30 existing callers that stringify the
  error see no behaviour change.

- In `agent::harness::engine::tools::run_one_tool`, downcast the
  bubbled error: when the typed marker is present, log a warn
  breadcrumb (no Sentry capture) and surface the message to the LLM
  via the existing `(text, false)` failure tuple. The circuit breaker
  still records the failure so 19 retries with the same user-state
  message trip out. Real failures (5xx, transport bugs, unknown
  envelope shapes) keep their `report_error` Sentry path unchanged.

Scope: every tool that calls the integrations backend benefits —
search/parallel/tinyfish, all composio, twilio, apify, google_places,
stock_prices, web3, linkedin_enrichment — not just `web_search_tool`.

Tests: 7 new cases in `client_tests.rs` pin both halves of the
contract: typed wrapping fires for 400/insufficient-balance,
403/toolkit-not-enabled, and 2xx-envelope user-state failures; 500s
remain un-typed; Display verbatim; `is_backend_user_state_error`
walks the chain so `.context()` wraps don't silently re-enable
capture.
@CodeGhost21 CodeGhost21 requested a review from a team June 3, 2026 20:04
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 3, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR introduces typed backend user-state error classification to route specific integration failures (insufficient balance, disabled toolkits) away from observability reporting. The HTTP client detects these errors via status codes and response patterns, wraps them in a BackendUserStateError marker, and the tool executor detects this marker to log warnings instead of reporting to Sentry.

Changes

Backend user-state error classification and routing

Layer / File(s) Summary
Typed error marker and classification helpers
src/openhuman/integrations/client.rs
Introduces BackendUserStateError struct and classify_as_user_state / is_backend_user_state_error helpers to detect user-state failures based on HTTP status and expected_error_kind patterns from backend responses.
HTTP client POST and GET error classification
src/openhuman/integrations/client.rs
Both post and get methods now build a shared bail message and use classification to return the typed BackendUserStateError on non-2xx responses and success: false envelopes matching user-state patterns; other errors bail unchanged.
Public module exports
src/openhuman/integrations/mod.rs
Module re-exports BackendUserStateError and is_backend_user_state_error to the public API.
Tool executor error routing
src/openhuman/agent/harness/engine/tools.rs
run_one_tool now detects backend user-state errors and logs warnings without observability reporting; all other errors continue reporting via report_error as before.
Regression and unit test coverage
src/openhuman/integrations/client_tests.rs
Tests verify 400/403 user-state responses are typed and detected correctly, 500 errors are not marked as user-state, 200+success: false envelopes are typed correctly, and marker detection works through anyhow context wrapping.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tinyhumansai/openhuman#1795: Overlaps in modifying IntegrationClient::post/get error handling for backend envelope failures, but this PR uses a typed marker checked in run_one_tool to skip observability reporting, while the retrieved PR uses report_error_or_expected with ExpectedErrorKind::ProviderUserState.

Suggested labels

agent, rust-core, sentry-traced-bug, bug

Suggested reviewers

  • graycyrus
  • oxoxDev
  • M3gA-Mind

Poem

🐰 A typed error hops through the integration stack,
User-state failures now dodge the report track,
Backend blues skip the Sentry climb,
Tool executor warns with perfect timing,
Error routing works like a bunny's quest—swift and true!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main change: introducing a typed marker for backend user-state errors to prevent tool runner from reporting them to Sentry.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. sentry-traced-bug Bug identified via Sentry triage bug labels Jun 3, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/openhuman/integrations/client_tests.rs (1)

500-531: ⚡ Quick win

Add the missing get envelope regression case.

The typed boundary is pinned for post non-2xx, post envelope, and get non-2xx, but not for IntegrationClient::get's 200 + success:false branch. That leaves one of the four new return sites in client.rs unguarded against drift.

🧪 Suggested test shape
+#[tokio::test]
+async fn get_envelope_user_state_failure_returns_typed_backend_user_state_error() {
+    let app = Router::new().route(
+        "/agent-integrations/composio/connections",
+        get(|| async {
+            (
+                StatusCode::OK,
+                Json(json!({
+                    "success": false,
+                    "error": "Toolkit \"slack\" is not enabled"
+                })),
+            )
+                .into_response()
+        }),
+    );
+    let base = start_mock_backend(app).await;
+    let client = client_for(base);
+    let err = client
+        .get::<serde_json::Value>("/agent-integrations/composio/connections")
+        .await
+        .expect_err("envelope-failure must surface as Err");
+
+    assert!(
+        is_backend_user_state_error(&err),
+        "GET envelope user-state failure must carry BackendUserStateError marker; got: {err:#}"
+    );
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/integrations/client_tests.rs` around lines 500 - 531, Add a
mirror regression test for the GET envelope case: create a new async
#[tokio::test] (e.g.,
get_envelope_user_state_failure_returns_typed_backend_user_state_error) that
starts a mock Router route for "/agent-integrations/composio/execute" which
returns StatusCode::OK with Json success: false and the same error message, then
call client.get(...) via IntegrationClient::get (client.get) and assert the
returned Err carries the BackendUserStateError marker using
is_backend_user_state_error(&err); this covers the missing 200+success:false
branch for IntegrationClient::get to match the existing
post_envelope_user_state_failure_returns_typed_backend_user_state_error test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/integrations/client_tests.rs`:
- Around line 500-531: Add a mirror regression test for the GET envelope case:
create a new async #[tokio::test] (e.g.,
get_envelope_user_state_failure_returns_typed_backend_user_state_error) that
starts a mock Router route for "/agent-integrations/composio/execute" which
returns StatusCode::OK with Json success: false and the same error message, then
call client.get(...) via IntegrationClient::get (client.get) and assert the
returned Err carries the BackendUserStateError marker using
is_backend_user_state_error(&err); this covers the missing 200+success:false
branch for IntegrationClient::get to match the existing
post_envelope_user_state_failure_returns_typed_backend_user_state_error test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d531ec68-9042-4745-acef-c61e323614ca

📥 Commits

Reviewing files that changed from the base of the PR and between db0307d and 7bcd4d7.

📒 Files selected for processing (4)
  • src/openhuman/agent/harness/engine/tools.rs
  • src/openhuman/integrations/client.rs
  • src/openhuman/integrations/client_tests.rs
  • src/openhuman/integrations/mod.rs

@senamakel
Copy link
Copy Markdown
Member

No this is not the way we fix bugs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. bug rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. sentry-traced-bug Bug identified via Sentry triage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants