You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Part of the v0.58.7 → HEAD TinyAgents-migration audit (see parent issue). Restores loop-guard behaviors deleted with tool_loop.rs that have no equivalent in the tinyagents crate (verified against crate source — RepeatedToolFailureMiddleware tracks failures only).
Polling-tool exemption: tools legitimately called repeatedly (status polls) were exempt from both guards.
Recoverable-failure headroom collapsed: legacy gave transient/timeout failures extended headroom (8 identical / 12 varied) before halting; the new RepeatedToolFailureMiddleware thresholds are 3/6 (src/openhuman/tinyagents/middleware.rs:1680,1815) — long-running retry-appropriate operations now halt early.
Autonomous iteration-cap lift is a dead knob: autonomous_iter_cap() has zero readers, while task_dispatcher/executor.rs:199 and skill_runtime/run_machinery.rs:228 still call with_autonomous_iter_cap expecting the lift (legacy ops/loop_.rs:73-77). Autonomous task/skill subagents now stop at the normal per-agent cap (e.g. 10) instead of TASK_RUN_MAX_ITERATIONS.
Note: the ~10 legacy tests covering 1–3 and ~10 covering 5 were deleted with the code (see test-parity issue).
Fix plan
Re-implement 1–3 as a seam middleware (e.g. RepeatProgressMiddleware in src/openhuman/tinyagents/middleware.rs): track (tool, args-hash) for successes and output-hash across rounds; on threshold, inject the legacy halt instruction / pause via SteeringCommand::Pause with the standard checkpoint text. Port the polling-tool exemption list.
Re-add the terminal delegated-inference classifier as a first-halt check on delegation-envelope failures.
Wire autonomous_iter_cap() back in where effective_max_iterations is computed for subagent runs (subagent_runner/ops/graph.rs / tinyagents/mod.rs:183) — or delete the setters and thread an explicit cap parameter (preferred: explicit parameter; the task-local was already fragile, and the detached-spawn issue shows why).
Restore the deleted tests for each guard as middleware unit tests + one loop-level e2e per guard.
Acceptance criteria
Identical successful call 3× / identical output 4× halts with a resumable checkpoint; polling tools exempt.
Transient failures get extended headroom; policy-denied trips at 2.
Task-dispatcher and skill runs execute up to TASK_RUN_MAX_ITERATIONS again.
Part of the v0.58.7 → HEAD TinyAgents-migration audit (see parent issue). Restores loop-guard behaviors deleted with
tool_loop.rsthat have no equivalent in the tinyagents crate (verified against crate source —RepeatedToolFailureMiddlewaretracks failures only).Dropped behaviors
RepeatCallGuard([Harness] Add circuit breaker for repeated identical tool calls #4088): halt after 3× identical successful tool-call batches — a model looping on a no-op tool now burns the full iteration budget.RepeatOutputGuard([Harness] Agents enter unproductive loops (repeated tool calls, no progress) #4095): halt after 4× identical tool outputs.RepeatedToolFailureMiddlewarethresholds are 3/6 (src/openhuman/tinyagents/middleware.rs:1680,1815) — long-running retry-appropriate operations now halt early.autonomous_iter_cap()has zero readers, whiletask_dispatcher/executor.rs:199andskill_runtime/run_machinery.rs:228still callwith_autonomous_iter_capexpecting the lift (legacyops/loop_.rs:73-77). Autonomous task/skill subagents now stop at the normal per-agent cap (e.g. 10) instead ofTASK_RUN_MAX_ITERATIONS.Note: the ~10 legacy tests covering 1–3 and ~10 covering 5 were deleted with the code (see test-parity issue).
Fix plan
RepeatProgressMiddlewareinsrc/openhuman/tinyagents/middleware.rs): track (tool, args-hash) for successes and output-hash across rounds; on threshold, inject the legacy halt instruction / pause viaSteeringCommand::Pausewith the standard checkpoint text. Port the polling-tool exemption list.RepeatedToolFailureMiddleware: recognize recoverable classes (timeout/transient markers, reusing the feat(agent): classify tool-call failures + surface on web event — rework of #4413 (backend) #4445 classifier) and give them the legacy 8/12 headroom; keep 3/6 for other failures; restore the 2-repeat fast-trip forPOLICY_DENIED_MARKER.autonomous_iter_cap()back in whereeffective_max_iterationsis computed for subagent runs (subagent_runner/ops/graph.rs/tinyagents/mod.rs:183) — or delete the setters and thread an explicit cap parameter (preferred: explicit parameter; the task-local was already fragile, and the detached-spawn issue shows why).Acceptance criteria
TASK_RUN_MAX_ITERATIONSagain.