-
Notifications
You must be signed in to change notification settings - Fork 215
Enhance robustness of plan-execute argument passing #231
Description
Summary
The plan-execute workflow has several silent failure modes in argument passing between steps. When things go wrong, errors are swallowed and empty dicts are returned — making debugging very hard and causing downstream tool calls to fail with missing-parameter errors rather than a clear root cause.
Identified Issues
1. Silent JSON parse failure in planner (planner.py:77-78)
When the LLM produces a malformed #ArgsN: line, json.loads fails and the args silently become {}. No warning is logged.
except json.JSONDecodeError:
args[n] = {} # completely silent2. _has_placeholders() only checks top-level string values (executor.py:184-189)
Nested placeholders inside dicts or lists are not detected, so those steps bypass LLM resolution and call the tool with raw {step_N} strings.
3. Missing step references silently ignored in context (executor.py:221-224)
If a step references {step_N} but step N failed or doesn't exist, it's silently dropped from the context fed to the LLM. The LLM then receives incomplete context and may produce wrong or empty resolved values.
context_text = "\n".join(
f"Step {n}: {context[n].response}"
for n in sorted(referenced)
if n in context # missing step → no context, no warning
)4. Unparseable LLM resolution response silently becomes {} (executor.py:264-265)
If _parse_json can't extract valid JSON from the LLM's resolution response, it returns {}. The resolved args dict then lacks all placeholder-sourced keys — the subsequent tool call fails with a confusing missing-parameter error.
5. No retry or fallback for LLM resolution call (executor.py:236)
A single LLM call resolves all placeholders. Transient LLM failures or empty responses cascade directly to tool failure with no retry.
Proposed Fix
- Log warnings at all silent fallback sites (
planner.py,executor.py) so failures are visible in logs. - Recurse into nested structures in
_has_placeholders()and_resolve_args_with_llm()to handle dicts/lists containing placeholders. - Warn when referenced steps are missing from context, and surface this in the
StepResult.errorfield. - Warn when
_parse_jsonfalls back to{}, including the raw LLM output (truncated) for diagnostics. - Add a single retry for the LLM resolution call on empty/unparseable response.
Impact
These are all silent data-loss bugs in the hot path of every multi-step plan. Fixing them improves debuggability and correctness for any plan with inter-step argument dependencies.