feat: Enhance error messages for tool and agent not found errors #3217

@jpantsjoha

Summary

Improve the error messages raised when a tool or agent is not found, so that they provide actionable guidance and cut developer debugging time from hours to minutes.

Problem Statement

Current error messages are cryptic and unhelpful, causing significant developer friction:

Current Error Messages

Tool Not Found Error:

ValueError: Function get_equipment_specs is not found in the tools_dict: dict_keys(['get_equipment_details', 'query_vendor_catalog', ...])

Agent Not Found Error:

ValueError: Agent approval_handler not found in the agent tree.

Impact

These errors provide minimal context and no guidance, leading to:

  • 3+ hours of debugging time (observed in a production multi-agent RFQ solution built for a recent partner nanothon initiative)
  • Developer frustration - developers feel helpless when encountering these errors deep in ADK code
  • Poor onboarding experience - new ADK users hit these errors immediately
  • Support burden - multiple community issues requesting better error handling

Root Cause Analysis

Primary Causes

  1. LLM Hallucination

    • Gemini sometimes generates tool/agent names that don't exist
    • The current error doesn't indicate whether this is an LLM issue or a configuration problem
  2. Developer Configuration Errors

    • Tool not registered in agent.tools list
    • Agent referenced before creation (timing issue)
    • Typos in tool/agent names
  3. Inadequate Error Context

    • Dumps entire dict_keys() object (overwhelming for 20+ tools)
    • No indication of what went wrong or how to fix it
    • No fuzzy matching to catch typos

Technical Analysis

Location 1: /src/google/adk/flows/llm_flows/functions.py:663

if function_call.name not in tools_dict:
    raise ValueError(
        f'Function {function_call.name} is not found in the tools_dict:'
        f' {tools_dict.keys()}.'
    )

Location 2: /src/google/adk/agents/llm_agent.py:644

if not agent_to_run:
    raise ValueError(f'Agent {agent_name} not found in the agent tree.')

Problems:

  • No list of available options in readable format
  • No possible causes or suggested fixes
  • No fuzzy matching for typos
  • dict_keys() output is hard to scan visually
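
To illustrate the last point, here is a minimal sketch of how the current f-string renders next to a plain joined list. The tools_dict contents are placeholder values taken from the example error above, not real ADK tools.

```python
# Placeholder tools_dict; the values are irrelevant for formatting.
tools_dict = {
    "get_equipment_details": object(),
    "query_vendor_catalog": object(),
    "score_proposals": object(),
}

# Current style: interpolating the keys view leaks the dict_keys([...]) wrapper.
current = (
    "Function get_equipment_specs is not found in the tools_dict:"
    f" {tools_dict.keys()}."
)

# Readable alternative: iterate the dict and join the names.
readable = "Available tools: " + ", ".join(tools_dict)

print(current)
print(readable)
```

The second form drops the wrapper, brackets, and quotes, which matters more as the number of tools grows.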

Related Issues

This issue addresses pain points documented in multiple active community issues:

Primary Issues

Supporting Issues

Total Impact: This contribution addresses 3 primary issues with 12+ community engagements

Proposed Solution

Enhance error messages with:

  1. Clear problem description - What went wrong
  2. Available options - List of valid tools/agents (truncated to first 20 for readability)
  3. Possible causes - Why this error occurred (LLM hallucination, configuration, typo)
  4. Suggested fixes - Actionable steps to resolve
  5. Fuzzy matching - "Did you mean...?" suggestions for typos
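
The fuzzy matching in item 5 could be sketched as follows with difflib.get_close_matches() from the standard library. The helper name suggest_names and the limit of 3 are assumptions for illustration. The cutoff of 0.6 is difflib's default.

```python
import difflib


def suggest_names(requested, available, limit=3, cutoff=0.6):
    """Return up to `limit` names from `available` that resemble `requested`.

    Hypothetical helper for illustration; not part of ADK.
    """
    return difflib.get_close_matches(
        requested, list(available), n=limit, cutoff=cutoff
    )


tools = ["get_equipment_details", "query_vendor_catalog", "score_proposals"]

# A near miss (typo or slightly wrong name) yields a suggestion.
typo_matches = suggest_names("get_equipment_specs", tools)

# An unrelated name yields an empty list, so no "Did you mean" section is needed.
unrelated_matches = suggest_names("book_flight", tools)

print(typo_matches)
print(unrelated_matches)
```

Returning an empty list for unrelated names keeps the message from suggesting misleading alternatives when the LLM hallucinated a name from scratch.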

Example Enhanced Error Message

Before:

ValueError: Function get_equipment_specs is not found in the tools_dict: dict_keys(['get_equipment_details', 'query_vendor_catalog', 'score_proposals'])

After:

Function 'get_equipment_specs' is not found in available tools.

Available tools: get_equipment_details, query_vendor_catalog, score_proposals

Possible causes:
  1. LLM hallucinated the function name - review agent instruction clarity
  2. Tool not registered - verify agent.tools list
  3. Name mismatch - check for typos

Suggested fixes:
  - Review agent instruction to ensure tool usage is clear
  - Verify tool is included in agent.tools list
  - Check for typos in function name

Did you mean one of these?
  - get_equipment_details
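
A message in roughly this shape could be assembled as in the sketch below. build_tool_not_found_message is a hypothetical helper name, not an existing ADK function. For brevity the sketch omits the "Suggested fixes" section and the 20-item truncation.

```python
import difflib


def build_tool_not_found_message(requested, available):
    """Assemble a multi-section error message (hypothetical helper)."""
    names = list(available)
    lines = [
        f"Function '{requested}' is not found in available tools.",
        "",
        "Available tools: " + ", ".join(names),
        "",
        "Possible causes:",
        "  1. LLM hallucinated the function name - review agent instruction clarity",
        "  2. Tool not registered - verify agent.tools list",
        "  3. Name mismatch - check for typos",
    ]
    # Only add the suggestion section when difflib finds something close.
    matches = difflib.get_close_matches(requested, names, n=3, cutoff=0.6)
    if matches:
        lines += ["", "Did you mean one of these?"]
        lines += [f"  - {match}" for match in matches]
    return "\n".join(lines)


message = build_tool_not_found_message(
    "get_equipment_specs",
    ["get_equipment_details", "query_vendor_catalog", "score_proposals"],
)
print(message)
```

Building the message as a list of lines keeps each section optional and makes the output easy to assert on in unit tests.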

Implementation Details

Files to Modify:

  1. /src/google/adk/flows/llm_flows/functions.py - Enhance _get_tool() error message
  2. /src/google/adk/agents/llm_agent.py - Enhance __get_agent_to_run() error message

Dependencies:

  • Uses standard library difflib.get_close_matches() for fuzzy matching (no new dependencies)

Performance:

  • Affects the error path only (< 0.03 ms per error, measured)
  • No impact on happy path
  • Truncates long lists to first 20 items to prevent log overflow
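
The truncation mentioned above might look like the following sketch. The helper name format_names and the "... and N more" suffix are assumptions. The issue only specifies that lists are cut to the first 20 items.

```python
def format_names(names, limit=20):
    """Join names for display, keeping at most `limit` of them.

    Hypothetical helper; the suffix wording is an assumption.
    """
    names = list(names)
    shown = ", ".join(names[:limit])
    hidden = len(names) - limit
    if hidden > 0:
        # Tell the reader that the list was cut, and by how much.
        shown += f" ... and {hidden} more"
    return shown


short_list = format_names(["get_equipment_details", "score_proposals"])
long_list = format_names([f"tool_{i}" for i in range(25)])

print(short_list)
print(long_list)
```

Reporting the number of hidden names avoids giving the impression that the truncated list is complete.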

Testing:

  • Comprehensive unit tests for error scenarios
  • Fuzzy matching validation
  • Edge cases (no close matches, empty tools dict, 100+ tools)
  • Manual E2E validation with real agent configurations
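
The edge cases listed above could be covered by tests along these lines. lookup_tool is a simplified stand-in for the real lookup, written only so the sketch is self-contained. The tests are plain functions with bare asserts, which pytest collects without extra imports.

```python
import difflib


def lookup_tool(name, tools_dict):
    """Simplified stand-in for the real lookup (assumption, not ADK code)."""
    if name in tools_dict:
        return tools_dict[name]
    names = list(tools_dict)
    message = f"Function '{name}' is not found in available tools."
    message += "\nAvailable tools: " + (", ".join(names[:20]) or "(none)")
    matches = difflib.get_close_matches(name, names, n=3, cutoff=0.6)
    if matches:
        message += "\nDid you mean one of these?\n"
        message += "\n".join(f"  - {m}" for m in matches)
    raise ValueError(message)


def error_text(name, tools_dict):
    """Return the ValueError text, failing if no error is raised."""
    try:
        lookup_tool(name, tools_dict)
    except ValueError as exc:
        return str(exc)
    raise AssertionError("expected ValueError")


def test_typo_gets_suggestion():
    text = error_text("get_equipment_specs", {"get_equipment_details": None})
    assert "Did you mean" in text
    assert "get_equipment_details" in text


def test_no_close_match_omits_suggestions():
    text = error_text("book_flight", {"score_proposals": None})
    assert "Did you mean" not in text


def test_empty_tools_dict():
    assert "(none)" in error_text("anything", {})


def test_many_tools_are_truncated():
    tools = {f"tool_{i:03d}": None for i in range(150)}
    text = error_text("unknown", tools)
    assert "tool_019" in text
    assert "tool_020" not in text
```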

Benefits

Developer Experience

  • Reduces debugging time from 3+ hours to < 5 minutes
  • Actionable guidance - developers know exactly what to check
  • Fuzzy matching catches typos immediately
  • Better onboarding - new users get helpful errors

Community Impact

Production Validation

  • Real-world evidence: A production multi-agent RFQ solution built for a recent partner nanothon initiative encountered these exact errors
  • Time lost: 3+ hours debugging cryptic error messages
  • Resolution: Enhanced error messages would have identified the issue in < 5 minutes

Implementation Plan

Code Changes

  1. Modify /src/google/adk/flows/llm_flows/functions.py:_get_tool()
  2. Modify /src/google/adk/agents/llm_agent.py:__get_agent_to_run()
  3. Add helper method _get_available_agent_names() for agent tree traversal
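
The agent tree traversal in step 3 could be sketched as below. The Agent dataclass is a hypothetical stand-in that assumes only a name and a list of sub_agents. The real helper would operate on ADK's own agent classes and may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """Hypothetical stand-in: only `name` and `sub_agents` are assumed."""

    name: str
    sub_agents: List["Agent"] = field(default_factory=list)


def get_available_agent_names(root):
    """Collect every agent name reachable from `root` (depth-first, pre-order)."""
    names = []
    stack = [root]
    while stack:
        agent = stack.pop()
        names.append(agent.name)
        # Push children reversed so they are visited left to right.
        stack.extend(reversed(agent.sub_agents))
    return names


tree = Agent(
    "coordinator",
    [
        Agent("rfq_intake", [Agent("spec_lookup")]),
        Agent("vendor_scoring"),
    ],
)
agent_names = get_available_agent_names(tree)
print(agent_names)
```

An iterative traversal avoids recursion limits on deep trees, and the resulting list can feed both the "available agents" line and the fuzzy matcher.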

Testing

  • Create /tests/unittests/flows/llm_flows/test_functions_error_messages.py
  • Create /tests/unittests/agents/test_llm_agent_error_messages.py
  • Unit tests with pytest (coverage: error paths, fuzzy matching, edge cases)
  • Manual E2E tests with real agent configurations

Following Contribution Guidelines

Per /CONTRIBUTING.md:

  • ✅ Issue created before PR (this issue)
  • ✅ Small, focused change (only error message improvements)
  • ✅ Testing plan included (unit + E2E tests)
  • ✅ Will run ./autoformat.sh (isort + pyink)
  • ✅ No new dependencies (standard library only)

Success Metrics

Week 1

  • PR submitted with passing tests
  • All code review feedback addressed

Weeks 2-4

  • PR approved and merged
  • Changes released in next ADK version

Months 1-3

  • Reduced reports of "tool/agent not found" errors
  • Community references enhanced errors in discussions
  • New contributors benefit from clearer guidance

Additional Context

  • Validated in a production multi-agent system built for a recent partner nanothon initiative
  • Addresses developer pain points from real-world ADK deployments
  • Zero new dependencies (uses standard library difflib)
  • No performance impact on the happy path (changes touch the error path only)
  • Fully backward compatible (same exception types)
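
As a sketch of what "same exception types" means in practice: callers that catch ValueError by type keep working with the longer message. get_tool and call_with_fallback below are hypothetical names for illustration. Note that a caller matching on the old message text (for example, the substring "tools_dict") would no longer match, since the wording changes.

```python
def get_tool(name, tools_dict):
    """Hypothetical lookup raising the longer message with the same type."""
    if name not in tools_dict:
        raise ValueError(
            f"Function '{name}' is not found in available tools.\n\n"
            f"Available tools: {', '.join(tools_dict)}"
        )
    return tools_dict[name]


def call_with_fallback(name, tools_dict):
    """Caller code that catches by exception type, not by message text."""
    try:
        return get_tool(name, tools_dict)
    except ValueError:
        return None


missing = call_with_fallback("get_equipment_specs", {"score_proposals": print})
found = call_with_fallback("score_proposals", {"score_proposals": print})
```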

Next Steps

  1. Get maintainer feedback on approach
  2. Implement changes following contribution guidelines
  3. Submit PR with comprehensive testing evidence
  4. Comment on related issues with the PR link:

    • Function Tool verification callback before throwing "Tool xxxx is not found in the tools_dict" exception #2050
    • How to handle 'Function is not found in the tools_dict' Error #2933
    • ValueError: {agent} not found in agent tree. #2164

Metadata

Labels

core: [Component] This issue is related to the core interface and implementation
