fix(eval): handle unevaluated final response v2 results by pragnyanramtha · Pull Request #5728 · google/adk-python

pragnyanramtha · 2026-05-17T00:12:23Z

Summary

Fixes a small aggregation edge case in FinalResponseMatchV2Evaluator: when every per-invocation result is skipped or not evaluated, the evaluator currently divides by zero while computing the overall score.

Root Cause

aggregate_invocation_results() filters out results whose score is None or whose eval_status is NOT_EVALUATED, but it unconditionally computes:

overall_score = num_valid / num_evaluated

If all judge samples fail to produce a usable score, num_evaluated remains 0 and evaluation crashes instead of returning a not-evaluated aggregate result. Other ADK evaluators handle this condition by returning overall_score=None and overall_eval_status=NOT_EVALUATED.
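A minimal sketch of the failure mode described above. `EvalStatus`, `PerInvocationResult`, and `aggregate_unguarded` are simplified stand-ins written for illustration. They are not the ADK classes or the actual evaluator code, and they model only the `score` and `eval_status` fields mentioned in this description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class EvalStatus(Enum):  # hypothetical stand-in, not the ADK enum
    PASSED = 1
    FAILED = 2
    NOT_EVALUATED = 3


@dataclass
class PerInvocationResult:  # hypothetical stand-in with only two fields
    score: Optional[float]
    eval_status: EvalStatus


def aggregate_unguarded(results: List[PerInvocationResult]) -> float:
    """Mirror the pattern described above: filter, then divide unconditionally."""
    num_valid = 0.0
    num_evaluated = 0
    for result in results:
        # Skip results that carry no usable score.
        if result.score is None or result.eval_status == EvalStatus.NOT_EVALUATED:
            continue
        num_evaluated += 1
        num_valid += result.score
    # No guard: num_evaluated is 0 when every result was skipped.
    return num_valid / num_evaluated


all_skipped = [
    PerInvocationResult(score=None, eval_status=EvalStatus.NOT_EVALUATED),
    PerInvocationResult(score=None, eval_status=EvalStatus.NOT_EVALUATED),
]

try:
    aggregate_unguarded(all_skipped)
    crashed = False
except ZeroDivisionError:
    crashed = True
```

With at least one evaluable result the same function returns a normal average, so the crash only appears when every invocation is filtered out.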

Change

Return an EvaluationResult with overall_score=None and overall_eval_status=NOT_EVALUATED when no FinalResponseMatchV2 invocation results are evaluable.
Add a focused regression test for all-skipped/all-not-evaluated invocation results.
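The guard can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the dataclasses are simplified stand-ins for the ADK types, `aggregate_guarded` is a hypothetical name, and the `threshold` parameter with its pass/fail comparison is assumed for the sake of a complete example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class EvalStatus(Enum):  # hypothetical stand-in, not the ADK enum
    PASSED = 1
    FAILED = 2
    NOT_EVALUATED = 3


@dataclass
class PerInvocationResult:  # hypothetical stand-in
    score: Optional[float]
    eval_status: EvalStatus


@dataclass
class EvaluationResult:  # hypothetical stand-in
    overall_score: Optional[float] = None
    overall_eval_status: EvalStatus = EvalStatus.NOT_EVALUATED


def aggregate_guarded(
    results: List[PerInvocationResult], threshold: float = 0.5
) -> EvaluationResult:
    """Average the evaluable scores, or report NOT_EVALUATED if there are none."""
    evaluable = [
        r
        for r in results
        if r.score is not None and r.eval_status != EvalStatus.NOT_EVALUATED
    ]
    if not evaluable:
        # Nothing to average: return the not-evaluated aggregate instead of
        # dividing by zero.
        return EvaluationResult(
            overall_score=None, overall_eval_status=EvalStatus.NOT_EVALUATED
        )
    overall_score = sum(r.score for r in evaluable) / len(evaluable)
    # Assumed pass/fail rule for this sketch only.
    status = EvalStatus.PASSED if overall_score >= threshold else EvalStatus.FAILED
    return EvaluationResult(overall_score=overall_score, overall_eval_status=status)


# Regression-style check for the all-skipped / all-not-evaluated case.
skipped = [
    PerInvocationResult(score=None, eval_status=EvalStatus.NOT_EVALUATED),
    PerInvocationResult(score=0.7, eval_status=EvalStatus.NOT_EVALUATED),
]
result = aggregate_guarded(skipped)
assert result.overall_score is None
assert result.overall_eval_status == EvalStatus.NOT_EVALUATED
```

Returning `None` rather than `0.0` keeps "nothing could be scored" distinguishable from "everything scored zero", which matches the behavior the description attributes to other ADK evaluators.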

Validation

uv sync --extra test
uv run pytest tests/unittests/evaluation/test_final_response_match_v2.py

Result: 18 passed, 20 warnings.

Full unit suite was not run; this patch is limited to FinalResponseMatchV2 aggregation and its targeted unit test file.

pragnyanramtha · 2026-05-20T18:18:19Z

Refreshed this branch with current main in f61da061.

Validation rerun:

uv run --extra test pytest tests/unittests/evaluation/test_final_response_match_v2.py -q (18 passed)
uv run --extra dev pyink --check src/google/adk/evaluation/final_response_match_v2.py tests/unittests/evaluation/test_final_response_match_v2.py
git diff --check

pragnyanramtha · 2026-05-20T18:49:08Z

Refreshed this branch with current main in f7a83e9b.

Validation rerun:

uv run --extra test pytest tests/unittests/evaluation/test_final_response_match_v2.py -q (18 passed, 20 experimental warnings)
uv run --extra dev pyink --check src/google/adk/evaluation/final_response_match_v2.py tests/unittests/evaluation/test_final_response_match_v2.py
uv run --extra dev isort --check-only src/google/adk/evaluation/final_response_match_v2.py tests/unittests/evaluation/test_final_response_match_v2.py
git diff --check

fix(eval): handle no evaluated final response v2 results

f814359

pragnyanramtha marked this pull request as ready for review May 17, 2026 00:15

Merge branch 'main' into pragnyan/final-response-v2-no-eval-guard

22c8a0f

rohityan self-assigned this May 18, 2026

rohityan and others added 5 commits May 18, 2026 11:40

Merge branch 'main' into pragnyan/final-response-v2-no-eval-guard

1c75271

Merge remote-tracking branch 'upstream/main' into pragnyan/final-resp…

53ce1af

…onse-v2-no-eval-guard

Merge remote-tracking branch 'upstream/main' into pragnyan/final-resp…

060f329

…onse-v2-no-eval-guard

Merge branch 'main' into pragnyan/final-response-v2-no-eval-guard

095c893

Merge branch 'main' into pragnyan/final-response-v2-no-eval-guard

f004759

rohityan added the v2 (Affects only 2.0 version) label May 19, 2026

pragnyanramtha added 2 commits May 20, 2026 05:03

Merge remote-tracking branch 'upstream/main' into pragnyan/final-resp…

26095e5

…onse-v2-no-eval-guard

Merge remote-tracking branch 'upstream/main' into pragnyan/final-resp…

f61da06

…onse-v2-no-eval-guard

Merge remote-tracking branch 'upstream/main' into pragnyan/final-resp…

f7a83e9

…onse-v2-no-eval-guard

Merge remote-tracking branch 'upstream/main' into pragnyan/final-resp…

ac57c85

…onse-v2-no-eval-guard

pragnyanramtha wants to merge 11 commits into google:main from pragnyanramtha:pragnyan/final-response-v2-no-eval-guard

2 participants
