Skip to content

refactor(sdk): extract LLM response classification and dispatch from Agent.step()#2743

Open
VascoSch92 wants to merge 4 commits intomainfrom
vasco/refactoring
Open

refactor(sdk): extract LLM response classification and dispatch from Agent.step()#2743
VascoSch92 wants to merge 4 commits intomainfrom
vasco/refactoring

Conversation

@VascoSch92
Copy link
Copy Markdown
Contributor

@VascoSch92 VascoSch92 commented Apr 7, 2026

Summary

Extracts the monolithic LLM response handling block from Agent.step() into a pure classifier function and a dispatch mixin, following the existing CriticMixin pattern.

What changed

  • New file response_dispatch.py containing:
  • LLMResponseType (StrEnum) — classifies responses into TOOL_CALLS, CONTENT, REASONING_ONLY, EMPTY
  • classify_response() — pure function, no side effects, unit-testable without mocking the agent
  • _AgentProtocol — typed Protocol documenting the mixin's contract with its host class
  • ResponseDispatchMixin — handler methods for each response type, mixed into Agent
  • agent.py — the ~85-line if/else chain in step() replaced by a 10-line match dispatch. Agent now inherits from ResponseDispatchMixin. Net reduction of ~70 lines.
  • New test file test_response_dispatch.py — 15 parametrized classifier tests covering all 8 rows of the response matrix + edge cases, plus 3 mixin integration tests verifying dispatch behavior (FINISHED status, corrective nudge).

What did NOT change

All existing behavior is preserved exactly. No new features, no bug fixes — this is a pure structural refactoring.

Motivation

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the github CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:95e3a6b-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-95e3a6b-python \
  ghcr.io/openhands/agent-server:95e3a6b-python

All tags pushed for this build

ghcr.io/openhands/agent-server:95e3a6b-golang-amd64
ghcr.io/openhands/agent-server:95e3a6b-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:95e3a6b-golang-arm64
ghcr.io/openhands/agent-server:95e3a6b-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:95e3a6b-java-amd64
ghcr.io/openhands/agent-server:95e3a6b-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:95e3a6b-java-arm64
ghcr.io/openhands/agent-server:95e3a6b-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:95e3a6b-python-amd64
ghcr.io/openhands/agent-server:95e3a6b-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:95e3a6b-python-arm64
ghcr.io/openhands/agent-server:95e3a6b-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:95e3a6b-golang
ghcr.io/openhands/agent-server:95e3a6b-java
ghcr.io/openhands/agent-server:95e3a6b-python

About Multi-Architecture Support

  • Each variant tag (e.g., 95e3a6b-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 95e3a6b-python-amd64) are also available if needed

@VascoSch92 VascoSch92 requested a review from all-hands-bot April 7, 2026 14:25
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/agent
   agent.py2992093%99, 280, 284, 483–485, 487, 517–518, 525–526, 614, 879–880, 882, 911, 919–920, 954, 961
   response_dispatch.py64493%190, 275–277
TOTAL22584646771% 

all-hands-bot

This comment was marked as outdated.

@VascoSch92 VascoSch92 requested a review from all-hands-bot April 7, 2026 15:24
all-hands-bot

This comment was marked as outdated.

@VascoSch92 VascoSch92 requested a review from all-hands-bot April 7, 2026 16:05
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste - Textbook refactoring. Reduces 85 lines of branching to 10 lines of clean dispatch. Pure classifier is independently testable. Tests are real (not mocks). No technical issues found.

Note: Per repo policy, agent loop changes require human maintainer eval verification before merge (previously flagged and acknowledged).

@VascoSch92 VascoSch92 marked this pull request as ready for review April 7, 2026 16:18
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste - Textbook refactoring. Reduces 85 lines of branching to 10 lines of clean dispatch. Pure classifier is independently testable. Tests are real (not mocks). No technical issues found.

Note: Per repo policy, agent loop changes require human maintainer eval verification before merge (previously flagged and acknowledged).

@all-hands-bot
Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @VascoSch92, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@enyst
Copy link
Copy Markdown
Collaborator

enyst commented Apr 13, 2026

@OpenHands /codereview-roasted on this pr, post your feedback as a review with gh api and event.

@openhands-ai
Copy link
Copy Markdown

openhands-ai bot commented Apr 13, 2026

I'm on it! enyst can track my progress at all-hands.dev

Copy link
Copy Markdown
Collaborator

@enyst enyst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Taste Rating: Good taste

This is the right kind of refactor. The data flow is simpler now: classify once, dispatch once, keep the behavior in small handlers instead of burying it in one long branch pile. No new special cases, no API churn, and the tests exercise the real Agent.step() path instead of only asserting that mocks were called.

I reviewed the diff against main, checked the new classifier/dispatch split, and ran:

uv run pytest tests/sdk/agent/test_response_dispatch.py

Result: 19 passed.

Because this change touches Agent.step() and response/tool-dispatch control flow, I'm leaving a COMMENT review rather than an approval; per repo policy, a human maintainer should make the final call after lightweight eval/CI confidence.

VERDICT: ✅ Worth merging pending human eval-risk sign-off

KEY INSIGHT: The refactor improves the structure by making response classification the single decision point, while keeping behavior local to dedicated handlers and preserving the existing control flow.

AI disclosure: This review was generated by an AI assistant (OpenHands) on behalf of the user.

@openhands-ai
Copy link
Copy Markdown

openhands-ai bot commented Apr 13, 2026

Final summary of new work since the last summary:

Checklist:

  • Reviewed the PR in the requested /codereview-roasted style
  • Used GitHub API / gh api to post the feedback as a review with COMMENT event
  • Verified the relevant targeted tests pass
  • Made no code changes

Conciseness:

  • No repository files were modified
  • No extraneous changes were introduced

Status:

  • The request was completely addressed
  • All instructions were followed faithfully

@VascoSch92
Copy link
Copy Markdown
Contributor Author

@enyst I think we are good to merge or? :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants