-
Notifications
You must be signed in to change notification settings - Fork 140
[Openhands] we should add test to test this init_state will actually modify state in-place #1852
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This implements the TODO to test that Agent.init_state() modifies the ConversationState in-place by adding a SystemPromptEvent through the on_event callback. Tests verify: - State object identity is preserved after init_state - Events container identity is preserved - SystemPromptEvent is correctly added with agent's system message - SystemPromptEvent includes agent's tools - Duplicate init_state calls don't add duplicate SystemPromptEvents - Integration via Conversation class also modifies state in-place Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found several issues that need attention, primarily around the test design pattern where callbacks are doing the work that should be verified.
|
|
||
| def on_event(event): | ||
| events_collected.append(event) | ||
| state.events.append(event) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔴 Critical: The callback is manually appending to state.events, which means the test is doing the work it should be verifying. If the test name "modifies_state_in_place" means init_state should modify the state directly, then the callback should NOT be doing this work. Either:
- Remove the
on_eventcallback and verifyinit_statemodifies state directly (like test 5), OR - Rename the test to clarify it's testing the callback mechanism, not in-place modification by
init_stateitself
This same issue appears in lines 71, 100, and 128.
| assert id(state) == original_state_id, "State object identity should be preserved" | ||
| assert id(state.events) == original_events_id, ( | ||
| "Events container identity should be preserved" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Suggestion: These assertions verify object identity is preserved, which is good. However, if the callback is responsible for appending (line 38), then these assertions are less meaningful - the state object identity is preserved simply because nothing returned a new object. Consider testing without the manual append in the callback.
|
|
||
| tool_names = {tool.name for tool in system_prompt_event.tools} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Suggestion: The agent was initialized with "FinishTool" and "ThinkTool" (CamelCase) but you're asserting lowercase "finish" and "think". While this might be correct if names are normalized, add a comment explaining this expectation or verify it's consistent with the tool registration behavior.
| conv = Conversation( | ||
| agent=agent, | ||
| visualizer=None, | ||
| workspace=str(tmp_path), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟠 Important: Accessing private method _ensure_agent_ready() is a code smell. Consider whether there's a public API to achieve the same test goal, or document why accessing the private method is necessary here.
|
|
||
| def test_init_state_via_conversation_modifies_state_in_place(tmp_path): | ||
| """Test that init_state via Conversation also modifies state in-place. | ||
|
|
||
| This tests the integration path where init_state is called through the | ||
| Conversation class, verifying the state is properly modified. | ||
| """ | ||
| llm = LLM(model="test-model", usage_id="test-llm") | ||
| agent = Agent(llm=llm, tools=[]) | ||
|
|
||
| conv = Conversation( | ||
| agent=agent, | ||
| visualizer=None, | ||
| workspace=str(tmp_path), | ||
| ) | ||
| conv._ensure_agent_ready() | ||
|
|
||
| events_list = list(conv._state.events) | ||
| assert len(events_list) == 1, ( | ||
| "State should have exactly one event after conversation initialization" | ||
| ) | ||
| assert isinstance(events_list[0], SystemPromptEvent), ( | ||
| "The event should be a SystemPromptEvent" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟢 Nit: This test (without a custom callback) successfully demonstrates in-place modification, which suggests tests 1-4 should follow this pattern rather than having the callback do the appending. This test is actually a better model for what "in-place modification" means.
neubig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @subashselvan, thanks for the contribution! Please consider the bot review comments, at least the first one seems valid.
|
[Automatic Post]: It has been a while since there was any activity on this PR. @subashselvan, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up. |
Summary
This PR implements the TODO comment to add tests verifying that
Agent.init_state()modifies theConversationStatein-place.Changes
Added test file
tests/sdk/agent/test_init_state_in_place_modification.pywith the following tests:Related
TODO at
openhands-sdk/openhands/sdk/agent/agent.py:106Testing
All tests pass: