-
Notifications
You must be signed in to change notification settings - Fork 1.6k
preliminary browser reflection implementation #3138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
@nitpicker55555, here's my initial implementation. Let me know what you think of my approach, I also included an example file to show how it works, will deal with further refinement (exception handling, etc) in the next commit! |
Thanks for your contribution! Have you test it in wordle game or https://github.com/MinorJerry/WebVoyager/blob/main/data/WebVoyager_data.jsonl top 10 questions? |
will do! |
aa5652f
to
0ba17cb
Compare
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this empty file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on the reflection wrapper feature. I've reviewed the implementation and have some suggestions.
The current approach adds significant complexity by intercepting every browser action and asking an LLM whether to proceed or change the action. This doubles the API calls and introduces unpredictable behavior. Additionally,
clearing the agent's conversation history before each action (self.agent.reset()) removes valuable context that the agent needs to maintain state.
I'd suggest a simpler approach: instead of intercepting execution, we could add optional parameters like thinking and next_goal to the browser action methods. Here's how it could work:
def add_reasoning_params(func):
"""Add optional reasoning parameters without changing execution flow"""
@wraps(func)
async def wrapper(self, *args, thinking: Optional[str] = None,
next_goal: Optional[str] = None, **kwargs):
# Log reasoning if provided
if thinking:
logger.info(f"[{func.__name__}] Thinking: {thinking}")
if next_goal:
logger.info(f"[{func.__name__}] Next goal: {next_goal}")
# Execute original function without interference
return await func(self, *args, **kwargs)
# Update docstring to include new parameters
if func.__doc__:
additional_docs = """
Additional Parameters:
thinking (Optional[str]): Your reasoning for this action.
next_goal (Optional[str]): What you plan to do after this action.
"""
wrapper.__doc__ = func.__doc__.rstrip() + "\n" + additional_docs
return wrapper
This would allow agents to provide their reasoning when calling the toolkit:
await toolkit.browser_click(
ref="submit_button",
thinking="Form is complete, submitting to server",
next_goal="Wait for confirmation page to load"
)
0ba17cb
to
f8af6ab
Compare
@nitpicker55555 new implementation uploaded run |
@nitpicker55555 Latest experiment setup, results were 4 incorrect out of 50 for reflection mechanism, 5 incorrect for non-reflective |
Description
This is a preliminary implementation of a reflection mechanism to be included in the hybrid_browser-toolkit
Checklist
Go over all the following points, and put an
x
in all the boxes that apply.Fixes #issue-number
in the PR description (required)pyproject.toml
anduv lock
If you are unsure about any of these, don't hesitate to ask. We are here to help!