[Feature]: Retry on failure functionality #2221
Comments
Thanks for creating the issue!
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
Hi!
Hey @Vaishakh-SM, I think nobody is working on this, but @xingyaoww was thinking about adding multiple runs to evaluation. I think that would be a parallel effort though, because it would involve running multiple times and picking the best run, as opposed to restarting when the first try didn't work. If you'd be interested in taking a look, it'd be welcome!
This seems like an interesting problem! I'll take a look and get back to this sometime this week.
@neubig this is a really old issue. Just want to make sure: we haven't implemented this yet, right?
Yep, @xingyaoww is working on a critic that could help implement this.
Having now used OpenHands for a whole project, I can say that a retry in general would be nice. I've seen several failures: sometimes things could not be replaced and, far more often, they could not be generated because of the models' rate limits. The bigger your codebase gets, the more tokens are used and the more often you hit the rate limits.
What problem or use case are you trying to solve?
Sometimes models fail to do their job correctly, and we would benefit from starting all over from the beginning. There are a few examples of this in the agent literature:
Describe the UX of the solution you'd like
Ideally, this would be something that could be implemented in a general way, so that we could implement different strategies with a shared interface. For instance:
Then, when using OpenDevin, we could choose an option that says "retry N times when you get stuck", and select the strategy that is used to do so.
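One way to picture the shared interface and the "retry N times" option is the sketch below. It assumes the four hooks named in the technical section of this issue (initialize, verify, reset, message_on_failure); the names `RetryStrategy`, `run_with_retries`, and `run_agent` are hypothetical and are not existing OpenDevin APIs.

```python
from typing import Callable, Optional, Protocol


class RetryStrategy(Protocol):
    """Hypothetical shared interface; not an existing OpenDevin API."""

    def initialize(self) -> None: ...
    def verify(self) -> bool: ...
    def reset(self) -> None: ...
    def message_on_failure(self) -> Optional[str]: ...


def run_with_retries(
    run_agent: Callable[[Optional[str]], None],
    strategy: RetryStrategy,
    max_retries: int,
) -> bool:
    """Run the agent once, then retry up to max_retries times while verify() fails."""
    strategy.initialize()  # capture the starting state once
    hint: Optional[str] = None  # no extra message on the first attempt
    for attempt in range(max_retries + 1):
        run_agent(hint)
        if strategy.verify():
            return True
        if attempt < max_retries:
            # Only reset when another attempt will follow (an assumed design
            # choice), so the final failed state stays available for inspection.
            strategy.reset()
            hint = strategy.message_on_failure()
    return False
```

With this shape, choosing a strategy amounts to passing a different object that provides the same four methods, and the retry loop itself stays unchanged.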
Do you have thoughts on the technical implementation?
The actual reset strategies would vary based on the task. For instance:
initialize: save the current git commit of the repository as commit_id
verify: tests + linting pass
reset: git checkout commit_id
message_on_failure: no-op

initialize: save the current web page as initial_page
verify: the reward model is positive
reset: goto(initial_page)
message_on_failure: reflexion prompt

This could either be integrated into OpenDevin, allowing for retries in the main app as well
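As an illustration, the repository-based strategy (save a commit, check tests and linting, `git checkout` back) might be sketched roughly as follows. The class name `GitResetStrategy` and the `check_cmd` parameter are assumptions made for this example; the issue does not specify how tests and linting are invoked.

```python
import subprocess
from typing import Optional, Sequence


class GitResetStrategy:
    """Hypothetical strategy for the git example; all names are illustrative."""

    def __init__(self, repo_dir: str, check_cmd: Sequence[str]) -> None:
        self.repo_dir = repo_dir
        # Assumed: one command that runs tests + linting and exits 0 on success.
        self.check_cmd = list(check_cmd)
        self.commit_id: Optional[str] = None

    def _git(self, *args: str) -> str:
        """Run a git subcommand in the repository and return its stdout."""
        result = subprocess.run(
            ["git", *args],
            cwd=self.repo_dir,
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout.strip()

    def initialize(self) -> None:
        # Save the current git commit of the repository as commit_id.
        self.commit_id = self._git("rev-parse", "HEAD")

    def verify(self) -> bool:
        # Success means the check command exited with status 0.
        return subprocess.run(self.check_cmd, cwd=self.repo_dir).returncode == 0

    def reset(self) -> None:
        if self.commit_id is None:
            raise RuntimeError("initialize() must be called before reset()")
        self._git("checkout", self.commit_id)

    def message_on_failure(self) -> Optional[str]:
        # No-op: the agent restarts without any extra message.
        return None
```

Note that a plain `git checkout commit_id` does not discard uncommitted changes in the working tree, so a real implementation would probably need additional cleanup before or after the checkout.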
Additional context: