
[Feature Request] Integrate WebCanvas #33

Open
1 task done
dandansamax opened this issue Sep 5, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@dandansamax
Collaborator

Required prerequisites

  • I have searched the Issue Tracker and confirmed that this hasn't already been reported. (+1 or comment there if it has.)

Motivation

"WebCanvas: Benchmarking Web Agents in Online Environments" is an advanced web agent benchmark framework that shares several ideas with CRAB.

WebCanvas provides three main components:

  1. A novel evaluation metric that reliably captures the critical intermediate actions or states necessary for task completion while disregarding noise caused by insignificant events or changed web elements.
  2. A benchmark dataset called Mind2Web-Live, a refined version of the original static Mind2Web dataset, containing 542 tasks with 2439 intermediate evaluation states.
  3. Lightweight and generalizable annotation tools and testing pipelines that enable the community to collect and maintain a high-quality, up-to-date dataset.

We should consider integrating the WebCanvas dataset, which fits well into CRAB.

Solution

No response

Additional context

No response

@dandansamax dandansamax added the enhancement New feature or request label Sep 5, 2024