
test: add regression tests for act() model override #2205

Open
MaitreyeeDeshmukh wants to merge 1 commit into browserbase:main from MaitreyeeDeshmukh:fix/act-model-override


MaitreyeeDeshmukh commented Jun 7, 2026


Fixes #1263, #1347

stagehand.act() threw "An unexpected error occurred" when a model override was passed as an option, while stagehand.observe() worked with the same option. The underlying fix already exists in the codebase, but no tests documented or protected the correct behavior.

This PR adds 4 unit tests that exercise ActHandler directly (no browser or LLM needed):

  • String model override is forwarded to resolveLlmClient
  • Object model override is forwarded to resolveLlmClient
  • No override passes undefined (default behavior unchanged)
  • The resolved override client reaches actInference

All tests pass. No existing tests were affected.
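
For readers without the diff at hand, the behavior these tests protect can be sketched as below. This is a minimal illustration, not the code in this PR: `SketchActHandler`, `FakeClient`, `ModelOverride`, and the hand-rolled spies are hypothetical stand-ins, and Stagehand's real `ActHandler`, `resolveLlmClient`, and `actInference` signatures are not reproduced here. The sketch only assumes that `act()` passes its per-call `model` option to a resolver callback and hands whatever client comes back to the inference step.

```typescript
// Hypothetical stand-ins for illustration only; these are NOT Stagehand's real types.
type ModelOverride = string | { modelName: string; apiKey?: string };

interface FakeClient {
  id: string;
}

type Resolver = (model?: ModelOverride) => FakeClient;
type Inference = (args: {
  instruction: string;
  llmClient: FakeClient;
}) => Promise<{ found: boolean }>;

// Simplified handler: resolve the per-call client, then run inference with it.
class SketchActHandler {
  private readonly resolveLlmClient: Resolver;
  private readonly infer: Inference;

  constructor(resolveLlmClient: Resolver, infer: Inference) {
    this.resolveLlmClient = resolveLlmClient;
    this.infer = infer;
  }

  async act(params: {
    instruction: string;
    model?: ModelOverride;
  }): Promise<{ success: boolean }> {
    // `undefined` means "no override"; the resolver decides what the default is.
    const llmClient = this.resolveLlmClient(params.model);
    const result = await this.infer({
      instruction: params.instruction,
      llmClient,
    });
    return { success: result.found };
  }
}

// Hand-rolled spies so the sketch needs no test framework.
const defaultClient: FakeClient = { id: "default" };
const overrideClient: FakeClient = { id: "override" };
const resolverCalls: Array<ModelOverride | undefined> = [];
const inferenceClients: FakeClient[] = [];

const spyResolver: Resolver = (model) => {
  resolverCalls.push(model);
  return model === undefined ? defaultClient : overrideClient;
};

const spyInference: Inference = async ({ llmClient }) => {
  inferenceClients.push(llmClient);
  return { found: llmClient === overrideClient };
};

const sketchHandler = new SketchActHandler(spyResolver, spyInference);

// Mirrors the four cases listed above: string, object, none, and propagation.
async function runChecks(): Promise<void> {
  await sketchHandler.act({ instruction: "click login", model: "provider/model-a" });
  await sketchHandler.act({
    instruction: "click login",
    model: { modelName: "provider/model-b", apiKey: "test-key" },
  });
  await sketchHandler.act({ instruction: "click login" });

  console.log("resolver saw:", resolverCalls);
  console.log("inference used:", inferenceClients.map((c) => c.id));
}

void runChecks();
```

In a Vitest suite the spies would typically be `vi.fn()` mocks, and the assertions would check the arguments recorded on those mocks instead of plain arrays.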


Summary by cubic

Adds regression tests to ensure stagehand.act() forwards the per-call model override to resolveLlmClient and uses the resolved client in actInference. Covers string and object overrides, the no-override default, and parity with stagehand.observe(); fixes #1263 and #1347.

Written for commit a327b17. Summary will update on new commits.


…e#1263 browserbase#1347)

Verify that ActHandler.act() correctly forwards the per-call `model`
option to resolveLlmClient — matching the pattern already used by
ObserveHandler.observe().  Covers string overrides, object overrides,
the no-override default, and end-to-end propagation to actInference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

changeset-bot Bot commented Jun 7, 2026


⚠️ No Changeset found

Latest commit: a327b17

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types



github-actions Bot commented Jun 7, 2026

Contributor

This PR is from an external contributor and must be approved by a stagehand team member with write access before CI can run.
Approving the latest commit mirrors it into an internal PR owned by the approver.
If new commits are pushed later, the internal PR stays open but is marked stale until someone approves the latest external commit and refreshes it.

github-actions Bot added the labels external-contributor (Tracks PRs mirrored from external contributor forks.) and external-contributor:awaiting-approval (Waiting for a stagehand team member to approve the latest external commit.) Jun 7, 2026

cubic-dev-ai Bot left a comment

Contributor


No issues found across 1 file

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant Test as Vitest Test
    participant Handler as ActHandler
    participant Utils as actHandlerUtils
    participant Snapshot as A11y Snapshot
    participant LLM as LLMClient
    participant resolveLlm as resolveLlmClient
    participant Inference as actInference

    Note over Test,Inference: Unit test environment (no browser, no real LLM)

    Test->>Handler: new ActHandler(defaultClient, ...)
    Note over Handler: Constructor stores default client, model name, <br/>client options, and resolveLlmClient callback

    alt Test: String model override
        Test->>Handler: handler.act({ model: "anthropic/claude-sonnet-4-20250514" })
        Handler->>resolveLlm: resolveLlmClient("anthropic/claude-sonnet-4-20250514")
        resolveLlm-->>Handler: overrideClient
        Handler->>Handler: Uses overrideClient for inference, not default
    else Test: Object model override
        Test->>Handler: handler.act({ model: { modelName, apiKey } })
        Handler->>resolveLlm: resolveLlmClient({ modelName, apiKey })
        resolveLlm-->>Handler: overrideClient
    else Test: No model override
        Test->>Handler: handler.act({ instruction, page })
        Handler->>resolveLlm: resolveLlmClient(undefined)
        resolveLlm-->>Handler: defaultClient
    end

    Handler->>Utils: waitForDomNetworkQuiet(page)
    Utils-->>Handler: resolved

    Handler->>Snapshot: captureHybridSnapshot(page)
    alt Empty snapshot (tests 1-3)
        Snapshot-->>Handler: { combinedTree: "", combinedXpathMap: {} }
        Handler->>Inference: act({ llmClient: overrideClient|defaultClient, ... })
        Note over Inference: Infer action from snapshot
        Inference-->>Handler: { element: undefined, ... }
        Handler-->>Test: { success: false, message: "No action found" }
    else Non-empty snapshot (test 4)
        Snapshot-->>Handler: { combinedTree: "[0-1] button 'Login'", combinedXpathMap: { "0-1": "/html/body/button" } }
        Handler->>Inference: act({ llmClient: overrideClient, ... })
        Note over Inference: Override client is used, NOT defaultClient
        Inference-->>Handler: { element: found, ... }
        Handler-->>Test: success result
    end

    Note over Test: Verify resolveLlmClient was called with correct argument
    Note over Test: Verify actInference received the resolved override client (test 4)



Development

Successfully merging this pull request may close these issues.

Act with Model Option fails

1 participant