fix(e2e): stabilize shell submission and review seeding #20189

Open
Hona wants to merge 8 commits into anomalyco:dev from Hona:fix/e2e-shell-review-flakes

Hona (Member) commented Mar 31, 2026

Summary

  • fill the shell prompt through the contenteditable editor after switching into shell mode so the typed command is present before Enter submits
  • reuse the shared e2e seed helper for review patch setup so apply_patch seeding retries until the expected diff state appears
  • keep the existing review patch prompt constraints intact while probing the specific diff state each test needs
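The first bullet can be sketched roughly as below. This is an illustration under assumptions, not the code in this PR: `PromptLike` is a hypothetical structural stand-in for a Playwright `Locator` (it only names `fill`, `textContent`, and `press`, which real locators provide), and `submitShellCommand` is an invented helper name.

```typescript
// Hypothetical structural subset of a Playwright Locator. A real Locator
// satisfies this shape, and a unit test can pass a simple fake instead.
interface PromptLike {
  fill(value: string): Promise<void>
  textContent(): Promise<string | null>
  press(key: string): Promise<void>
}

// Fill the contenteditable editor, confirm the command actually landed,
// and only then press Enter to submit it.
async function submitShellCommand(prompt: PromptLike, command: string): Promise<void> {
  await prompt.fill(command)

  const text = (await prompt.textContent()) ?? ""
  if (!text.includes(command)) {
    // Refuse to submit an empty or partial prompt.
    throw new Error(`shell prompt is missing the command before submit: ${JSON.stringify(text)}`)
  }

  await prompt.press("Enter")
}
```

In a real spec, a retrying assertion such as `await expect(prompt).toContainText(command)` would probably replace the one-shot check, since the editor may update asynchronously after `fill()`.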

Testing

  • bun run typecheck
  • bun run test:e2e:local -- e2e/prompt/prompt-shell.spec.ts
  • bun run test:e2e:local -- e2e/session/session-review.spec.ts -g "review applies inline comment clicks without horizontal overflow|review file comments submit on click without clipping actions"

Fill the shell prompt through the contenteditable editor after switching to shell mode and reuse the shared seeded patch flow so CI does not fail when the patch tool call is missed.
Hona requested a review from adamdotdevin as a code owner March 31, 2026 05:21
Copilot AI review requested due to automatic review settings March 31, 2026 05:21
Copilot AI (Contributor) left a comment

Pull request overview

This PR aims to reduce e2e flakiness by making shell command submission deterministic in the contenteditable prompt and by making review patch seeding retry until the expected diff state is observable.

Changes:

  • Switch shell-mode typing to prompt.fill() and assert the command is present before submitting.
  • Reuse the shared seed helper for review patch seeding with per-test diff-state probes.
  • Export the shared seed helper from e2e/actions.ts for reuse.
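A seed helper that takes a per-test probe might look something like the sketch below. The signature is an assumption made for illustration: `seedUntil`, `SeedOptions`, `attempts`, and `delayMs` are hypothetical names, and the helper actually exported from e2e/actions.ts may be shaped differently.

```typescript
// Hypothetical types and names; not the helper exported by e2e/actions.ts.
type Probe = () => Promise<boolean>

interface SeedOptions {
  seed: () => Promise<void> // e.g. send the prompt that should trigger apply_patch
  probe: Probe              // per-test check that the expected diff state is visible
  attempts?: number
  delayMs?: number
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Re-run the seed step until the probe reports the expected state.
// Resolves with the 1-based attempt number that succeeded.
async function seedUntil({ seed, probe, attempts = 3, delayMs = 50 }: SeedOptions): Promise<number> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    await seed()
    if (await probe()) return attempt
    // Pause between attempts, but not after the final one.
    if (attempt < attempts) await sleep(delayMs)
  }
  throw new Error(`seed did not produce the expected state after ${attempts} attempts`)
}
```

Each spec would then supply only its own probe (for example, "the diff contains the inline-comment file"), while the retry policy stays in one place.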

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/app/e2e/session/session-review.spec.ts | Refactors patch seeding to use shared seed helper and adds diff probes to wait for specific seeded states. |
| packages/app/e2e/prompt/prompt-shell.spec.ts | Uses fill() on the contenteditable prompt and asserts command text exists before pressing Enter. |
| packages/app/e2e/actions.ts | Exports the shared seed helper so specs can reuse it. |

Hona added 6 commits March 31, 2026 15:35
Retry the review patch seed only after the previous prompt has gone idle, catch transient probe failures when checking diff state, and remove redundant diff assertions that are now covered by the patch helper.
Abort stale review seeding runs before retrying and only accept seeded diffs when they contain the exact expected files and marks. This prevents the test from proceeding on hallucinated review files while still retrying real apply_patch misses.
Fail the review seed helper immediately when the build agent returns an assistant error so CI logs show provider status and body details instead of timing out on an empty diff poll.
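The last two commit messages describe two checks: accept a seeded diff only when it matches the expected files and marks exactly, and fail fast when the assistant reports an error. A rough sketch of both follows. All shapes here (`DiffFile`, `ExpectedFile`, `AssistantReply`) and both function names are hypothetical, chosen only to illustrate the idea.

```typescript
// Hypothetical shapes; the real diff and message types in the PR may differ.
interface DiffFile { file: string; content: string }
interface ExpectedFile { file: string; mark: string }
interface AssistantReply { error?: { status: number; body: string } }

// Accept the diff only if it contains exactly the expected files, each with
// its marker text. Extra files (for example hallucinated ones) are rejected.
// Assumes the file names in `expected` are unique.
function matchesSeed(diff: DiffFile[], expected: ExpectedFile[]): boolean {
  const got = new Map(diff.map((d): [string, string] => [d.file, d.content]))
  if (got.size !== diff.length || got.size !== expected.length) return false
  return expected.every(({ file, mark }) => got.get(file)?.includes(mark) === true)
}

// Surface provider status and body immediately instead of letting the
// caller time out while polling an empty diff.
function assertNoAssistantError(reply: AssistantReply): void {
  if (reply.error) {
    throw new Error(`assistant error ${reply.error.status}: ${reply.error.body}`)
  }
}
```

Keeping the match strict means a retry is triggered both when apply_patch was never called and when it touched the wrong files.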
Hona (Member, Author) commented Mar 31, 2026

/review

if (!idle) continue

const next = await waitProbe(probe, 10_000)
if (next) return
Contributor

Suggestion: Consider renaming next to ok or done. The variable holds a boolean result, and next reads as "the next value" rather than "did it succeed". Renaming it would make the intent clearer at the call site.
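For illustration, here is how the rename might read inside a polling helper. The body below is a hypothetical reimplementation written for this comment; the real `waitProbe` in the PR is not shown in the thread and may behave differently (the `intervalMs` parameter in particular is an assumption).

```typescript
// Hypothetical reimplementation; not the waitProbe from this PR.
// Polls `probe` until it resolves true or the timeout elapses.
async function waitProbe(
  probe: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs = 25, // assumed polling interval
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs
  do {
    // A rejected probe is treated as "not ready yet" rather than a failure.
    const ok = await probe().catch(() => false)
    if (ok) return true
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
  } while (Date.now() < deadline)
  return false
}
```

At the call site, `const ok = await waitProbe(probe, 10_000)` followed by `if (ok) return` then reads as a success check rather than as fetching a next value.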

2 participants