Skip to content

[test] Add docker-compose to faciliate browser test execution #4441

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

luke-mino-altherr
Copy link

@luke-mino-altherr luke-mino-altherr commented Jul 13, 2025

Summary

Improves Playwright browser test infrastructure by adding Docker support and fixing cross-platform compatibility issues.

Problem

The existing browser test setup was complex and had several issues:

  • Platform-specific screenshot issues: Tests generated Linux-specific screenshots that failed on macOS due to font rendering differences
  • Complex local setup: Required manually installing and configuring ComfyUI backend, managing dependencies, and setting up test environment
  • User context errors: Backend threw "Unknown user: default" errors in multi-user mode when tests made API calls without proper user context
  • Inconsistent test environment: Different developers had varying local setups leading to flaky tests

Solution

Docker Infrastructure

  • Dockerfile: Multi-stage build for ComfyUI with built frontend, includes system dependencies for PyTorch
  • docker-compose.test.yml: Orchestrates ComfyUI backend and Playwright test containers with proper networking and health checks
  • Dockerfile.playwright: Dedicated container for running Playwright tests with all browsers pre-installed
  • npm run test:browser:docker: Single command to run entire test suite in containers

Cross-Platform Compatibility

  • Screenshot tolerance: Added 3% pixel difference tolerance for local testing to handle font rendering variations
  • Gitignore patterns: Ignore platform-specific screenshot files (e.g., *-darwin.png) to prevent commit conflicts

User Context Fixes

  • Default user creation: Global setup now creates a "default" user in users.json to fix errors logged in the backend API logs.

Benefits

  • One-command testing: npm run test:browser:docker runs complete test suite
  • Platform consistency: Tests run in identical Linux containers regardless of host OS
  • Easier onboarding: New developers don't need complex local ComfyUI setup
  • Reliable CI/CD: Consistent containerized environment eliminates "works on my machine" issues
  • Faster debugging: HTML reports and screenshots automatically generated and accessible via browser

Usage

# Run all browser tests in Docker
npm run test:browser:docker

# View test results at http://localhost:9323 (if report server starts)

┆Issue is synchronized with this Notion page by Unito

@luke-mino-altherr luke-mino-altherr requested review from a team as code owners July 13, 2025 17:52
@luke-mino-altherr luke-mino-altherr force-pushed the browser-tests-dockerized branch 3 times, most recently from 3fdd3a0 to ed38526 Compare July 13, 2025 18:55
- Add Dockerfile for containerized ComfyUI with frontend build
- Add docker-compose.test.yml for running tests in isolation
- Add Dockerfile.playwright for Playwright test container
- Add npm script for running Docker-based browser tests
- Configure proper networking, health checks, and volume mounts
- Set up HTML report generation and port forwarding
@luke-mino-altherr luke-mino-altherr force-pushed the browser-tests-dockerized branch from ed38526 to 672013a Compare July 13, 2025 19:12
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the primary issue with local testing is screenshots due to things like FreeType vs. DirectWrite, isn't this change the only thing that's necessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this is not something we can merge. 😅

At only 1024x768, this ignores anything that changes less than 23,000 pixels. It would cripple the usefulness of playwright.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am running on a mac. Since the only screenshots committed to the repository are linux based screenshots, i need to execute these tests in a linux based execution environment in order to properly validate there are no visual regressions.

Also as a new developer, i found the initial setup complex -- Docker cleanly abstracts that initial setup and server startup.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this is not something we can merge. 😅

At only 1024x768, this ignores anything that changes less than 23,000 pixels. It would cripple the usefulness of playwright.

Ah fair point. We could lower the tolerance? I was most often seeing differences at less than 1%. Also note, this is only in effect during local testing. Github Actions will still check for exact matches

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm super curious about the discrepancies, and wonder if it's just a difference between the image GH actions uses and the docker image? As the display should be virtualised too, we may be able to get pixel-perfect.

I can take a look on mac/windows and see if anything stands out (likely within a day or two).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this is not something we can merge. 😅

At only 1024x768, this ignores anything that changes less than 23,000 pixels. It would cripple the usefulness of playwright.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants