[Question]: React: Flaky screenshots (pixel shift) #7548
Comments
I’m facing the same problem with flaky visual regression tests. I’m currently migrating our test suite from backstop.js to Playwright. With backstop.js I never had these pixel-shift problems (but several others), so I think they might come from the different image-diffing tool Playwright uses.
@lo1tuma I'm working on an enhanced image snapshot matcher that supports more image comparison algorithms as well as blurring of snapshot and test images. I will report back here once things have matured a bit. Currently I can get rid of a lot of flakiness by using SSIM and slight blurring (by 1-2 pixels).
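To illustrate why a slight blur can absorb antialiasing differences, here is a minimal sketch in plain TypeScript. It is not the matcher mentioned above: the `Gray` type, `boxBlur`, `diffRatio`, `makeEdgeImage`, and the per-pixel tolerance of 30 are all hypothetical names and values chosen for this example, and a simple box blur stands in for whatever blur a real tool applies.

```typescript
// A grayscale image stored row by row, one value (0-255) per pixel.
type Gray = { width: number; height: number; data: number[] };

// Box blur: each pixel becomes the mean of its neighbours within `radius`.
// Pixels at the border average only over the neighbours that exist.
function boxBlur(img: Gray, radius = 1): Gray {
  const { width, height, data } = img;
  const out: number[] = new Array(width * height).fill(0);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0;
      let count = 0;
      for (let dy = -radius; dy <= radius; dy++) {
        for (let dx = -radius; dx <= radius; dx++) {
          const nx = x + dx;
          const ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          sum += data[ny * width + nx];
          count++;
        }
      }
      out[y * width + x] = sum / count;
    }
  }
  return { width, height, data: out };
}

// Fraction of pixels whose absolute difference exceeds `tolerance`.
function diffRatio(a: Gray, b: Gray, tolerance: number): number {
  let differing = 0;
  for (let i = 0; i < a.data.length; i++) {
    if (Math.abs(a.data[i] - b.data[i]) > tolerance) differing++;
  }
  return differing / a.data.length;
}

// 5x5 image with a vertical black-to-white edge; column 2 is the antialiased pixel.
function makeEdgeImage(edgeValue: number): Gray {
  const width = 5;
  const height = 5;
  const data: number[] = [];
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      data.push(x < 2 ? 0 : x === 2 ? edgeValue : 255);
    }
  }
  return { width, height, data };
}

const reference = makeEdgeImage(128); // edge rendered one way
const actual = makeEdgeImage(64); // same edge, slightly different antialiasing

const rawDiff = diffRatio(reference, actual, 30);
const blurredDiff = diffRatio(boxBlur(reference), boxBlur(actual), 30);
console.log(rawDiff, blurredDiff); // prints: 0.2 0
```

Without blurring, the whole edge column (5 of 25 pixels) exceeds the tolerance. After blurring both images, the difference of 64 is spread over three columns (about 21 per pixel) and no pixel exceeds it. The trade-off is that a blur can also hide small genuine regressions, and both the reference and the test image have to be blurred the same way.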
We migrated the visual regression tests of a moderately complex React app to Playwright and the Playwright test runner. Here are a few of our findings regarding test flakiness. Hope this helps people in a similar situation.
1. Disabling of animations and images
Animations
In our case we used material-ui. This enabled us to disable animations in a centralized theme file with a flag when we are in a test environment. You could also inject some custom styles into your page to disable animations:
import { Page } from "@playwright/test";
type Params = { page: Page };
export const disableAnimations = async ({ page }: Params): Promise<void> => {
await page.addStyleTag({
content: `*,
*::before,
*::after {
-moz-animation: none !important;
-moz-transition: none !important;
animation: none !important;
caret-color: transparent !important;
transition: none !important;
}`,
});
};
This depends on how your app is designed. But since animations make tests very time-sensitive, they inevitably introduce flakiness.
Images
The same goes for images: they introduce network load, so we cancel all image requests. Instead of cancelling, you could also mock all image requests so that they return a placeholder.
import { Route } from "@playwright/test";
const cancelImageRequests = (route: Route) =>
route.request().resourceType() === "image" ? route.abort() : route.continue();
await page.route("**/*", cancelImageRequests);
2. Using a Docker image that matches the GitHub Actions runner OS for reference image creation
Font rendering varies greatly between operating systems (as well as OS versions) and browsers. Reference images should be created with a setup as close as possible to your CI. Our GitHub Actions workflows run on Ubuntu 20. Therefore we use a Playwright Docker image for local testing and snapshot generation. These are our package.json scripts:
"scripts": {
"start": "HTTPS=true REACT_APP_ENV=development react-scripts start",
"start:test": "HTTPS=true REACT_APP_ENV=test react-scripts start",
"build:development": "REACT_APP_ENV=development react-scripts build",
"build:test": "REACT_APP_ENV=test react-scripts build",
"build:production": "REACT_APP_ENV=production react-scripts build",
"test:ci": "npx playwright test",
"pretest": "npm run build:test",
"test": "docker run -it --rm --ipc=host -v \"${PWD}:/var/app/\" mcr.microsoft.com/playwright:focal /bin/bash -c 'cd /var/app; npx playwright install; npx playwright test'"
},
As you can see, we set an environment variable with the current environment (test, ci, production, etc.). Based on this environment we disable, for example, animations, API endpoints, etc. The script
3. Page / component readiness
With React's virtual DOM we experienced some issues when it comes to knowing when a page is fully rendered and ready. We came up with the following solution:
A: Add a rendered class to the body when a screen is fully mounted
How you do this depends heavily on your setup / app structure. In our case we tested single screens/pages instead of single components.
// useHasRenderedClass.ts
import * as React from "react";
export const useHasRenderedClass = (className = "rendered") => {
React.useEffect(() => {
if (canUseAnimation) {
// canUseAnimation is a bool flag, if REACT_APP_ENV=test it is false, in production it is true (see point 1 above)
return;
}
document.body.classList.add(className);
return function cleanUp() {
document.body.classList.remove(className);
};
}, [className]);
};
// in your component
const SomeScreenYouTest: React.FC = () => {
useHasRenderedClass();
return <h1>My awesome screen</h1>;
};
B: Await page readiness with a custom function
With the setup above we can now wait for page readiness in Playwright with a custom function:
export const pageLoaded = () =>
(document as any).fonts.check("12px eb-garamond") && document.body.classList.contains("rendered");
// in your tests
await Promise.all([
page.click(`[data-test-id="nav-element-some-page"]`), // or page.goto("some-url"),
page.waitForFunction(pageLoaded)
]);
In the example above we also checked that a font is available. We are using Adobe Fonts, which introduces an external dependency with a network load. Mocking this made no sense, since without the correct font our layout shifted so much that the whole point of running a visual regression suite was moot. The font check above provided a nice workaround.
4. Slightly blur screenshots before comparing them
Our app is very text-heavy. We blur our snapshots by 2 pixels before image comparison. This helps a lot with text antialiasing issues. This is something that @playwright/test currently does not support. Therefore we ported jest-image-snapshot to Playwright: https://github.com/florianbepunkt/playwright-image-snapshot
If someone from the Playwright team reads this: it would help tremendously if you could incorporate a blurring option into the image comparison part.
Our snapshot settings look like this. The comparison threshold is quite high; this could be fine-tuned for individual tests.
import { ImageSnapshotOptions } from "playwright-image-snapshot";
export const BASE_URL = "http://localhost:3000";
export const SNAPSHOT_SETTINGS: ImageSnapshotOptions = {
blur: 2,
comparisonAlgorithm: "pixelmatch",
failureThreshold: 0.2,
failureThresholdType: "percent",
};
5. Retries
With all of the above we were able to reduce test flakiness considerably. Still, we experienced some cases where we had to retry a test, so we configured Playwright to retry each test up to 3 times.
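The retry setting from point 5 could be expressed in the Playwright Test configuration file roughly as follows. This is a sketch, not the configuration used in the comment above: it assumes a `playwright.config.ts` at the project root and shows only the `retries` option, written as a plain object so the snippet needs no imports.

```typescript
// playwright.config.ts (sketch)
// In a real project this object would usually be typed with
// PlaywrightTestConfig from "@playwright/test"; that import is omitted here.
const config = {
  // Maximum number of retry attempts for a failing test.
  retries: 3,
};

export default config;
```

Retries mask flakiness rather than remove it, so they work best as a last safety net after the other measures.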
@florianbepunkt thanks for sharing your insights. In our project we already do most of the points you mentioned except (1) and (3), which we are handling via I’ve also found this list of
An interesting option is
Another thing I’ve noticed is that every time I update the snapshots, almost all images change, even though the tests were not failing before (due to the anti-aliasing detection in pixelmatch). This is quite annoying, as it makes it very hard to review which changes are intentional and which are not. So I was wondering whether it would be possible to apply some filters (e.g. anti-aliasing, blurring, etc.) before we save the snapshot or compare against it. I’m not sure whether there are any good libraries for doing things like that, but I think this would be the superior solution, since it would also reduce the noise when updating the snapshots.
The integration tests seem to have nondeterministic failures regarding antialiasing; this allows the test runner to retry 3 times to reduce the number of failed builds. Refs #388, microsoft/playwright#7548
I've spent quite a while trying various options to reduce the number of failures due to antialiasing differences, without success. This takes a bit more of an extreme option to blur the screenshot slightly to try and normalise it. The goal of the test isn't pixel-precision, so this should be ok. Ideally, this will be a feature in Playwright (so the reference image doesn't have to be blurred too). Refs #388, microsoft/playwright#7548
Thanks for the pointers @florianbepunkt. I ran into a lot of trouble when using the provided Docker image. I tried various options without luck, so I've ended up blurring the screenshots manually: PREreview/prereview@d36ae9d. Hopefully this could become a feature in Playwright to avoid having to blur the reference image (currently it would have to be regenerated when changing the blur level).
Apologies for the off-topic.
@andreyfel There are two reasons:
@lo1tuma Thanks for your reply!
Folks, we're exploring how we can improve Visual Regression Testing with Playwright Test. The umbrella bug is #8161, and I'll close this in favor of that one. If you have anything to share regarding VRT, please do so there!
Your question
I have a lot of visual regression tests that are quite flaky. Inside our React app I use a hook that runs after the screen under test has been mounted. This hook adds a "rendered" CSS class to the document's body. Inside Playwright I await this class as an indicator that the React screen has loaded.
Still, I get very flaky and often failing tests that look like the following diffs. Both reference and test images are created on the same machine. It looks to me like the DOM has not been fully rendered yet or some other minimal pixel shift is happening.
Any ideas how this can be improved or what might be the cause of this?
Example 1 Diff:
Example 2 Diff:
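For reference, the wait on the "rendered" class described in the question could look like the sketch below. `waitForRendered` and `PageLike` are hypothetical names introduced here; `PageLike` only mirrors the `waitForSelector(selector, options)` method that Playwright's `Page` provides, so the snippet stays self-contained. It is not the code from the question.

```typescript
// Minimal structural stand-in for Playwright's Page.
// Assumption: only waitForSelector is needed for this check.
interface PageLike {
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
}

// Resolves once <body> carries the given class; rejects if the
// underlying waitForSelector call rejects (for example on timeout).
async function waitForRendered(
  page: PageLike,
  className = "rendered",
  timeout = 10000,
): Promise<void> {
  await page.waitForSelector(`body.${className}`, { timeout });
}
```

In a test, a Playwright `page` object could be passed directly, for example `await waitForRendered(page)` after navigation and before taking the screenshot.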