Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Image size must match baseline" error #49

Closed
dtipson opened this issue Feb 24, 2018 · 16 comments · Fixed by arxiv-vanity/engrafo#291
Closed

"Image size must match baseline" error #49

dtipson opened this issue Feb 24, 2018 · 16 comments · Fixed by arxiv-vanity/engrafo#291

Comments

@dtipson
Copy link

dtipson commented Feb 24, 2018

I'm trying to use this tool to compare a site at two different states using puppeteer's image screenshots. But due to css and content differences between the two different states, the heights of the full pages will differ. This then prevents the diff from running because the full height of the image files won't be of the same dimensions.

There are a bunch of arbitrary ways to fix this, like clipping the shots to the custom needs of any given difference. But it would be really useful if instead image-snapshot had a mode where it would just compare the diff of only the height of the shorter of the two snapshots, or in some other way allow heights to differ (and just count the differing height space as complete 100% pixel diff).

@anescobar1991
Copy link
Member

@dtipson would this help?

@anescobar1991
Copy link
Member

@thomasbertet @sergeybekrin let's discuss here. I definitely see the need now for what you guys are all requesting (now that I understand fully).

@thomasbertet
Copy link
Contributor

I can back @dtipson on this @anescobar1991. This is the exact same thing I would love to have :)

@anescobar1991
Copy link
Member

anescobar1991 commented Mar 2, 2018

So in my mind what I am thinking is that we don't throw on size mismatches as is done today, but instead automatically fail the test and then either:

a). fill in the "extra" space on the smaller of the images and then pass it to pixelmatch (which would produce a diff image that is pretty meaningless but at least includes both the baseline image and the actual image for human eye comparison). If this approach is taken we need to make sure that the image that is then stored as a new baseline snapshot (if --update flag is passed) is the original and not the one we modified in order to match the baseline's dimensions.

or

b). skip pixelmatch comparison and generate a "diff image" that includes the baseline image, an image with a message letting you know that no diff was generated because of size mismatch (in place of where the diff would typically go, and the actual image.

@dtipson @thomasbertet @sergeybekrin What do you guys think?

@thomasbertet
Copy link
Contributor

I would be happy with both solutions.

  • Option "b" seems less work and less bug prone... so it might be sufficient, but requires more human eye comparison. That's OK for me :)

  • Option "a" seems smarter but in a case where the page is different from the top of the image, it won't help much to have a comparison, since everything will be red (as you mentioned) ... but might help a lot if only a bottom part of the image has changed (ie. new content at the bottom). Not sure it worth the effort ...

Really happy that we will have something to handle this, every option is better than what we have now 👍 thanks @anescobar1991

@sbekrin
Copy link
Contributor

sbekrin commented Mar 5, 2018

Thanks @anescobar1991 for bringing this up. I agree with @thomasbertet on that option "a" would work best, we're using fork with almost same solution and it works super smooth for us.

@isilweo
Copy link

isilweo commented Mar 5, 2018

in HTML world i don't see option B as usable. Sometimes the images might differ with 1px that comes from some added padding around html element. Human eye won't catch this and in the end we'll have to check differences using some kind of pixel comparison software. So I would be happy with option A.

@anescobar1991
Copy link
Member

anescobar1991 commented Mar 5, 2018

@isilweo In cases like that wouldn't you just accept the image (via updating the snapshot) if there is no human determinable difference? Not ideal I know, but I ask because with option A, the diff that is generated will not be meaningful: It will always highlight almost the entire image as being different (as pixel by pixel they would be different which still wouldn't be helpful in your use case. What do you think?

For that reason I am leaning towards option B, @sergeybekrin I'd like to get your thoughts on this approach, what do you think?

@sbekrin
Copy link
Contributor

sbekrin commented Mar 5, 2018

@anescobar1991 I would say having at least some diff is really valuable, essentially on CI when it's only way to check what's wrong. And diff wouldn't be always that bad actually since most common use-case is spacing changes. Below is worst case scenario:

image

@isilweo
Copy link

isilweo commented Mar 5, 2018

I agree with you that if there is no human determinable difference it shouldn't a problem... but I have some customers who's testers are comparing PSD designs to what they see in browser on pixel to pixel level and I can't do nothing about it....

Maybe it'll be better if I describe my use case a bit more so you will know how we're using it.
Currently we use React Storybook + jest snapshots from rendered react components. Currently my test suite is almost 170 components, if you add different stories to this we have almost 800 rendered snapshots. Now there are some basic components like Button, if you change Button width by 1 pixel it has big impact on other components so let's say that 100 components will have to be checked to see if everything looks ok. This is quite time consuming. Jest snapshots will show difference and you can catch the change if it was unintentional. But if it was intentional you'll have to check every component that had button rendered to inspect if it is ok.

Now having this handled by image snapshots it could be a lot easier - you will end with 100 images to check but it should be quicker to see if everything is correct.
Below you can see output from pixel match and two screenshots, difference is 1px change in horizontal padding. Not whole image is highlighted and you can clearly see what has changed. Of course new empty area will always be highlighted but it also gives you a hing that size has been changed and you even know if it was height change or width

image

@anescobar1991
Copy link
Member

anescobar1991 commented Mar 5, 2018

@isilweo That only works if the new image is smaller than the baseline image it will not work if the opposite is true. Instead pixelmatch will fail to compare and throw a cryptic error message. That is why I throw my own friendly error message instead right now on size differences.

@sergeybekrin on CI you would still have an output image which would show you the baseline image and the actual image so you could still see the changes, albeit there would not be an image with highlighted differences.

The reason I prefer this is that the diff will be misleading a LOT of the time. Not all cases are like what you and @isilweo are seeing and I would rather make it very explicit that the test has failed because the image sizes are different and display the 2 images side by side to allow for a user to decide whether to update the snapshots or not as opposed to spend time creating a diff image that may or may not even be meaningful to a lot of users.

UPDATE: pixelmatch is not what throws the cryptic error. pngjs does. Either way @isilweo @sergeybekrin although today pixelmatch will create a (misleading) diff, in the next release it will not.

@sbekrin
Copy link
Contributor

sbekrin commented Mar 5, 2018

@anescobar1991 yeah I got your idea, but that feels inconsistent and could be confusing for users who expects diff. Depending on tools and workflow you use, you might not be able to say what's actually changed in specific commit.

Having just baseline and received images is fine, but diff would save us lots of time. I believe machines should help us to identify problems and point at it to let us focus on resolving it instead.

@anescobar1991
Copy link
Member

@sergeybekrin I agree with you about the confusion which is why we'd have an image with a message specifying that no diff was created because image size was different. We would also modify the assertion message to specify that it failed because the image dimensions were different. Hopefully that messaging helps.

Depending on tools and workflow you use, you might not be able to say what's actually changed in specific commit.

What do you mean by that?

And I agree that the diff helps but what about the cases where the diff is not meaningful. If a user is not clear that a diff "looks" funny because there is a size difference then they will be confused.

@anescobar1991
Copy link
Member

@sergeybekrin I spent some time investigating today and you are right. I am going to open #42 back up :)

@anescobar1991
Copy link
Member

Resolved with #42! Will publish in next few days!

@anescobar1991
Copy link
Member

Published as part of v2.4.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants