test_examples issue #37

Open
regine709 opened this issue Sep 20, 2020 · 2 comments

Comments

@regine709
Contributor

There are two related issues:

(1) I just realized that test_examples.py does not seem to compare the golden files with the example outputs. For example, if you replace basic_01.png with a different picture under the same name, the tests still pass. I believe there's a typo in the similar_images() function, on line 40 of test_examples.py:
new_image = orig_image.convert('RGB')
should be
new_image = new_image.convert('RGB')

(2) However, after correcting this, I can no longer pass the tests for most of the existing examples. I tried the first two, and my output images are virtually the same as the golden files, except for minor pixel differences around the text that are only visible when I subtract the two images. Some existing examples do pass if I set a larger tolerance in similar_images(), such as 1.0e-2, but not all of them.
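
To make the effect of the one-line fix in (1) concrete, here is a minimal sketch of what a corrected comparison could look like. It is not the actual code from test_examples.py: only the names similar_images, orig_image, new_image and the convert('RGB') calls come from this issue, while the signature, the default tolerance, and the mean-difference metric are assumptions for illustration.

```python
from PIL import Image, ImageChops


def similar_images(orig_image, new_image, tol=1.0e-6):
    """Return True if two PIL images are nearly identical.

    Hypothetical reconstruction: the real function in test_examples.py
    may use a different signature, default tolerance, and metric.
    """
    orig_image = orig_image.convert('RGB')
    # The fix from (1): convert new_image itself. The buggy line assigned
    # orig_image.convert('RGB') here, so the golden file was compared
    # against itself and every test passed.
    new_image = new_image.convert('RGB')

    if orig_image.size != new_image.size:
        return False

    # Per-channel absolute difference between the two images.
    diff = ImageChops.difference(orig_image, new_image)

    # For an RGB image, histogram() returns 3 * 256 counts, one block of
    # 256 per band, so (i % 256) is the difference value of each bin.
    hist = diff.histogram()
    total = sum(count * (i % 256) for i, count in enumerate(hist))

    # Mean absolute difference, scaled to the range 0..1 (assumed metric).
    n_values = orig_image.size[0] * orig_image.size[1] * 3
    return total / (255.0 * n_values) <= tol
```

With this version, comparing a golden image against a copy of itself returns True, while swapping in a visibly different picture of the same size returns False, which is the behaviour the original typo was hiding.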

I wonder if my problem in (2) can be replicated? Thanks

@joferkington
Owner

For the first issue, you're absolutely right, thanks for the catch! It looks like that's been there for quite a while.

For the second, this type of test tends to be a little too sensitive. Differences in underlying libraries (more than just matplotlib, even) can lead to large image differences that are irrelevant in practice (e.g. antialiasing differences). I meant the image comparison tests to be a back-check for inadvertent changes when run on the same setup. Mostly, just "do the tests run", really, which it turns out is all I've been checking. In the past, I've been the only developer running tests, so replicating things on my system was good enough. I knew this would be an issue, but wasn't worried about it.

What I'll change it to check (and what I switched to for image-based tests in other libraries long ago) is both a threshold and a percentage difference, e.g. trigger a failure if more than 10% of pixels differ by more than 50%. Ideally, those settings would be different for every test, though that's impractical in this case.
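
A rough sketch of that kind of check is below. It is not the code that ended up in the library: the function name images_differ, its parameters, and the reading of "differ by more than 50%" as "more than 50% of the full 8-bit range in any channel" are all assumptions made for illustration.

```python
import numpy as np


def images_differ(expected, actual, pixel_threshold=0.5, max_fraction=0.1):
    """Return True if too many pixels differ by too much.

    Hypothetical helper. `expected` and `actual` are 8-bit image arrays
    of shape (height, width) or (height, width, channels).
    """
    # Convert to float first so uint8 subtraction cannot wrap around.
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Images of different sizes are always treated as different.
    if expected.shape != actual.shape:
        return True

    # Absolute difference as a fraction of the full 0..255 range.
    diff = np.abs(expected - actual) / 255.0

    # For colour images, a pixel counts as changed if any one channel
    # exceeds the threshold.
    if diff.ndim == 3:
        diff = diff.max(axis=-1)

    # Fraction of pixels whose difference exceeds the per-pixel threshold.
    fraction_changed = np.mean(diff > pixel_threshold)
    return bool(fraction_changed > max_fraction)
```

The defaults mirror the numbers above (10% of pixels, 50% difference). Small antialiasing differences around text change many pixels only slightly, so they stay under the per-pixel threshold, while a replaced or badly broken plot changes a large fraction of pixels by a large amount and trips the check.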

I should also set up TravisCI or something similar. That wasn't an option when this library was originally developed, but it's trivial to do these days. That will partially alleviate the library-related inconsistencies in image tests, since it's easy to pin library versions, and running the tests centrally means that minor differences between individual setups are less of a concern.

I'll try to address this tonight (or in the next week, anyway). Thanks for noticing this!!

@regine709
Contributor Author

Thanks for your detailed explanations re: the second issue. I'm relieved it's not caused by some improper settings on my machine. Also thanks for planning to address these.
