-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve render test compare performance #2223
base: main
Are you sure you want to change the base?
Conversation
Does fellow ASWF project OpenImageIO perform poorly compared to those two? |
That's a good question. I did a small experiment and OIIO is also about 3x slower for performing the compare at least. About 7 seconds so better than PIL + utils (slowest). This with the caveat I'm no OIIO or OpenCV expert :). This can all be updated with different options depending on image size or format (e.g. OIIO can handle EXRs -- which is not the current output format as the RTS avoids dependencies on OpenEXR by default). import OpenImageIO as oiio
import numpy as np
import os
def computeDiff(image1Path, image2Path, imageDiffPath):
try:
# Remove the diff image if it already exists
if os.path.exists(imageDiffPath):
os.remove(imageDiffPath)
# Check if input images exist
if not os.path.exists(image1Path):
print("Image diff input missing: " + image1Path)
return
if not os.path.exists(image2Path):
print("Image diff input missing: " + image2Path)
return
# Load images using OpenImageIO (no reshaping, directly as NumPy arrays)
image1 = oiio.ImageInput.open(image1Path)
image2 = oiio.ImageInput.open(image2Path)
if not image1 or not image2:
print("Failed to load images.")
return
# Read the image data in a single call (as float)
pixels1 = image1.read_image(format=oiio.FLOAT)
pixels2 = image2.read_image(format=oiio.FLOAT)
image1.close()
image2.close()
# Directly access the image specifications
spec1 = image1.spec()
spec2 = image2.spec()
# Ensure images have the same dimensions
if spec1.width != spec2.width or spec1.height != spec2.height:
print("Image dimensions do not match.")
return
# Slice the arrays to ignore the alpha channel (keep only RGB)
pixels1_rgb = pixels1[:, :, :3] # Take the first 3 channels (RGB)
pixels2_rgb = pixels2[:, :, :3] # Take the first 3 channels (RGB)
# Compute the absolute difference (using vectorized NumPy operations)
diff = np.abs(pixels1_rgb - pixels2_rgb)
# Save the difference image
out_spec = oiio.ImageSpec(spec1.width, spec1.height, 3, oiio.FLOAT)
out_image = oiio.ImageOutput.create(imageDiffPath)
if out_image:
out_image.open(imageDiffPath, out_spec)
out_image.write_image(diff)
out_image.close()
else:
print("Failed to write difference image.")
# Compute normalized RMS error
normalized_rms = np.sqrt(np.mean(diff ** 2)) / (3.0 * 1.0)
return normalized_rms
except Exception as e:
print(f"Error during image comparison: {e}")
return |
I'll try a couple experiments on my end to see where the time is going. What's the resolution, format, and number of channels of the images you're comparing? |
@kwokcb @lgritz From my perspective, as long as OpenImageIO generates the same results as PIL, it's purely a bonus if it's also faster, since we'd be replacing an external dependency with an ASWF dependency! Maybe other teams would feel differently, but after spending over an hour to render out reference-quality GLSL and OSL images, the additional 7.5 seconds to generate difference images isn't at the top of my mind. :D |
Are there a couple of these images I can try on my end (without having to build all of MaterialX and run a full testsuite) that I can use as a proxy to investigate performance? I find it hard to believe I can't speed it up further if I have a concrete example. |
Side notes:
|
Here are 2 pairs. The subset has 121 pairs to compare. |
Replace OpenCV with OIIO. Leaving OpenCV implementation for comparison / posterity def computeDiff(image1Path, image2Path, imageDiffPath):
try:
# Remove the diff image if it already exists
if os.path.exists(imageDiffPath):
os.remove(imageDiffPath)
# Check if input images exist
if not os.path.exists(image1Path):
print("Image diff input missing: " + image1Path)
return
if not os.path.exists(image2Path):
print("Image diff input missing: " + image2Path)
return
# Load images using OpenCV
image1 = cv2.imread(image1Path)
image2 = cv2.imread(image2Path)
# Ensure images were loaded successfully
if image1 is None or image2 is None:
print("Failed to load images.")
return
# Convert images to RGB (OpenCV loads images in BGR by default)
image1 = cv2.cvtColor(image1, cv2.COLOR_BGR2RGB)
image2 = cv2.cvtColor(image2, cv2.COLOR_BGR2RGB)
# Compute the absolute difference between the two images
diff = cv2.absdiff(image1, image2)
# Save the difference image
cv2.imwrite(imageDiffPath, cv2.cvtColor(diff, cv2.COLOR_RGB2BGR))
# Compute normalized RMS error
normalized_rms = np.sqrt(np.mean(diff**2)) / (3.0 * 255.0)
return normalized_rms
except Exception as e:
# Clean up and print error message
if os.path.exists(imageDiffPath):
os.remove(imageDiffPath)
print("Failed to create image diff between: " + image1Path + ", " + image2Path)
print(f"Error: {e}") |
@kwokcb have you seen Flip? https://github.com/NVlabs/flip We use this for our image diffs using YardVFX RenderBenchmarks and Rez. |
So you're reading in two PNG images? What are you writing? Also PNG? |
Please don't merge this yet, I will have some important review comments... |
OK, one part of why OIIO is around 3.5x slower is that you're inadvertently asking it to do a lot of extra copies and data conversions. The input files are 8 bit channels, they're staying that way in your OpenCV code (don't know about the original PIL, but I suspect so), but for OIIO, you were saying:
you could instead just let it stay in the native format:
and that saves some time. (Note, because now the values you pull into NumPy will be uint8, you need to change the normalization factor back to If you comment out the image write entirely, just to see how the reads compare, after the above changes, OIIO reading from PNG is within 1.2x the speed of OpenCV. So most of the remaining difference is specifically the writing of PNG, which honestly is not something we've spent much time optimizing in OIIO land, as it is distinctly not a "VFX format" (don't get me started). The first thing I spotted is that when you were writing, you said
which is requesting that you write a float file, but PNG doesn't have float, so it will pick the next best thing, which is uint16, but again that is not apples-to-apples with the OpenCV or PIL code which is writing 8 bit PNG files. So instead I would just advise outputting the diff image in the same data type that you read from the original input:
At this point, with the writing enabled again, OIIO takes 2.5x the time of OpenCV (versus ~3.5x when we started), and we know that most of that difference is the PNG write. Like I said, PNG write isn't the most optimized path we had, so just for fun, I changed the name of the output file to end in ".tif" instead of ".png". Can we write TIFF files (same 8 bit data) faster than PNG? Boy howdy, can we! Outputting .tif instead makes the OIIO version run in 0.94x the time that it takes OpenCV. Is that just an inherent difference between TIFF and PNG, and now I'm hobbling OpenCV by having it write PNG while OIIO writes TIFF? Nope, change the OpenCV version of the script to output a TIFF file and OIIO's overall edge improves even more to 0.92x the time of OpenCV for all the same work. YMMV on the exact timing. I'm using OIIO main (which doesn't currently differ in any important way from the supported 3.0 release) and running on a 2020 era 8 core Intel-based MacBookPro. I was reading your "Queen" sample files, and just doing your comparison subroutine 100 times in a loop. |
But wait, there's more. This may be more complicated than it's worth, but for these particular files, the GLSL ones anyway are 4-channel PNG files. But alpha = 1.0 everywhere. I happen to know that since PNG files are by specification unassociated alpha, and OIIO is automatically converting them to associated alpha on the way in, for the particular case of alpha=1.0 everywhere, that's a no-op, but OIIO's PNG reader has no way to know this ahead of time. I betcha that OpenCV doesn't know anything about associated vs unassociated alpha and isn't doing this extra conversion that OIIO is doing (needlessly, in this case). So... there's a special trick to tell OIIO's PNG reader not to convert to associated alpha:
And with this change, now OIIO does the whole read-diff-write cycle (to a TIFF output) in 0.86x the time it takes OpenCV to do the same. So this full set of changes moves us from an original 3.5x slower to 15% faster. (Or, if we really insist on outputting PNG instead of TIFF, OIIO is still 2.5x slower until we actually crack open OIIO's PNG writing code and see what is wasteful there.) |
One more! Back to PNG, but just checking if maybe we're not using the best compression choice. (I have no idea what OpenCV is using.) Instead of using the default compression, we use "pngfast", which is just a different speed setting on the underlying zlib calls,
that closes the gap to 1.35x even when outputting PNG files. This is just further fooling around, I don't know if switching to this faster compression has any important change in the file sizes or anything else that will matter, and the degree to which it matters might depend on the content of the files. But hey, another set of tradeoffs to consider. |
Thanks for the deep analysis, @lgritz! For this use case, I'm guessing PNG was selected because the source and difference images are being organized into an HTML page for sharing/publication, and PNG is a lossless image format that browsers natively understand. We're not married to that choice, though, and any other lossless image format that browsers can handle would work in its place. I'll pause on integrating this change until @kwokcb has had a chance to take a look at the optimizations that you've suggested, and my hope is that OpenImageIO can be the path forward for this workflow. |
Yes, for web display and wide browser support, PNG makes total sense. It's a little weird as an image comparison format, simply because it's LDR, so the clipping of any values > 1 could mask differences between the rendered images if you have hot highlights or something like that. Anyway, to summarize what I've found, you can get from about 3.5x longer than OpenCV to only 1.35x longer with the combination of:
If you're moving in this direction and 1.35x isn't close enough, I can also look at OIIO's PNG writer and see if there's some fat to squeeze out. |
Proposed Changes
Use OpenCV to to compare performance. It's about 3x or more faster based on local tests
of OSL vs GLSL RTS. ( ~2.3 vs 7.5 seconds ).
OpenCV is C++ based so is generally faster. Also addition utilities could be added as desired
using this library.