Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add benchmarking mode to exrmetrics #1956

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

peterurbanec
Copy link
Contributor

This is designed to help with performance regression testing and has been used to gather data for issue #1915.

@peterhillman
Copy link
Contributor

Thanks for this work! This PR is basically adding a completely different mode to exrmetrics. The main features it adds are CSV output, which is nice for scripted systems, and running multiple iterations, but there are a bunch of limitations and differences between the two modes which give me pause.

As far as I can tell:

  • Part types: exrbench mode only tests the scanline API; exrmetrics tests all four part types (scanlineimage, tiledimage, deepscanline, deeptiled). exrmetrics also reads and writes all mipmap levels in a tiled image.
  • Read metric: exrbench tests read performance by rereading the file it’s just written; exrmetrics uses the original source file. I agree exrbench’s approach is a better default. However, because the OS has just written the file, you will get the data back from a fast cache, not from the low level device itself, so you don't account for the delay in reading data. exrmetrics can be used to measure that delay. Codecs which make smaller files will have less read delay, so it's an important statistic. Maybe the tool should be able measure both the original read and the re-read performance.
  • Part: exrbench only tests part 0; exrmetrics tests any specified part
  • Iterations: exrbench is hard coded to run 10 times; exrmetrics runs once.
  • Pixel modes: exrmetrics only offers "original data type" or "half" not "float" (but not both in the same run); exrbench tests both half and float, but not the original type, so you can’t measure INT pixel type, or mixtures of half and float.
  • compression mode: exrmetrics tests one codec; exrbench tests all.
  • Data size: exrbench doesn’t tell you the file size produced by each codec, though you can read that afterwards yourself
  • data output: exrbench uses CSV mode; exrmetrics doesn't
  • destination file: exrmetrics requires a specified file; exrbench writes a collection of files to the current directory, which may be dangerous (and would cause issues if exrbench was run twice simultaneously with the same current working directory)

Also, looking at exrperf that @palemieux has in #1883, there are a couple of things there which could be integrated here: it has data checking to verify a codec is actually lossless, and can read and write to streams instead of files. (Maybe that mode is enabled when no output filename is specified, so there are no temporary files to worry about)

Perhaps we could refactor so there's a single code path between the two modes, and have basically a bunch of flags that switch individual features from the current behavior of exrmetrics to the exrbench behavior.

So exrbench as it currently is implemented would be enabled with flags like this:
--time write,reread --part 0 --pixeltype half,float --compression all --csv --no-filesize --iterations 10
and exrmetrics would be:
--time read,write --part 0 --pixeltype original --compression original --filesize --iterations 1
There could be a --bench flag which is shorthand for setting the flags required for the current exrbench behavior.

I might also have made the top level loop the pixel type, so main() calls the exrbench function multiple times for each pixel type, which then iterates through each compression type, rather than having both float and half types within exrbench. That reduces the amount of memory required as well as making the code simpler. That would also make it possible to specify a single file for all writing. That said, maybe there should just be 'original' and 'half'. exrmetrics doesn't currently offer a test where it converts halfs to floats because I thought that was misleading: the data is still quantized to half, so file size, and potentially performance, isn't a true representation of "real" floating point data.

One completely independent change I was considering was being able to test all parts of a multipart file. That would also allow exrmetrics to convert a file between compression types without loss of data. It would make sense to do that as part of this refactor. (A --convert flag would be equivalent to --time none --part all --pixeltype original --no-filesize --iterations 1)

@peterurbanec if you are keen to make some changes along those lines then go ahead. Otherwise I'm happy to make a start on that myself when I can.

@peterurbanec peterurbanec force-pushed the openexr-benchmark branch 2 times, most recently from 6f2cc3b to d8c737e Compare January 22, 2025 13:35
This is designed to help with performance regression testing and has been
used to gather data for issue AcademySoftwareFoundation#1915.


Signed-off-by: Peter Urbanec <[email protected]>
@peterurbanec
Copy link
Contributor Author

I force-pushed some updates to address failing CI test cases.

@peterhillman Thank you for the very comprehensive write up. There are a lot of great ideas there. My motivation was to create a quick and simple test case for reproducing a slow down that I reported in issue #1915 . I also have a backport of these changes for OpenEXR 3.2.1, again, motivated by the need to compare performance between releases to provide sufficient data for #1915.

Unfortunately, I do not have the time to do much more development on the benchmarking mode. I was hoping that by adding an initial implementation, I would get the ball rolling on making performance regression testing a part of the OpenEXR development cycle. I always anticipated that people more familiar with OpenEXR workings would be able to enhance the tests and integrate them into CI pipelines.

IMHO, it might be worthwhile to start by merging this simple implementation and make enhancements over time. The current implementation is sufficient for getting a rough idea of OpenEXR performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants