-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend precompressed support to read DICOM and write TIFF/OME-TIFF #4190
Extend precompressed support to read DICOM and write TIFF/OME-TIFF #4190
Conversation
Only "plain" TIFF, OME-TIFF not supported yet.
Instead of requiring DICOM output and a pyramid. This makes sure that the smaller resolutions of a pyramid are written tile-wise when writing OME-TIFF.
Different resolutions may have different tile sizes.
Mostly this reuses logic in the basic TIFF writer.
…icom-reader-precompressed
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/bfconverts-precompressed-option-for-ome-tif-output/96519/2 |
…ression type automatically This doesn't yet completely support mixed compression types within the same dataset, but something like: $ bfconvert -noflat -precompressed CMU-1.svs CMU-1.dcm should run without error and convert the pyramid without recompressing.
…icom-reader-precompressed
Always perform tile-wise conversion if precompressed conversion was requested, independent of the format and presence of image pyramid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the spirit of #3992 and #4181, used CMU-1.svs
and CMU-1.npdi
as the source files for conversion.
For both files, the bfconvert
utility with this PR included was executed:
- using both OME-TIFF and DICOM as the target format
- using
-compression JPEG
,-compression JPEG-2000
and-precompressed
as the options
Additionally the two DICOM datasets generated with the JPEG and JPEG-2000 compression from CMU-1.ndpi were used as the source of a conversion from DICOM to OME-TIFF using the same three compression options as above.
Source file | Target | Precompressed | Compression | Conversion time | File size |
---|---|---|---|---|---|
CMU-1.svs | DICOM | no | JPEG | 229.057s | 109M |
CMU-1.svs | DICOM | no | JPEG-2000 | 682.503 | 539M |
CMU-1.svs | DICOM | yes | 83.013s | 177M | |
CMU-1.svs | OME-TIFF | no | JPEG | 152.064s | 247M |
CMU-1.svs | OME-TIFF | no | JPEG-2000 | 518.381s | 735M |
CMU-1.svs | OME-TIFF | yes | 27.083s | 176M | |
CMU-1.ndpi | DICOM | no | JPEG | 113.465s | 127M |
CMU-1.ndpi | DICOM | no | JPEG-2000 | 640.609s | 1.2G |
CMU-1.ndpi | DICOM | yes | 16.811s | 191M | |
CMU-1.ndpi | OME-TIFF | no | JPEG | 111.965s | 127M |
CMU-1.ndpi | OME-TIFF | no | JPEG-2000 | 620.038s | 1.3G |
CMU-1.ndpi | OME-TIFF | yes | 6.852s | 191M | |
CMU-1_0_0.dcm | OME-TIFF | no | JPEG | 137.521s | 127M |
CMU-1_0_0.dcm | OME-TIFF | no | JPEG-2000 | 1150.285s | 1.3G |
CMU-1_0_0.dcm (JPEG) | OME-TIFF | yes | 6.654s | 127M | |
CMU-1_0_0.dcm (JPEG-2000) | OME-TIFF | yes | 16.244s | 1.2G |
Overall all the conversion times and sizes are consistent with the previous measurements made during the work on reading precompressed SVS and NDPI files. The speed-up associated with the -precompressed
flag is comparable when writing OME-TIFF and the absolute values are typically smaller.
Trying to validate the binary data in a viewer, all converted images were found to look good except for the two OME-TIFF files created from the DICOM datasets without -precompressed
where the tiles look to be incorrectly ordered
Once the above is addressed, I would re-run the tests above for consistency and I would propose to carry a few additional functional tests including:
- converting to TIFF rather than OME-TIFF (small images)
- converting to other supported formats
- mixing
-precompressed
/-compression
flags - for TIFF/OME-TIFF, testing a few other compression flags
@@ -832,6 +842,12 @@ else if (saveTileWidth > 0 && saveTileHeight > 0) { | |||
} | |||
} | |||
|
|||
if (precompressed && FormatTools.canUsePrecompressedTiles(reader, writer, writer.getSeries(), writer.getResolution())) { | |||
if (getReaderCodecName().startsWith("JPEG")) { | |||
writer.setInterleaved(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to special case here or should we ensure that reader.getInterleaved
returns true appropriately so that
writer.setInterleaved(reader.isInterleaved() && !autoscale);
works correctly above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SVSReader
was the particular issue here; my concern was that swapping the return value of reader.isInterleaved
is a potentially a lot more invasive. When that value changes, what openBytes
returns has to change, which means things like pixels hashes change, etc.
// the conversion must then be performed tile-wise to match the tile sizes, | ||
// even if precompression doesn't end up being possible | ||
if (precompressed) { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely good to see instanceof
checks removed. Are there specific tests that should be carried out here if the target format is neither DICOM not TIFF/OME-TIFF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 29e8c05.
It's probably worth a couple of tests, maybe with fake data, for things like ICS. Keep in mind most writers do not allow tile-wise conversion at all (e.g. https://github.com/ome/bioformats/blob/develop/components/formats-bsd/src/loci/formats/out/APNGWriter.java#L84), and DICOM/TIFF/OME-TIFF are the only ones that would support pyramids.
Currently investigating, but it looks like this issue is reproducible if the same DICOM -> OME-TIFF conversion is done with develop. It's still worth fixing here, but may not have been introduced by this PR. |
I believe 29e8c05 fixes the scrambled tiles issue. As noted above, I could reproduce the problem both with this PR and with develop alone, using something like:
However, specifying the tile sizes explicitly:
no longer showed the problem. With 29e8c05, I would expect either command to result in a correct OME-TIFF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reproducing the tests from #4190 (review) with the last commit, the issue raised when converting from DICOM to OME-TIFF without precompression are now addressed
Timings and file sizes are consistent with the table of the last review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the last review, performed a few additional conversion using different input/output formats (fake, ics, tiff) with different source compression and tile/compression options.
The first set of conversion used a fake file with XY dimensions 2000x2000, 10 focal points and 2 channels. All the conversions using TIFF as the source format used one of the TIFF files generated from the fake files
Source format | Source compression | Target format | Precompressed | Target tile size | Target compression | Conversion time | File size |
---|---|---|---|---|---|---|---|
Fake | Uncompressed | ICS | no | 1.143 | 197M | ||
Fake | Uncompressed | ICS | no | 512x512 | 1.105s | 197M | |
Fake | Uncompressed | ICS | yes | Failed | |||
Fake | Uncompressed | TIFF | no | Uncompressed | 2.558s | 130M | |
Fake | Uncompressed | TIFF | no | zlib | 5.001s | 18M | |
Fake | Uncompressed | TIFF | no | 512x512 | Uncompressed | 3.222s | 81M |
Fake | Uncompressed | TIFF | no | 512x512 | zlib | 3.21s | 448K |
Fake | Uncompressed | TIFF | no | 512x512 | JPEG | 3.889s | 2.5M |
Fake | Uncompressed | TIFF | no | 512x512 | JPEG-2000 | 7.178s | 448K |
TIFF | zlib | ICS | no | 1.952s | 77M | ||
TIFF | zlib | ICS | no | 512x512 | 1.065s | 105M | |
TIFF | zlib | ICS | yes | 1.074s | 105M | ||
TIFF | zlib | TIFF | no | Uncompressed | 2.436s | 77M | |
TIFF | zlib | TIFF | yes | failed | |||
TIFF | zlib | TIFF | no | 512x512 | JPEG | 3.724s | 2.5M |
TIFF | JPEG | TIFF | no | Uncompressed | 2.494s | 77M | |
TIFF | JPEG | TIFF | yes | failed | |||
TIFF | JPEG | TIFF | no | 512x512 | zlib | 3.26s | 546K |
The conversion with a -precompressed
flag failed with
java.lang.UnsupportedOperationException: Reader does not support pre-compressed tile access
which I believe is the expectation.
All generated files were validated visually with all planes/tiles being available for all coordinates.
From my side this is ready for inclusion in the next Bio-Formats release. Given the nature of the changes, this is definitely not amenable for a patch release but could be included in a release incrementing the minor or major version to communicate the backwards-compatible API additions.
In terms of documentation, I can see a few follow-up tasks including:
- update the
bfconvert
utility help to list the formats/codecs supporting pre-compressed tiles like we do for compression - update the user/technical documentation especially https://bio-formats.readthedocs.io/en/latest/users/comlinetools/conversion.html to remove the requirement to specify the compression format and list the additional reader/writers/codecs supporting the
-precompressed
flag
In the mid-term, I wonder whether we'll want to maintain a table of format/codecs that support precompressed reading in reading/writing mode separately.
Merging to fix conflicts with #4164 and unblock the performance work. |
Opening as a draft for now, as #4181 is higher priority. This may need some updates once #4181 is merged.
I would expect basic conversions such as
bfconvert -noflat -precompressed -compression JPEG input.dcm output.ome.tiff
to work, where the input DICOM is part of a whole slide dataset. Tile sizes should be correctly picked up without being specified in thebfconvert
options.A few outstanding things related to this PR and #4181:
-precompressed
does still require the-compression
option to be set correctly; removing that requirement could be done here/cc @fedorov, @dclunie