idr0001-graml-sysgro to NGFF #683
convert first plate for testing...
(at 2pm)... Done about 14:15 (approx 45 mins). |
make bucket etc...
|
Looks like we are getting duplicates of each of the 6 acquisitions for each Plate. |
There is a series of issues we might need to review here. For reference, the source data corresponding to the example above is https://ftp.ebi.ac.uk/pub/databases/IDR/idr0001-graml-sysgro/20151116-verified/JL_120731_S6A/. As can be seen above, there are 6 measurement folders, which are interpreted as plate acquisitions using the IDR Flex reader. Looking at the file listing for each measurement, there are 2 fields of view per well and plate acquisition.
In IDR, each well of each plate acquisition contains only 1 field of view, and the fileset only includes the Flex files ending with
If the intent is to create a Zarr dataset matching the current IDR representation, I can think of two options:
|
Let's try
...export status... 5 out of 96 Wells done in 35 mins. 7 mins per Well is 11 hours per Plate! ... completed at 22:08 - ~11.5 hours. Upload...
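The per-Plate estimate above follows from the observed rate. A minimal sketch of the arithmetic, assuming the rate of 7 minutes per Well stays constant across the whole Plate (the function name is illustrative only):

```python
WELLS_PER_PLATE = 96
MINUTES_PER_WELL = 35 / 5  # observed: 5 Wells exported in 35 minutes


def plate_export_hours(wells: int = WELLS_PER_PLATE,
                       minutes_per_well: float = MINUTES_PER_WELL) -> float:
    """Estimated export time for one Plate, in hours."""
    return wells * minutes_per_well / 60


print(f"{plate_export_hours():.1f} hours per Plate")
```

The actual run (~11.5 hours) was close to this estimate.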
Looks good at https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0001/zarr/JL_120731_S6B.ome.zarr/ |
Tried exporting Polygons as labels... but this fails as the Polygons are overlapping:
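To illustrate why overlapping Polygons are a problem for a label image (where each pixel can hold only one label value), here is a small stand-alone sketch. It is not the exporter's code: the function names and the simple ray-casting rasterisation are assumptions made for illustration.

```python
from typing import Dict, Sequence, Set, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: is (x, y) inside the polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_label_conflicts(polygons: Sequence[Sequence[Point]],
                         width: int, height: int) -> Set[Tuple[int, int]]:
    """Return pairs of label ids (1-based) that claim the same pixel."""
    owner: Dict[Tuple[int, int], int] = {}
    conflicts: Set[Tuple[int, int]] = set()
    for label, polygon in enumerate(polygons, start=1):
        for py in range(height):
            for px in range(width):
                # Test the pixel centre
                if point_in_polygon(px + 0.5, py + 0.5, polygon):
                    if (px, py) in owner:
                        conflicts.add((owner[(px, py)], label))
                    else:
                        owner[(px, py)] = label
    return conflicts
```

Any non-empty result means the shapes cannot be written into a single label image without one shape overwriting part of another.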
|
Run mkngff on idr0125-pilot with
as omero-server
http://localhost:1040/webclient/?show=image-1230224 ... Fails with:
|
Error is raised at https://github.com/ome/omero-romio/blob/1d30fafc4e06c5511cfbb24c25a753925ffb2eb4/src/main/java/ome/io/bioformats/BfPixelBuffer.java#L79 |
I suspect the error comes from the underlying reader and is rethrown from the Bio-Formats pixel buffer. Is there more information in the following lines of the stack trace? |
Ah, yes...
|
That line is https://github.com/ome/ZarrReader/blob/main/src/loci/formats/in/ZarrReader.java#L755
The schema states that this is an integer: cc @dgault |
We want to create a copy of the original data without On pilot-zarr1-dev
Conversion took approx 30 mins.
|
This Plate renders OK in webclient 👍 |
Create symlinks...
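A minimal sketch of how such a symlinked copy could be built, assuming the files to leave out are those matching `*2.flex` (as mentioned later in this thread). The function name and the directory arguments are hypothetical; this is not the command that was actually run.

```python
import fnmatch
import os
from pathlib import Path


def symlink_tree(src: Path, dst: Path, exclude: str = "*2.flex") -> int:
    """Mirror the directory tree under src into dst using symlinks,
    skipping any file whose name matches the exclude pattern.
    Returns the number of symlinks created."""
    linked = 0
    for dirpath, _dirnames, filenames in os.walk(src):
        target_dir = dst / Path(dirpath).relative_to(src)
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in filenames:
            if fnmatch.fnmatch(name, exclude):
                continue  # leave the second-field files out of the copy
            link = target_dir / name
            if not link.is_symlink() and not link.exists():
                link.symlink_to(Path(dirpath, name).resolve())
                linked += 1
    return linked


# Example (hypothetical paths):
# symlink_tree(Path("/data/original/plate"), Path("/data/symlinked/plate"))
```

The destination should sit outside the source tree, otherwise the walk would also visit the links being created.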
Count
|
Checking for counts of
|
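For reference, a count like this can be sketched in Python as below. The `*.flex` pattern, the expected count of 96 per folder, and the function names are assumptions for illustration; the original check may have used different commands and patterns.

```python
from collections import Counter
from pathlib import Path
from typing import Dict


def count_files_per_folder(root: Path, pattern: str = "*.flex") -> Counter:
    """Count files matching pattern in each folder below root.
    Folders with no matching files do not appear in the result."""
    counts: Counter = Counter()
    for path in root.rglob(pattern):
        if path.is_file():
            counts[str(path.parent.relative_to(root))] += 1
    return counts


def folders_below(counts: Dict[str, int], expected: int = 96) -> Dict[str, int]:
    """Return only the folders that have fewer files than expected."""
    return {folder: n for folder, n in sorted(counts.items()) if n < expected}


# Example (hypothetical path):
# print(folders_below(count_files_per_folder(Path("/data/idr0001"))))
```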
On pilot-zarr1-dev
|
Try viewing some of the 22 Plates above which have fewer than 96
|
Since none of the Plates with missing .flex files above have successfully been converted by the script running above, screen -S idr0001_test:
Also try it on the original data where the number of
This actually worked and has the correct number of Fields!
Plate has the same number of images (.zattrs) as previous
|
Another Plate that failed with symlinked data (without *2.flex files) - run against original data...
|
5 months later: investigation of a possible Bio-Formats fix at ome/bioformats#3537 finds that this won't be an easy/viable solution. So we need to revive the NGFF conversion work... Summary from first read of history above:
|
If we need to use |
Would be good to convert all plates as above with Plates converted are ~47 GB so we only have space to convert 1 plate on each of
|
Converted data should never end up in the root partition. There is a dedicated
If 5TB is sufficient, let's review and discuss the cleanup of the existing |
Thanks, @sbesson. @will-moore, were you looking to convert e.g. all of idr0001 at once rather than, e.g., moving each plate off to S3? |
Need env for omero cli zarr...
NB: webclient is broken e.g. http://localhost:1080/webclient/
Try to use omero-cli-zarr...
We want to use ome/omero-cli-zarr#147 (not merged yet), so we need to check out that branch etc.
Merged ome/omero-cli-zarr#147 with origin to fix scm version issue... 14:01...
Completed ~ 10:30 pm - 8.5 hours for a Plate... Rename plate, since I didn't use
EDIT... 29th August...
Ahhh! - typo again!!!
Cleanup:
|
Install goofys and mount
All good! |
mkngff...
First plate: Fileset
On pilot-idrngff... Following https://github.com/IDR/mkngff_upgrade_scripts
View image...
Blitz.log
This line of ZarrReader:
|
Edited the plate
Try to delete memo file on
We can now view images for all acquisitions... With new thumbnails generated by saving rendering settings, we see: The next 6 images (B1 - B6) correspond to the 6 Fields from A3 (as seen in vizarr): |
NB: few plates uploaded above:
See #683 (comment) TODO: try mkngff with that plate |
@sbesson pointed out that |
Update OMEZarrReader... on
restarted...
View plate in webclient. 14:08... |
Unfortunately, upgrading ZarrReader didn't fix the Plate/Wells layout issue described above. No change. Issue created at ome/ZarrReader#96 |
As suggested by Seb, let's try to import the zarr plate from scratch from remote s3 at https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0001A/JL_120731_S6A.ome.zarr. Download metadata-only plate to temp location for import...
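One way to work out which files make up a "metadata-only" copy is to walk the plate's JSON metadata. The sketch below assumes the NGFF v0.4 plate layout on Zarr v2 (`.zgroup`/`.zattrs` for groups, `.zarray` for arrays) and ignores labels. The function names are hypothetical and this is not necessarily how the download was done here.

```python
import json
import urllib.request
from typing import Callable, List


def http_fetcher(base_url: str) -> Callable[[str], dict]:
    """Return a function that fetches and parses a JSON key below base_url."""
    def fetch(key: str) -> dict:
        with urllib.request.urlopen(f"{base_url.rstrip('/')}/{key}") as resp:
            return json.load(resp)
    return fetch


def list_metadata_keys(fetch_json: Callable[[str], dict]) -> List[str]:
    """List the metadata keys of a plate: plate, rows, wells, images, arrays."""
    keys = [".zgroup", ".zattrs"]
    seen_rows = set()
    plate = fetch_json(".zattrs")["plate"]
    for well in plate["wells"]:
        well_path = well["path"]  # e.g. "A/1"
        row = well_path.split("/")[0]
        if row not in seen_rows:
            seen_rows.add(row)
            keys.append(f"{row}/.zgroup")
        keys += [f"{well_path}/.zgroup", f"{well_path}/.zattrs"]
        well_attrs = fetch_json(f"{well_path}/.zattrs")
        for image in well_attrs["well"]["images"]:
            image_path = f"{well_path}/{image['path']}"
            keys += [f"{image_path}/.zgroup", f"{image_path}/.zattrs"]
            image_attrs = fetch_json(f"{image_path}/.zattrs")
            for dataset in image_attrs["multiscales"][0]["datasets"]:
                keys.append(f"{image_path}/{dataset['path']}/.zarray")
    return keys


# Example: keys = list_metadata_keys(http_fetcher(
#     "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0001A/JL_120731_S6A.ome.zarr"))
```

Each returned key can then be downloaded to the same relative path under the temp location, leaving out the chunk data.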
Import the data...
Ah, I guess Bio-Formats doesn't recognise a plate, without |
@sbesson @joshmoore What's the easiest way to get Bio-Formats to recognise a plate for import?
|
I am unaware
I don't know what's incorrect with your plate. |
on pilot-idrngff as omero-server
ends with:
/tmp/idr0001_20241004.err
|
I am unable to reproduce #683 (comment). Using the same server and the converted data mounted as described in #683 (comment), I have
|
idr0001 has 192 x 96-Well Plates, 6 acquisitions each.
1 Plate converted below is 47 GB.
Approx 4.5 TB in total.
bioformats2raw took ~30 mins to convert 1 Plate => approx 4 days in total.
NB: The multi-acquisition Flex data (idr0001) needs to be converted because support for it hasn't been ported from IDR to mainline Bio-Formats: ome/bioformats#3537
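The conversion-time estimate above can be reproduced with a few lines. This sketch assumes Plates are converted one at a time at the observed ~30 minutes each; the function name is illustrative only.

```python
PLATES = 192
MINUTES_PER_PLATE = 30  # observed bioformats2raw time for 1 Plate


def total_conversion_days(plates: int = PLATES,
                          minutes_per_plate: float = MINUTES_PER_PLATE) -> float:
    """Total sequential conversion time in days."""
    total_hours = plates * minutes_per_plate / 60
    return total_hours / 24


print(f"{total_conversion_days():.1f} days")
```

Running conversions in parallel on several machines would reduce the wall-clock time accordingly.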