
Add image class #515

Open
jo-mueller wants to merge 62 commits into ome:master from jo-mueller:introduce-image-class

Conversation

@jo-mueller
Collaborator

@jo-mueller jo-mueller commented Jan 21, 2026

Hi @will-moore ,

this is an implementation of my idea for a class-based, user-facing API to writing and reading.

Key features

  • Implemented classes: Implements the NgffImage and NgffMultiscales classes (similar to the implementation over at ngff-zarr), which serve as the primary entry points for writing. NgffImage accepts the data to be written as an array and coerces it to dask internally. It requires the axes (e.g., "tczyx") to be passed at instantiation. It also accepts kwargs for pixel sizes (scale), axes units (axes_units) and the name of the image (name), which are later serialized in the ome-zarr metadata.
    The NgffMultiscales then constructs a pyramid using the already existing methods (_build_pyramid) that were implemented in "deprecate scaler class" (#516).
  • Ome-zarr-models-py: Uses ome-zarr-models-py to construct the corresponding metadata classes internally, for simple serialization and de-serialization in the write/read process. Primarily, the coordinate transformation classes and the Multiscales metadata classes are used.
    Importantly, all metadata is internally coerced to ozmp.v05.Multiscales. Only on writing the metadata class is converted to whatever ome-zarr version is desired.
  • Writing: Writing happens through the NgffMultiscales.to_ome_zarr() method. This method makes use of the already existing writing API from "Streamline writing" (#531) (_write_pyramid_to_zarr). It then converts the metadata to the chosen version and uses pydantic's object.model_dump() to create the metadata dictionary. Importantly, the version conversion is only implemented in "implement version converters" (ome-zarr-models/ome-zarr-models-py#398), so this PR is currently blocked by that one.
  • Reading: The implemented NgffMultiscales class also has an attached from_ome_zarr(...) classmethod. The argument is simply the path/group of the ome-zarr image. The function then reads the metadata and the multiscales as dask arrays and returns an instance of NgffMultiscales. The version is automatically detected and again coerced to ozmp.v05.Multiscales internally.
  • Labels: Writing labels can be done by converting them to instances of NgffMultiscales and passing them as a single image or as a dict(str, NgffMultiscales) to the to_ome_zarr writer function. I have yet to implement the same functionality for the reading.

All in all, I think the to_ome_zarr and from_ome_zarr methods in particular are super convenient. I have written a follow-up implementation of the scene metadata from 0.6, and making use of the same API there makes a lot of sense. We could think of similar entry points for writing HCS layouts.

Further considerations

  • Endorse ome-zarr-models-py here or implement own model after all?
  • Keep the current API? Since the currently existing endpoints now pretty much enter the same functionality - most of the magic is happening in _build_pyramid and write_pyramid_to_zarr - I see no harm in keeping the currently existing API around if it works for people. Maybe we'd have to update the write_image function so that it would also accept an instance of NgffImage, otherwise it may be confusing? I'm not so decided on the reading though.

TODOs:

@codecov

codecov Bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 90.75145% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.93%. Comparing base (2f92fce) to head (061271f).
⚠️ Report is 3 commits behind head on master.

Files with missing lines      Patch %   Lines
ome_zarr/classes/image.py     90.18%    26 Missing ⚠️
ome_zarr/utils.py             87.09%    4 Missing ⚠️
ome_zarr/axes.py              80.00%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #515      +/-   ##
==========================================
+ Coverage   85.19%   85.93%   +0.74%     
==========================================
  Files          14       16       +2     
  Lines        1884     2211     +327     
==========================================
+ Hits         1605     1900     +295     
- Misses        279      311      +32     


@will-moore
Member

@jo-mueller Thanks for that - This looks like a nice approach to separate the metadata creation and manipulation from the writing to zarr.

Comparing APIs:

  • This PR:
i = Image(data=my_array, dims=["c", "z", "y", "x"], scale_factors=[2, 4], scale=[1, 0.5, 0.3, 0.3], axes_units=[None, "micrometer", "micrometer", "micrometer"], name="my image")
i.to_ome_zarr("my_image.zarr", version="0.4")
  • ngff-zarr
image = nz.to_ngff_image(data, dims=['y', 'x'], scale={'y': 1.0, 'x': 1.0}, translation={'y': 0.0, 'x': 0.0})
multiscales = nz.to_multiscales(image, scale_factors=[2,4], chunks=64)
nz.to_ngff_zarr('lightsheet.ome.zarr', multiscales, chunks_per_shard=2, version="0.4")

Various comments, questions. I realise some of this is just not implemented yet... And I haven't tried the code (which might answer some of these)...

  • Spec(ABC) class isn't used/needed
  • to_zarr() should take an existing root OR string
  • No code for reading ome-zarrs yet
  • The current Scaler doesn't downsample in z in most cases (existing issue). Might be best to calculate the scales for each Dataset from the shape of each new_image
  • axes_units isn't used, nor is labels
  • metadata writing to_zarr is missing currently
  • "version" is ignored. How do we handle versions / converting e.g. read v0.4 and write v0.5?
  • No translation added. See code to add this at 492
  • da.to_zarr() needs to specify dimension_names for zarr v3 arrays, e.g. fixed in "fix recursion error if __store.fs.protocol is a tuple" (#511)
  • Where do we specify chunks, shards, compressors and other options to pass down to zarr-python?
  • support for omero metadata? Yaozarrs has _omero but it's not exposed very much
  • I think we could drop support for writing v0.1, v0.2 and v0.3, but what about reading them?
  • Yaozarrs is slightly less verbose than ome-zarr-models-py, but generally similar
  • If your data is a 5D array shape = (50, 3, 100, 1024, 1024) and your dims/scale were 4D, e.g. dims = ["c", "z", "y", "x"] would you get any validation errors?
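
The last question above could be answered by an explicit check at construction time. The following is only a sketch of such a check, not what the PR does: the helper name validate_dims and its signature are hypothetical, and it works on a plain shape tuple so that it does not depend on any array library.

```python
def validate_dims(shape, dims, scale=None):
    """Hypothetical check: one dim name (and one scale value) per array axis."""
    if len(dims) != len(shape):
        raise ValueError(
            f"data has {len(shape)} dimensions but {len(dims)} dims "
            f"were given: {list(dims)}"
        )
    if scale is not None and len(scale) != len(dims):
        raise ValueError(
            f"scale has {len(scale)} entries but {len(dims)} dims were given"
        )


# The 5D shape and 4D dims from the question above.
shape = (50, 3, 100, 1024, 1024)
dims = ["c", "z", "y", "x"]

try:
    validate_dims(shape, dims, scale=[1, 0.5, 0.3, 0.3])
    message = None
except ValueError as err:
    message = str(err)  # the mismatch is reported instead of silently written
```

With a check like this, the 5D/4D mismatch would fail before any data or metadata is written.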

@jo-mueller
Collaborator Author

@will-moore thanks for the breakdown. I was aware of some of these points (not all) but decided to send it anyway to not go too far in the wrong direction in case there were strong objections to the approach.

Actually, what you wrote is an excellent to-do list :)

@jo-mueller
Collaborator Author

@will-moore I think this is taking a bit more shape towards how I'd expect it. But before this can continue, I think there is value in first discussing the current duplication of functionality between these two functions:

  • write_image: calls _build_pyramid under the hood and uses _write_dask_image to serialize the data and metadata to disk
  • write_multiscales: requires an up-front call to _build_pyramid and then serializes data to disk, uses write_multiscales_metadata to write metadata to disk

There's a lot of overlap between these two functions which I think can be condensed so we'd have a single place where

  • metadata is created
  • format parsing is happening

Anyway, I just tried the Image.to_ome_zarr with a large image (3 x 15k x 15k) and I'm getting very decent writing speeds!

@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/separate-tiles-to-ome-zarr/109071/55

@will-moore
Member

@jo-mueller Could you update the description to reflect where this is heading now?

Are we planning to keep all the existing write methods (any API changes)?

@jo-mueller jo-mueller force-pushed the introduce-image-class branch 2 times, most recently from 2b38492 to c67569d Compare March 5, 2026 09:00
@will-moore
Member

It would be nice to support writing of "omero" metadata. I think that's covered by ome-zarr-models-py too.
Could be a follow-up PR?

@jo-mueller
Collaborator Author

jo-mueller commented Mar 6, 2026

@will-moore

Could be a follow-up PR?

Agree.

where this is heading now?
Are we planning to keep all the existing write methods (any API changes)?

THAT is a good question, and one I'm not entirely sure about myself. I guess if we want to go further this way, we would ultimately deprecate write_image and its siblings in favor of the class-based API. As an intermediate step, we could make these functions do something like this under the hood:

def write_image(*args, **kwargs):
    # Build the class-based objects, then delegate writing to them
    image = NgffImage(*args, **kwargs)
    multiscales = NgffMultiscales(image, ...)
    multiscales.to_ome_zarr(...)

which would at least reduce the amount of code to maintain and make sure that everything we do on the class-based API side is covered well by the already existing tests.

What's missing here

The only thing that makes tests fail here currently is this one: ome-zarr-models/ome-zarr-models-py#398. Locally, all tests are passing.

Also, note that this branch has been rebased on #544, so that'll have to go in first, too.

@jo-mueller jo-mueller changed the title WIP: add image class Add image class Mar 6, 2026
@jo-mueller jo-mueller marked this pull request as ready for review March 6, 2026 16:42
@jo-mueller jo-mueller force-pushed the introduce-image-class branch from d317e3b to bcdba78 Compare March 10, 2026 10:53
@jo-mueller
Collaborator Author

@will-moore to go forward with this one, my idea for a soft transition would be this:

Step 1: Refactor - I am just now trying to see if I can get the existing entrypoints (write_image, etc) to convert the passed data to instances of the introduced classes under the hood. For this, I am adding some of the relevant arguments as explicit keywords (i.e., for name, axes_units) which are passed to the NgffImage class where they are internally validated by ozmp.
This doesn't affect ongoing refactorings regarding sharding, because the central writing logic is still in the same place (_write_pyramid_to_zarr), the classes just wrap around this function.
Step 2: Expose - After #548 is merged, this would be a good opportunity to expose the classes more to the outside. Right now, I am holding back on extensive documentation while there are pending changes to the structure of the docs :)

Comment thread ome_zarr/image.py Outdated
Comment thread ome_zarr/classes/image.py
Comment thread ome_zarr/classes/image.py Outdated
@will-moore
Member

will-moore commented Apr 8, 2026

from ome_zarr import NgffMultiscales
img_path = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.1/1884807.zarr"
ms = NgffMultiscales.from_ome_zarr(img_path)
out_path = "test_image_class_1884807_05.zarr"
ms.to_ome_zarr(out_path, version="0.5")

This writes an invalid image because the zarr.json has datasets[0].path: "0" but the array is written to path s0.

EDIT: Also the omero metadata is not preserved into the output image
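
A mismatch like the one reported above (metadata path "0", array written to s0) can be caught by comparing the dataset paths in the multiscales metadata against the array names that actually exist in the group. The sketch below uses plain dicts and lists as stand-ins for the parsed metadata and the group listing; the helper name missing_dataset_paths is hypothetical and not part of ome-zarr-py.

```python
def missing_dataset_paths(multiscales, array_keys):
    """Return metadata dataset paths that have no matching array in the group."""
    available = set(array_keys)
    return [
        dataset["path"]
        for multiscale in multiscales
        for dataset in multiscale["datasets"]
        if dataset["path"] not in available
    ]


# Stand-ins for the situation described above.
metadata = [{"datasets": [{"path": "0"}]}]  # what the zarr.json declares
written_arrays = ["s0"]                     # what was written to disk

missing = missing_dataset_paths(metadata, written_arrays)
```

An empty result would mean every declared dataset path resolves to an array; here the declared path "0" is reported as missing.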

@will-moore
Member

img_path = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0076A/10501752.zarr"
NgffMultiscales.from_ome_zarr(img_path)
  File "/Users/wmoore/Desktop/ZARR/ome-zarr-py/ome_zarr/classes/image.py", line 547, in from_ome_zarr
    raise ValueError(f"Unsupported OME-Zarr version: {version}")
ValueError: Unsupported OME-Zarr version: 0.4.0

This image has .zattrs with an "unexpected" version:

    "_creator": {
        "name": "omero-zarr",
        "version": "0.4.0"
    },
    "multiscales": [
        { "version": "0.4"...

so the version lookup needs to be a bit more specific
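
One way to make the lookup more specific is to read the version only from the locations where the OME-Zarr metadata itself stores it, and to ignore unrelated keys such as _creator.version. The sketch below assumes the attributes are available as a plain dict; detect_ngff_version is a hypothetical helper, not the function used in this PR. It checks the top-level "ome" key (the v0.5 layout) first and falls back to multiscales[0].version (the v0.4-style layout shown above).

```python
def detect_ngff_version(attrs):
    """Return the OME-Zarr version from known metadata locations, or None."""
    # v0.5 nests all metadata under an "ome" key that carries the version.
    ome = attrs.get("ome")
    if isinstance(ome, dict) and "version" in ome:
        return ome["version"]
    # Earlier layouts store the version on each multiscales entry.
    multiscales = attrs.get("multiscales") or []
    if multiscales and "version" in multiscales[0]:
        return multiscales[0]["version"]
    return None


# Attributes shaped like the .zattrs excerpt above.
attrs = {
    "_creator": {"name": "omero-zarr", "version": "0.4.0"},
    "multiscales": [{"version": "0.4"}],
}
version = detect_ngff_version(attrs)
```

Because the _creator block is never consulted, the tool version "0.4.0" cannot be mistaken for the format version.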

@jo-mueller jo-mueller force-pushed the introduce-image-class branch from 5c7fd90 to 7af8ca5 Compare April 30, 2026 13:43
@jo-mueller
Collaborator Author

@will-moore I added a bunch more documentation on how handling labels works in the context of class-based image handling. I think this is coming together nicely :) There are some pre-commit issues to clean up, but otherwise I think this is good to go. I can add some more documentation on omero metadata handling, but that doesn't have to be part of this PR (it's big enough as it is).

"# Adding labels to ome-zarrs\n",
"(advanced:labels:adding_labels)=\n",
"\n",
"Unlinke demonstrated in the [basic labels example](basic:labels), in many scenarios, it is nt desirable to write a complete structure consisting of an ome-zarr image and some labels at once.\n",
Member

Typos "Unlinke" and "nt"

@will-moore
Member

Docs error on basic/write_image.html

[Screenshot 2026-05-05 at 15 51 52]

"source": [
"## Read OME-ZARR images: Legacy API\n",
"\n",
"The code below here demonstrates the legacy API for reading OME-ZARR images.\n",
Member

Is this really a "legacy" API?
I think that write_image() is very handy and shouldn't be deprecated or considered legacy.

The "Customizing the pyramid" examples all use write_image(), and it's not described as legacy there.

"source": [
"from ome_zarr import NgffMultiscales\n",
"\n",
"url = \"https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr\"\n",
Member

All uk1s3.embassy URLs will need to be updated to livingobjects e.g.
https://livingobjects.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr since the storage is being migrated. uk1s3.embassy will be decommissioned soon.
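
For updating the docs in bulk, the host swap described above can be done mechanically. This is a minimal sketch that assumes only the host name changes and the rest of the path stays the same, as in the example URL given; migrate_url is a hypothetical helper.

```python
OLD_HOST = "uk1s3.embassy.ebi.ac.uk"
NEW_HOST = "livingobjects.ebi.ac.uk"


def migrate_url(url):
    """Swap the old storage host for the new one, leaving the path untouched."""
    return url.replace(f"://{OLD_HOST}/", f"://{NEW_HOST}/", 1)


old_url = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr"
new_url = migrate_url(old_url)
```

URLs that do not point at the old host are returned unchanged, so the helper can be applied to every link in a notebook.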

Labels

enhancement New feature or request
