Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add image-label convenience functions #178

Merged
merged 13 commits into from
Mar 24, 2022

Conversation

constantinpape
Copy link
Contributor

@constantinpape constantinpape commented Mar 17, 2022

Fixes #176 and #171

This PR:

  • unifies **metadata across write_image, write_multiscale and write_multiscales_metadata
  • introduces new convenience function write_multiscale_image_labels to create image-label data.

For a usage example see https://github.com/ome/ome-ngff-prototypes/blob/update-ome-zarr-py/workflows/spatial-transcriptomics-example/convert_transcriptomics_data_to_ngff.py.

Still TODO:

  • Add write_image_labels (analogue to write_image) that uses write_multiscale_image_labels internally (leave this for a follow-up todo and also clean up write_image ...)
  • Add tests

I will address the TODOs if we decide to go forward with these changes.
Side-note: some confusing / unclear things I noticed in the image-labels spec while working on this.

@constantinpape constantinpape changed the title More convenience Add image-label convenience functions Mar 17, 2022
@codecov
Copy link

codecov bot commented Mar 17, 2022

Codecov Report

Merging #178 (b11207c) into master (b5f36c3) will increase coverage by 0.09%.
The diff coverage is 88.88%.

@@            Coverage Diff             @@
##           master     #178      +/-   ##
==========================================
+ Coverage   83.99%   84.08%   +0.09%     
==========================================
  Files          12       12              
  Lines        1362     1376      +14     
==========================================
+ Hits         1144     1157      +13     
- Misses        218      219       +1     
Impacted Files Coverage Δ
ome_zarr/writer.py 97.19% <88.88%> (-0.98%) ⬇️
ome_zarr/reader.py 85.45% <0.00%> (+0.25%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b5f36c3...b11207c. Read the comment docs.

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That's a great starting point to introducing some functionalities allowing to write label images. A couple of inline design questions mostly about the new label writing.

Thinking of other use case, @will-moore we also have some code in omero-cli-zarr writing label images from OMERO masks. Would the new API be flexible enough to e.g. delegate some of the label creation logic to ome_zarr.writer?

As suggested by the codecov report, I think we want to introduce some unit tests covering the semantics of the new API and testing the metadata is correctly stored.

ome_zarr/writer.py Outdated Show resolved Hide resolved
ome_zarr/writer.py Outdated Show resolved Hide resolved
ome_zarr/writer.py Outdated Show resolved Hide resolved
ome_zarr/writer.py Show resolved Hide resolved
ome_zarr/writer.py Outdated Show resolved Hide resolved
@constantinpape constantinpape requested a review from sbesson March 18, 2022 16:36
@constantinpape
Copy link
Contributor Author

@sbesson I have changed a few things based on your review:

  • rename write_multiscale_image_labels -> write_multiscale_labels
  • change its function signature so that image-label metadata can be passed as a dict
  • implement write_label_metadata
  • small changes to write_multiscales_metadata to prevent over-writing protected keys

I decided not to implement write_labels, which would correspond to write_image, here: the signature of write_image is outdated, e.g. byte_order is not used anymore and some other issues. I would tackle this in a follow-up PR and then implement write_labels once this is done.

What's left to do:

  • add tests
  • fix the pre-commit (I have no idea what that thing is doing, can you maybe check and let me know how to fix it?)

One more question:
do we have a schema for image-labels already that can be used for validation in the tests?

@constantinpape
Copy link
Contributor Author

Follow up for write_image:

  • remove byte_order
  • remove chunks (should be part of storage_options)
  • do this in write_multiscale_labels, to have the same functionality for both

@sbesson
Copy link
Member

sbesson commented Mar 21, 2022

do we have a schema for image-labels already that can be used for validation in the tests?

Not at the moment. The steps to unlock this would be:

@will-moore
Copy link
Member

Looks good. I've started to consume this with omero-cli-zarr in ome/omero-cli-zarr#107, which is working for a single Image. Not tried it for plates yet.

@constantinpape
Copy link
Contributor Author

I have added a test now and fixed the pre-commit. This is good to go from my side. Please let me now if you see the need for any more changes, @sbesson @will-moore.
Once this is in I will follow up with a PR for cleaning up the write_image function signature and implementing write_labels.

do we have a schema for image-labels already that can be used for validation in the tests?

Not at the moment. The steps to unlock this would be:
* add a new JSON schema and examples (migrate the snippet in the spec) under https://github.com/ome/ngff/tree/main/0.4
* make the decisions on the distribution mechanism @will-moore was working this in

Thanks for the pointers, @sbesson. I had a look at the two referenced issues and left comments. It would be great to get this in (and then use it in the tests or directly in the writers), but since the PRs seem to be stalled a bit it's probably better to go ahead and merge this issue first.

@sbesson
Copy link
Member

sbesson commented Mar 22, 2022

Thanks for adding tests and fixing the linting issue. I agree not tie the API addition to this work and focus on getting it released so that we can gain experience consuming. I'll review the tests and the API more thoroughly tomorrow.

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions regarding the inferred types of arguments. Otherwise, I think the new API makes sense and allows to support the writing of extended metadata. The new API should already populate the recommended name attribute. We can add additional tests covering more advanced scenarios such as examples compliant with the recommended specification in a follow-up PR.

The default test should cover the basic scenario of creating an image and an associated label image. Could we add another minimal scenario that creates two labels and checks that the attributes are correctly created e.g. the labels attributes register both labels?
Similarly as above, follow-up PRs can add coverage for advanced image-label properties such as colors and properties.

ome_zarr/writer.py Show resolved Hide resolved
ome_zarr/writer.py Show resolved Hide resolved
tests/test_writer.py Show resolved Hide resolved
@constantinpape
Copy link
Contributor Author

constantinpape commented Mar 23, 2022

Thanks for the review, @sbesson.

The new API should already populate the recommended name attribute.

I see two options for this: (i) add name as a required argument to write_image, write_multiscale and write_multiscales_metadata. (ii) check if it's in **metadata in write_multiscales_metadata and if not choose a sensible default. Which option do you prefer? (I would vote for (ii) because it's less invasive, but I don't have a strong opinion here).

Could we add another minimal scenario that creates two labels and checks that the attributes are correctly created e.g. the labels attributes register both labels?

Sounds good, I will add it now.

For some reason I can't comment under the code comments you made (seems to be some github issue), so I will respond to the relevant ones here:

The addition of Union[str is a bit confusing. I know it was flagged by mypy but it would be good to know whether there is a configuration where a single string can be passed of it's a bug or a red herring

Adding str is necessary and we should probably also add int and float. Note that mypy evaluates the individual value passed as kwarg and not the whole kwargs object. See https://adamj.eu/tech/2021/05/11/python-type-hints-args-and-kwargs/ for details.

Would label_metadata: JSONDict = {}, potentially reduce the conditional block in https://github.com/ome/ome-zarr-py/pull/178/files#diff-d51ad4cb28d657d5e09c58f40674b95273611983353ab5d7bb2049bbc6303f7bR545

This is considered bad practice in python: https://docs.quantifiedcode.com/python-anti-patterns/correctness/mutable_default_value_as_argument.html

@constantinpape
Copy link
Contributor Author

I have extended the tests, the Test Coverage CI fails with "Server Error (500)", so most likely unrelated to the changes here.

@joshmoore
Copy link
Member

joshmoore commented Mar 24, 2022

@constantinpape: I found codecov/codecov-action#557 which led to https://github.com/marketplace/actions/codecov

⚠️ Deprecration of v1
On February 1, 2022, this version will be fully sunset and no longer function

https://github.com/marketplace/actions/codecov#migration-from-v1-to-v2

Opened: #180

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good to me and I am inclined to have this new functionality available sooner rather than later so that we can start consuming it and potentially amending it based on real use cases.
Should we aim to release this immediately as ome-zarr 0.4.0?

@constantinpape
Copy link
Contributor Author

All looks good to me and I am inclined to have this new functionality available sooner rather than later so that we can start consuming it and potentially amending it based on real use cases.
Should we aim to release this immediately as ome-zarr 0.4.0?

Thanks! I agree that we should have this out as soon as possible. But I think it would be a good idea to have a look at fixing and simplifying the write_image API first. So I suggest the following:

  • merge this PR now
  • I make a follow-up PR with API clean-ups (this will be fast, I can do it as soon as this one is merged)
  • You have a look at the follow up and we see if the changes are controversial or not. If they are, we release 0.4.0 without it, if they are not we can still include it before the new release.

@sbesson sbesson merged commit ad2c6eb into ome:master Mar 24, 2022
@constantinpape constantinpape deleted the more-convenience branch March 24, 2022 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

write_multiscales does not expose additional metadata
4 participants