remove axis restrictions #235

d-v-b · 2024-04-30T17:56:55Z

Axes can be N-dimensional, of any type, in any order.

github-actions · 2024-04-30T17:57:07Z

Automated Review URLs

imagesc-bot · 2024-04-30T17:57:18Z

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome-ngff-update-postponing-transforms-previously-v0-5/95617/3

jni · 2024-05-01T11:48:21Z

Thanks for opening the PR @d-v-b! 🙏

but btw it seems you missed a line further down, line 313 in the current file:

Each "datasets" dictionary MUST have the same number of dimensions and MUST NOT have more than 5 dimensions.

d-v-b · 2024-05-02T13:12:38Z

Thanks @jni, I think I got all the references to the 2-5D limit. Please let me know if I missed any.

One consequence of this change is that 1D data can now be stored in OME-NGFF. Personally, I think this is great -- 1D data is real data, and people should be able to store it if they have it.
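To make the relaxed rule concrete, here is a minimal sketch of an axes check with no 2-5D limit, no required ordering, and no restriction on axis types. The helper name `check_axes`, the sample metadata, and the "at least one axis" rule are assumptions for illustration, not taken from the spec or from any library; the sketch keeps the unique-name requirement.

```python
def check_axes(axes):
    """Return a list of problems found in an `axes` list (empty if none).

    Hypothetical helper: no limit on the number of axes, no required
    ordering, and no restriction on the `type` values.
    """
    problems = []
    if len(axes) < 1:
        # Assumption: an image needs at least one axis.
        problems.append("at least one axis is needed")
    names = [axis.get("name") for axis in axes]
    if any(not isinstance(name, str) for name in names):
        problems.append("every axis needs a string 'name'")
    elif len(set(names)) != len(names):
        problems.append("axis names must be unique")
    return problems


# 1D data: a single axis, which the 2-5D limit used to rule out.
one_d = [{"name": "t", "type": "time", "unit": "second"}]

# 6D data with an arbitrary axis order and a made-up axis type.
six_d = [
    {"name": "x", "type": "space"},
    {"name": "c", "type": "channel"},
    {"name": "y", "type": "space"},
    {"name": "angle", "type": "rotation"},
    {"name": "z", "type": "space"},
    {"name": "t", "type": "time"},
]

print(check_axes(one_d))  # []
print(check_axes(six_d))  # []
```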

bogovicj · 2024-05-02T18:02:36Z

This PR should update the schema and examples before being merged.
Heads up - I did a lot of work on this front in this PR: #138
(my commits from Dec 2023), take + edit what you can. I can try to help merge things

Edit: after a little more thought, I'm hopeful the schema changes needed here will be small; but certainly some examples that are currently disallowed that we want to allow would be helpful.

d-v-b · 2024-05-04T12:00:12Z

Switching this to a draft while I work on getting the schema documents consistent with the spec. Because manually editing JSON schema documents is tedious and error-prone, I am going to generate the schema with some python scripts containing pydantic models. Personally, I think these models should be part of this repo, because pydantic is a good tool for modelling JSON schema (much better than doing it manually), but if this is unconvincing I can remove the python files from the final PR.

glyg · 2024-05-16T11:10:10Z

> I am going to generate the schema with some python scripts containing pydantic models.

@d-v-b wouldn't a linkML version of your pydantic classes be of more generic use?

---
id: https://ngff.openmicroscopy.org/latest/schemas/image.schema
name: ngff-image
title: OpenMicroscopy New Generation File Formats Image Schema
description: |-
  TODO
version: 0.1
license: ??


prefixes:
  linkml: https://w3id.org/linkml/
  biolink: https://w3id.org/biolink/
  schema: http://schema.org/
  ome: https://www.openmicroscopy.org/Schemas/Documentation/Generated/OME-2016-06/ome.html#
  ORCID: https://orcid.org/
  wiki: https://en.wikipedia.org/wiki/



classes:
  Axis:
    attributes:
      name:
        required: true
      type:
      unit:

  ScaleTransform:
    attributes:
      type:
        # equals_string: "scale" (set this as a rule?)
      scale:
        range: float
        array:
          maximum_number_dimensions: 1
          dimensions:
            - minimum_cardinality: 1

  TranslationTransform:
    attributes:
      type:
        # equals_string: "translation" (set this as a rule?)
      translation:
        range: float
        array:
          maximum_number_dimensions: 1
          dimensions:
            - minimum_cardinality: 1

...

d-v-b · 2024-05-16T11:13:48Z

@glyg perhaps it would, but the goal here is just to generate JSON schema documents, so I'm not sure what generic use we need to accommodate?

glyg · 2024-05-21T07:42:09Z

@d-v-b, I surely don't have a broad enough view of the whole project, so I might very well be mistaken.

> what generic use we need to accommodate?

I was thinking about consumers of the schema or of a zarr.json. linkML seems to me more usable and language agnostic than custom pydantic classes.

For example a third party library wanting to parse the zarr.json could import these schemas to embed them in its own tooling.

d-v-b · 2024-05-21T08:12:04Z

@glyg in terms of scope, currently this repo contains JSON schema documents that can be fetched from GitHub and used for validation. I don't think there's any expectation that software libraries import code artifacts from this repo. That could of course change, but I don't know of efforts in that direction.

I am proposing changes to the spec, and so I need to update the schema documents. Because the current JSON schema documents are manually written, they contain mistakes and are a pain to update after making spec changes.

Since this project is already using python as a dependency, as a quality of life change I am proposing to use pydantic to define data models that serialize to JSON schema, as a way to avoid needing to write the schema documents by hand. I could be wrong, but I suspect writing data models in python and serializing those models to JSON schema will be an easier development experience than writing data models in JSON schema directly. Maybe linkml could also work for this purpose, but I don't know how to use linkml, and I do know how to use pydantic, so for me the choice is simple.

glyg · 2024-05-21T09:26:13Z

Yes I understand your point, I think a linkml version would bring some value but as you said you are the one doing the work 🙂

Thank you for taking the time to answer me

d-v-b · 2024-05-21T11:52:37Z

Tests are passing, so I think this is ready for review.

jni · 2024-05-21T14:29:48Z

latest/models/image.py

+    name: Optional[str] = None
+    datasets: conlist(Dataset, min_length=1)
+    axes: UniqueList[Axis]
+    coordinateTransformations: Optional[tuple[ScaleTransform] | tuple[ScaleTransform, TranslationTransform]] = None


Is a pydantic tuple equivalent to list in JSON?

JSON arrays are equivalent to python lists, but the spec defines coordinateTransformations as a typed collection with a fixed length, so on the python side it's a union of tuples.
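A small stdlib sketch of that point: JSON has only arrays, so a Python tuple serializes to an array and comes back as a list, and the fixed length has to be enforced separately. The helper `check_transforms` is hypothetical, and it assumes the two allowed shapes from the type annotation above: a single scale, or a scale followed by a translation.

```python
import json

# A tuple and a list serialize to the same JSON array.
as_tuple = ({"type": "scale", "scale": [1.0, 0.5]},)
as_list = [{"type": "scale", "scale": [1.0, 0.5]}]
assert json.dumps(as_tuple) == json.dumps(as_list)

# Decoding always yields a list, so the tuple-ness is lost on the round trip.
decoded = json.loads(json.dumps(as_tuple))
assert isinstance(decoded, list)


def check_transforms(transforms):
    """Hypothetical check: (scale,) or (scale, translation), nothing else."""
    kinds = tuple(t.get("type") for t in transforms)
    return kinds in (("scale",), ("scale", "translation"))


print(check_transforms(decoded))
```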

d-v-b · 2024-05-21T16:33:07Z

I added a section to provide some guidance for partial implementations, i.e. software that does not implement the full spec; namely, the spec now suggests that partial implementations which normalize input data to their supported subset of the spec notify users when this is occurring.
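As an illustration of that guidance, here is a minimal sketch of a reader that supports only one axis order and tells the user when it reorders the input. The function name `reorder_to_supported` and the choice of `warnings.warn` as the notification channel are assumptions; the spec text does not prescribe a mechanism.

```python
import warnings


def reorder_to_supported(axes, supported_order):
    """Reorder `axes` to match `supported_order`, warning if anything moves.

    Hypothetical helper for a reader that only handles one axis order.
    Returns the permutation to apply to the array dimensions.
    """
    names = [axis["name"] for axis in axes]
    unsupported = [n for n in names if n not in supported_order]
    if unsupported:
        raise ValueError(f"unsupported axes: {unsupported}")
    permutation = sorted(
        range(len(names)), key=lambda i: supported_order.index(names[i])
    )
    if permutation != list(range(len(names))):
        new_names = [names[i] for i in permutation]
        # Tell the user that the data is being normalized.
        warnings.warn(f"axes reordered from {names} to {new_names}", UserWarning)
    return permutation


stored = [{"name": "x"}, {"name": "y"}, {"name": "t"}]
print(reorder_to_supported(stored, ["t", "c", "z", "y", "x"]))  # [2, 1, 0]
```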

jni · 2024-05-22T00:32:59Z

> I added a section to provide some guidance for partial implementations,

imho this recommendation is orthogonal to the main purpose of this PR, and it should come in a separate PR. I like it, but it's an extra thing and it's hard enough to merge PRs that are small and self-contained.

joshmoore · 2024-05-22T07:49:58Z

Independently of whether one PR or two, I can certainly see the implementor community wanting clarification in/around RFC-3 about the responsibility placed on them with this restriction dropped.

d-v-b · 2024-05-22T08:08:39Z

> imho this recommendation is orthogonal to the main purpose of this PR, and it should come in a separate PR. I like it, but it's an extra thing and it's hard enough to merge PRs that are small and self-contained.

Because this PR is widening the space of ome-ngff data, it seems reasonable to give at least a suggestion for how implementations should handle this change. We cannot expect that all implementations support N-dimensional data. I think the best we can do is suggest that implementations keep users aware of how their data is being cast / coerced / transformed, when that kind of thing is happening. Thus it's very non-orthogonal to this PR.

jni · 2024-05-24T01:34:01Z

It's orthogonal in the sense that partial implementations were a thing before this PR.

joshmoore · 2024-12-03T17:26:55Z

latest/schemas/image.schema

+  ],
+  "title": "NGFF Image",
+  "type": "object",
+  "$schema": "https://json-schema.org/draft/2020-12/schema",


@d-v-b: did this get run through a prettifier? What's the minimal diff?

I generated these schemas with pydantic models (deleted in this commit). What's the particular issue here?

@d-v-b The textual diff is enormous/all-encompassing so it's hard for a reviewer to tell what's changed.

Potential suggestion: make the equivalent pydantic model for the previous spec, generate it (and make sure that it's not semantically different from the main branch), then update it with these changes, then we should be able to see the minimal diff in that commit.

Of course, if there are semantic differences, it might be worth (a) checking whether they are real or bugs, and (b) making a separate PR to update them, if needed.
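One way to separate formatting noise from real changes, sketched with the standard library only: parse both documents, re-serialize them with sorted keys and fixed indentation, and diff the result. The schema strings below are made-up stand-ins, not the contents of image.schema, and the helper names are hypothetical.

```python
import difflib
import json


def canonical(text):
    """Re-serialize JSON with sorted keys and fixed indentation."""
    return json.dumps(json.loads(text), indent=2, sort_keys=True)


def semantic_diff(old_text, new_text):
    """Unified diff of two JSON documents after canonicalization."""
    old_lines = canonical(old_text).splitlines()
    new_lines = canonical(new_text).splitlines()
    return list(
        difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm="")
    )


hand_written = '{"type": "object", "title": "NGFF Image"}'
generated = '{\n  "title": "NGFF Image",\n  "type": "object"\n}'
changed = '{"title": "NGFF Image", "type": "object", "maxItems": 5}'

# Same content, different layout and key order: nothing to report.
print(semantic_diff(hand_written, generated))  # []

# A real change shows up as a small, readable diff.
print("\n".join(semantic_diff(changed, generated)))
```

Note that `sort_keys` only reorders object keys. Arrays keep their order, so a list such as `required` can still produce diff noise if the generator emits its entries in a different order.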

gotcha, I will see what I can do to clean up the diff

and while we're on the subject, what do we think about renaming the schema files to end with .schema.json instead of .schema, since they are json?

Hmm, we actually weren't on the subject. 😂 My suggestion is to raise that in a different issue/PR.

I already opened one a while ago, but I figured it was worth a shot to gin up some support here :)

diff should be much cleaner now

Thanks, @d-v-b. 👍 for ginned up clean ups post-RFC-3.

d-v-b · 2024-12-04T08:30:50Z

do we still want to require that the axis names are unique?

d-v-b · 2024-12-04T09:05:05Z

another question: should the type field be required? (And maybe similarly for unit)? On the implementation side, life is a lot easier with stable types.

d-v-b · 2024-12-04T09:08:45Z

In concrete terms, I would propose that we make type and unit required, but they can be null.

joshmoore · 2024-12-04T10:05:40Z

Those last questions, @d-v-b, are RFC-3 related or more ome-zarr-models-py cleanups?

d-v-b · 2024-12-04T10:11:28Z

> Those last questions, @d-v-b, are RFC-3 related or more ome-zarr-models-py cleanups?

I'm trying to answer the question "how should axes work after we remove the existing restrictions", so I think it's a bit of both :)

jni · 2024-12-04T13:46:41Z

I would err on maintaining whatever it is now and updating it in a separate RFC if needed. Unlike RFC-3, making something previously optional now required is backwards-incompatible: a v0.4 file would no longer be valid v0.5 if it omitted the optional types.

d-v-b · 2024-12-04T15:29:00Z

That's fine with me, but why is backwards compatibility a concern here? A v0.4 file will not be valid in 0.5 for different reasons, no?

will-moore · 2024-12-04T16:36:19Z

We do have a lot of OME-Zarr v0.5 data from the ome2024-ngff-challenge that was generated by copying the metadata from v0.4 data without applying any changes. So, a lot of that could be rendered invalid if restrictions were added to the v0.5 spec.

But just to clarify... Are we talking about v0.4 -> v0.5 with this spec change? As I understood it, v0.5 release is soon, and is really just the Zarr v2 -> zarr v3 update, although I guess removing restrictions has very small impact and could be included?

d-v-b · 2024-12-04T16:46:54Z

> We do have a lot of OME-Zarr v0.5 data from the ome2024-ngff-challenge that was generated by copying the metadata from v0.4 data without applying any changes. So, a lot of that could be rendered invalid if restrictions were added to the v0.5 spec.

Setting the missing keys to None can be done at the same time as copying the metadata. It's more work than purely copying, but I don't think it's a big burden... provided the changes are an actual benefit (which I think they are).
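A minimal sketch of what that extra step could look like while copying metadata. The helper name `fill_axis_keys` and the sample metadata are hypothetical, and the sketch assumes the proposal under discussion (type and unit always present, possibly null), which is not part of the current spec.

```python
import copy


def fill_axis_keys(multiscale, keys=("type", "unit")):
    """Return a copy of `multiscale` where every axis has `keys` set.

    Hypothetical migration helper: keys that are absent are added with the
    value None (serialized as JSON null); existing values are left alone.
    """
    result = copy.deepcopy(multiscale)
    for axis in result.get("axes", []):
        for key in keys:
            axis.setdefault(key, None)
    return result


v04_style = {
    "axes": [
        {"name": "c"},
        {"name": "y", "type": "space", "unit": "micrometer"},
    ]
}
print(fill_axis_keys(v04_style)["axes"][0])
# {'name': 'c', 'type': None, 'unit': None}
```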

> But just to clarify... Are we talking about v0.4 -> v0.5 with this spec change? As I understood it, v0.5 release is soon, and is really just the Zarr v2 -> zarr v3 update, although I guess removing restrictions has very small impact and could be included?

This is the key question -- 0.5 is basically frozen, so I thought these changes would be in 0.6? But I'm not really sure why we are worried about backwards compatibility?

will-moore · 2024-12-04T17:04:35Z

Looking at the diff, there's a line in latest/index.bs that has ├── .zarray . I would expect that to conflict with the latest spec that should now refer to zarr.json instead.
However, I can't actually find a copy of latest/index.bs as https://github.com/ome/ngff/tree/main/latest shows a very cryptic 0.5 only?

d-v-b · 2024-12-04T17:07:57Z

As of 0.4, both the type and unit attributes SHOULD be present in elements of axes. According to the definition of SHOULD used in the spec, everyone should be setting these anyway unless they have carefully weighed their reasons for not doing so. Given that we are making substantial changes to how axes work in this PR, the question is: can we imagine reasons for leaving the type and unit keys unset, versus the simpler solution of setting them to null? If not, then I think we should make the simplifying change.

d-v-b · 2024-12-04T17:21:09Z

> Looking at the diff, there's a line in latest/index.bs that has ├── .zarray . I would expect that to conflict with the latest spec that should now refer to zarr.json instead. However, I can't actually find a copy of latest/index.bs as https://github.com/ome/ngff/tree/main/latest shows a very cryptic 0.5 only?

That's a good point, I think I should pull in the latest changes to 0.5 and build on that

joshmoore · 2024-12-04T17:34:35Z

+1 for a rebase on top of 0.5 or rather on top of #282 (since there's currently no latest) and keeping it as minimal as possible for the time being, so we can get RFC-3 completed.

feat: remove axis restrictions

4ef3940

jni approved these changes May 1, 2024

d-v-b added 2 commits May 2, 2024 16:04

chore: remove additional statements of axis restrictions

a19b3db

chore: remove additional statements of axis restrictions

f9993cb

feat: use pydantic models for making schemas

0507c5e

d-v-b marked this pull request as draft May 4, 2024 11:54

remove outdated tests

d58c814

will-moore reviewed May 21, 2024

latest/index.bs (resolved)

d-v-b added 3 commits May 21, 2024 13:23

move models to their own folder, and update omero model

cf45296

remove extra model.py file, update schema and tests

77333b5

remove old image schema

766f935

d-v-b marked this pull request as ready for review May 21, 2024 11:50

jni mentioned this pull request May 21, 2024

RFC-3: more dimensions for thee #239

Merged

jni reviewed May 21, 2024

latest/tests/image_suite.json (resolved)

add guidance for partial implementations

800374d

remove pydantic models

aa5c953

joshmoore reviewed Dec 3, 2024

d-v-b commented Dec 4, 2024

latest/index.bs (outdated, resolved)

d-v-b added 4 commits December 4, 2024 09:51

softer diff on image.schema

11dc303

remove comma

3a5738d

remove another comma, and un-require the type field

6f8f8e2

Update latest/index.bs

c490129
