Do we need axis type? #215

Open
d-v-b opened this issue Sep 8, 2023 · 17 comments
@d-v-b (Contributor) commented Sep 8, 2023

In the latest version of axes metadata, we allow specifying both a type for an axis (e.g., "space", "time", "channel") and a unit. It seems to me that the unit of an axis unambiguously determines its type -- if the unit is set to meter, then the axis type must be space, and if the unit is set to second, then the axis type must be time. So what value does the axis type field add?

On the other hand, as noted in #91, the spec allows setting an axis type that is incompatible with the units: {"name": "oops", "unit": "second", "type": "channel"} is valid according to the spec, even though the axis type and the units are inconsistent. Allowing an inconsistency like this is a weakness of the spec that should be remedied. As suggested in #91, we could simply add text stating that the units and the axis type must be consistent, but in that case it's not clear what value the axis type adds, if it is constrained by the units.
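As an illustration (not a spec proposal), a consumer could infer the type from the unit and flag inconsistent combinations with a sketch like the one below; the helper name is made up, and the third-party pint library stands in for a UDUNITS-2 parser:

import pint

ureg = pint.UnitRegistry()

def infer_axis_type(unit: str) -> str | None:
    """Guess "space" or "time" from a unit string; None if unknown or ambiguous."""
    try:
        dim = ureg.Unit(unit).dimensionality
    except (pint.errors.UndefinedUnitError, pint.errors.DefinitionSyntaxError):
        return None
    if dim == ureg.Unit("meter").dimensionality:
        return "space"
    if dim == ureg.Unit("second").dimensionality:
        return "time"
    return None

# the inconsistent axis from above would be flagged:
axis = {"name": "oops", "unit": "second", "type": "channel"}
inferred = infer_axis_type(axis["unit"])
if inferred is not None and inferred != axis["type"]:
    print(f"axis {axis['name']!r}: type {axis['type']!r} is inconsistent with unit {axis['unit']!r}")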

Another problem with the axis type field is that its domain is not well defined. If I have a 2D image sampled from the state space of a physical dynamical system, then the axes might be velocity and acceleration, in which case the units are simple to express (m/s and m/s^2), but the axes are neither merely space nor merely time but a mixture of space and time. Is "space + time" a valid axis type? Or should it be "space / time"? Image data from relativistic simulations might also run into this issue. If axis type can be any string, it's not clear what applications are expected to do with it.

Based on my experience and the current text of the spec, I cannot see the purpose of the axis type, given that we can express more information with axis unit. I would appreciate other perspectives on this issue -- either we should remove axis type in the latest version (my preferred option), or we should add language clarifying what axis type is for, and why axis unit alone cannot achieve that purpose.

@will-moore (Member) commented:

In principle this sounds like a good idea (reduce redundancy, etc.) but just a couple of immediate thoughts...

  • How would you specify the "channel" axis with units?
  • If you don't know the units, you can still say something about the axes with types.
  • Also, if the client reading the data doesn't have an exhaustive listing of units, they may not recognise the unit, but it would still be useful to know the type of axis.

"If axes type can be any string, it's not clear what applications are expected to do with it". I'm not sure that getting rid of type helps to address that issue.

The spec says "The [unit] value SHOULD be one of the following strings, which are valid units according to UDUNITS-2". So while that does allow other units, e.g. "m/s ^ 2", I wouldn't expect an application to know what to do with that.

@d-v-b (Contributor, Author) commented Sep 8, 2023

How would you specify the "channel" axis with units?

For light microscopy, you could use SI units for energy or wavelength. But choosing the correct unit here is also a problem for the current version of the spec. I think this reveals that a "channel" axis doesn't fit the implicit model used for "space" or "time" axes. The intended semantics of a "channel" axis should probably be stated in the spec, because I don't really understand how it is supposed to work.

If you don't know the units, you can still say something about the axes with types.
Also, if the client reading the data doesn't have an exhaustive listing of units, they may not recognise the unit, but it would still be useful to know the type of axis.

If this is true, then we should add text to the spec that explains when it is useful to know the type of an axis. I'm not familiar with image processing tools that require this kind of information, so more detail here would be very helpful.

The spec says "The [unit] value SHOULD be one of the following strings, which are valid units according to UDUNITS-2". So while that does allow other units, e.g. "m/s^2", I wouldn't expect an application to know what to do with that.

This is a good point. Because the spec does not require that the unit field be a UDUNITS-2-parseable unit string, we are basically imposing no requirements on applications that consume this information. So it's valid to say that the spec conveys no expectation for any application to know what to do with any unit, besides treating it as a string. I think this is fine, since it defers the problem of interpreting the unit string to the user / application. At the same time, it seems fine to encourage people to use standard unit strings, and UDUNITS-2 is a good source of those. But we should probably not recommend a subset of the UDUNITS-2 strings, and instead just allow anything UDUNITS-2 can parse.
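To make the two policies concrete, here is a rough sketch; the SPEC_UNITS set is a truncated placeholder rather than the spec's actual list, and pint again stands in for a UDUNITS-2 parser:

import pint

ureg = pint.UnitRegistry()
SPEC_UNITS = {"micrometer", "second"}  # placeholder; the spec lists many more strings

def unit_ok_whitelist(unit: str) -> bool:
    # current SHOULD-style recommendation: only strings from the spec's list
    return unit in SPEC_UNITS

def unit_ok_parseable(unit: str) -> bool:
    # alternative: accept anything the unit library can parse, e.g. "m/s**2"
    try:
        ureg.Unit(unit)
        return True
    except (pint.errors.UndefinedUnitError, pint.errors.DefinitionSyntaxError):
        return False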

@joshmoore joshmoore added this to the 0.5 milestone Oct 4, 2023
@jni (Contributor) commented May 27, 2024

For light microscopy, you could use SI units for energy or wavelength.

@d-v-b this exactly contradicts your earlier point that you can infer the type from the units: wavelength is measured in the same units as spatial extent. There are other examples, such as shear (commonly expressed in 1/s) and frequency (Hz, also 1/s).

@d-v-b (Contributor, Author) commented May 27, 2024

@jni I am making two points here.

The first point is that, specifically for spatial and temporal axes, the axis type and the units seem to be tightly coupled. Can you clarify whether you think the spec should allow {"type": "space", "unit": "millisecond"}? If so, my next question would be what "type" means in that context, because I don't get it 😄

The second point is that "channel" doesn't fit into the same category as "time" and "space".

  • The spec gives very clear suggestions for "time" and "space" units, but nothing for "channel". You point out that wavelength is measured in distance, and so by units alone it does resemble a "space" axis, but as I noted earlier a channel can also be characterized by energy or by frequency (Hz). One could even record the "unit" for a channel axis as the voltage required to turn the filter wheel, or the angular rotation of the filter wheel itself. The latter examples are silly but still technically workable in the spec. Which one should people choose? The spec provides no guidance here.

  • "channel" has friction with other parts of the spec, notably the coordinate transformations. Suppose I have an imaging dataset where I imaged 3 fluorophores with 488nm, 561nm, and 594nm lasers, resulting in a "channel" axis with 3 elements. Can you suggest a valid "scale" or "translation" parameter for this axis? How should "scale" and "translation" be interpreted here? Would we expect image viewers to handle downsampling along the "channel" axis?

  • Channels are more like categories than positions in spacetime, but the axis metadata doesn't provide any place to convey what the categories are, i.e. there's no place for me to indicate that the 3 elements along the channel axis are named ["488nm", "561nm", "594nm"], or ["gfp", "tdtomato", "alexa-594"].
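To make the "scale" / "translation" point concrete, here is a small illustrative sketch: the coordinate implied for index i is translation + scale * i, i.e. a regular grid, and no such grid reproduces the three laser lines above.

def implied_coordinates(n: int, scale: float, translation: float) -> list[float]:
    # coordinate implied by "scale"/"translation" for indices 0..n-1
    return [translation + scale * i for i in range(n)]

# meaningful for a spatial axis: 3 samples spaced 0.5 micrometers apart
print(implied_coordinates(3, scale=0.5, translation=0.0))    # [0.0, 0.5, 1.0]

# no (scale, translation) pair yields 488, 561, 594 nm, because they are unevenly spaced
print(implied_coordinates(3, scale=53.0, translation=488.0))  # [488.0, 541.0, 594.0]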

The spec tries to squeeze spatiotemporal dimensions and categorical dimensions into the same metadata, and I don't think it works. We would probably benefit from more formally distinguishing the two.

@will-moore (Member) commented:

I agree that "channel" doesn't fit into the same category as "time" and "space", but what is the best way to deal with that?

I can think of various examples of other axes that are like categories, but how should we distinguish them? Should the images from different categories be stored as separate arrays? (e.g. in a Plate each Well & Field is a separate image).

Sometimes the distinction is hard to define.
E.g. FRAP: ["pre-bleach", "after bleaching (t=0)", "1 minute of recovery"]. Is that axis time or category? What if we add a load more timepoints at various (possibly varying) intervals? We certainly need a way to store the names of categories (or timestamp in this case).

If we do allow "category" axes alongside other space and time axes in an N-D array (I think we should), then we need some way to know they are different:

"specifically for spatial and temporal axes, the axis type and the units seem to be tightly coupled"

so you need to know if the type of axis is "space" or "time"?

I'm not sure if the main issue of this discussion is solving the {"type": "space", "unit": "millisecond"} problem, because that doesn't seem to be hard to solve?

@jni (Contributor) commented May 27, 2024

I'm not sure if the main issue of this discussion is solving the {"type": "space", "unit": "millisecond"} problem, because that doesn't seem to be hard to solve?

☝️

Should the images from different categories be stored as separate arrays?

I agree with the implication that this is an icky restriction.

the axis metadata doesn't provide any place to convey what the categories are

indeed. We could have 'type': 'categorical', 'categories': [<list of str>]. That doesn't preclude type though.
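Roughly something like this as an axis entry (just a sketch of the idea, not spec):

channel_axis = {
    "name": "c",
    "type": "categorical",
    "categories": ["gfp", "tdtomato", "alexa-594"],
}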

You point out that wavelength is measured in distance, and so by units alone it does resemble a "space" axis, but as I noted earlier wavelength can also be measured in energy, or cycles (Hz).

Channel is one example. Stage position is another. I'm sure if we canvassed a whole bunch of microscopists (not just optical microscopy, either), we could come up with more examples where we want a type to distinguish between axes that have the same units.

@d-v-b (Contributor, Author) commented May 27, 2024

I agree that "channel" doesn't fit into the same category as "time" and "space", but what is the best way to deal with that?

I can think of various examples of other axes that are like categories, but how should we distinguish them? Should the images from different categories be stored as separate arrays? (e.g. in a Plate each Well & Field is a separate image).

Sometimes the distinction is hard to define. E.g. FRAP: ["pre-bleach", "after bleaching (t=0)", "1 minute of recovery"]. Is that axis time or category? What if we add a load more timepoints at various (possibly varying) intervals? We certainly need a way to store the names of categories (or timestamp in this case).

I don't know the best way to deal with this, but at a minimum if we go along with the idea that "channel" is actually a categorical axis, then we need a place to write down the categories, and the simplest thing looks approximately like the example @jni gave, where we include an array of values that give meaning to each element along the categorical axis.

Your example of the FRAP experiment is useful. I think converting ["pre-bleach", "after bleaching (t=0)", "1 minute of recovery"] to an irregular sequence of (relative) timepoints is probably easier than making "1 minute of recovery" a category. Ignoring for a second that OME-NGFF doesn't support irregularly sampled data, as an experimenter I would certainly quantify the time information rather than using categories like "pre-bleach", but ultimately I don't think we can be strict about this.
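For example (the numeric values here are invented), the same three samples could carry an explicit, irregular time coordinate instead of category labels:

frap_labels = ["pre-bleach", "after bleaching (t=0)", "1 minute of recovery"]
frap_time_s = [-5.0, 0.0, 60.0]  # relative timepoints in seconds; the -5.0 is made up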

I'm not sure if the main issue of this discussion is solving the {"type": "space", "unit": "millisecond"} problem, because that doesn't seem to be hard to solve?

☝️

@jni I don't know what your ☝️ emoji means here 😆. There are only two solutions to that problem that I can think of: either allow {"type": "space", "unit": "millisecond"} or disallow it. Which one do you think we should do, and why?

Channel is one example. Stage position is another. I'm sure if we canvassed a whole bunch of microscopists (not just optical microscopy, either), we could come up with more examples where we want a type to distinguish between axes that have the same units.

First, I'm confused about stage position here... wouldn't that be measured in a unit of length? I think I'm not getting it.

As for distinguishing between axes, isn't that what the name field is for? I also welcome examples of datasets / use cases where an image has categorically distinct axes with the same units of measure, but the name field is insufficient to distinguish them. Hearing about some use cases would really help me out. In my own experience, many microscopists just go by axis names or even array indexing order to keep track of which axis is which 🤷‍♂️

@jni (Contributor) commented May 27, 2024

I don't know what your ☝️ emoji means here 😆

it means that whichever decision we take, it is an easy problem to solve.

First, I'm confused about stage position here... wouldn't that be measured in a unit of length?

Yes, but we want to treat it differently from axes of type "space".

As for distinguishing between axes, isn't that what the name field is for?

You could encode it in the name, but I personally find that clunkier. I'd prefer to have the flexibility to name my axes whatever and encode what they are in the type. Do we really want to restrict "space" axes to "xyz"? In the context of transforms, I might want instead to call them "x0", "x1", "x2" and reserve xyz for the output space? 🤷

In short, the name "name" usually refers to "arbitrary ID", so I think it's nice to not attach additional baggage to that. It's part of the motivation for RFC-3 from my perspective.
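Concretely, something like this (illustrative only):

axes = [
    {"name": "x0", "type": "space", "unit": "micrometer"},
    {"name": "x1", "type": "space", "unit": "micrometer"},
    {"name": "x2", "type": "space", "unit": "micrometer"},
]
# the output space of a transform could then use the conventional names x/y/z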

@jni (Contributor) commented May 27, 2024

Ignoring for a second that OME-NGFF doesn't support irregularly sampled data, as an experimenter I would certainly quantify the time information rather than using categories like "pre-bleach", but ultimately I don't think we can be strict about this.

Obviously this kind of stuff can come in a later RFC, either tied to transforms (which iirc included displacement/position fields anyway) or separate from them. But you might again want to have a "type" to distinguish the different ordinal/categorical/discrete/positional axes.

@d-v-b (Contributor, Author) commented May 27, 2024

In the stage position example, what would be the axis type?

I don't know what your this emoji means here 😆

it means that whichever decision we take, it is an easy problem to solve.

And which decision do you think we should take? I don't think anyone has actually answered this question yet.

@jni (Contributor) commented May 27, 2024

In the stage position example, what would be the axis type?

"position" or "stage-position" are two examples.

And which decision do you think we should take? I don't think anyone has actually answered this question yet.

I don't have a strong opinion about it.

@d-v-b (Contributor, Author) commented May 27, 2024

Let me see if I understand the stage position example correctly: you are envisioning that someone acquires a 2D image, moves the stage (let's say in 2 dimensions), and then they acquire another image, and so on, and these images are stored concatenated along an axis of an array, resulting in an array with shape [y, x, s], where y and x are "space" dimensions (the dimensions of the individual images), and s is the "position" dimension.

And are there some actual examples of data acquired like this? How do people use the "type" field for these kinds of datasets today? What scale and translation transformations do they use? I think it would really help to know more about this use case.

@jni (Contributor) commented May 28, 2024

Let me see if I understand the stage position example correctly: you are envisioning that someone acquires a 2D image, moves the stage (let's say in 2 dimensions), and then they acquire another image, and so on, and these images are stored concatenated along an axis of an array, resulting in an array with shape [y, x, s], where y and x are "space" dimensions (the dimensions of the individual images), and s is the "position" dimension.

I mean, trivially, you can do [sy, sx, y, x] for a rectangular tiled acquisition. Then the step size along sy and sx will be in length units, just like along y and x.

Yes, eventually you are going to stitch this somehow, but you probably want to store this array anyway. Especially if your lab works on stitching algorithms. 😂
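Concretely, a sketch of that layout (the sizes are invented):

import numpy as np

# a 3 x 4 grid of tiles, each tile 512 x 512 pixels, in one contiguous array
tiles = np.zeros((3, 4, 512, 512), dtype="uint16")  # axes: [sy, sx, y, x]
stage_step_um = 100.0   # spacing between tiles along sy/sx: a length
pixel_size_um = 0.2     # spacing between samples along y/x: also a length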

And are there some actual examples of data acquired like this?

This figure is from the libczi docs, linked from RFC-3:

[figure from the libczi documentation]

How do people use the "type" field for these kinds of datasets today?

They don't: the spec forbids it. That's a huge part of the motivation for RFC-3.

What scale and translation transformations do they use?

None currently. For rectangular tiled acquisitions you could have consistent metadata for that, though. For non-rectangular data you can have pointwise positions (transformations spec).

@d-v-b (Contributor, Author) commented May 28, 2024

Thanks, this is super helpful, I think I understand now. As @will-moore noted earlier, I think the HCS layout addresses the same problem, but for HCS datasets instead of CZI montages? Another relevant use case might be ML training datasets, where you have a bunch of 2D images of dogs (for example) padded to the same 2D dimensions and stacked along the last axis, resulting in axes [x, y, dog].

An unwritten assumption in the existing axes metadata (and the entire OME-NGFF image model) is that, with the exception of the "channel" axis, each axis of an array represents a quantity sampled sequentially from a continuous domain. Let's call this quantity a "coordinate". In CF / xarray this coordinate might be represented as actual data that can be manipulated, but in OME-NGFF we use "scale" and "translation" to define a coordinate, which means we are implicitly assuming that coordinates are on a regular grid, which we can shift and stretch as needed. We also assume that the values inside an array are sampled from a continuous domain (because we can downsample those values, i.e., generate new coordinates and resample the array data on them). @jni your proposal to use an array axis to represent a sequence of arbitrary images, and my ML training data "dog axis" idea both go against these assumptions.
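For comparison, here is a minimal xarray sketch of "coordinates as data" (values invented), which can hold both a regular spatial grid and a categorical channel coordinate:

import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((3, 4, 5)),
    dims=("ch", "y", "x"),
    coords={
        "ch": ["gfp", "tdtomato", "alexa-594"],  # categorical coordinate stored as data
        "y": 0.5 * np.arange(4),                 # regular spatial grid (micrometers)
        "x": 0.5 * np.arange(5),
    },
)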

@jni correct me if I'm wrong, but I'm pretty sure it would make no sense to downsample an image along "stage coordinate" axes, because you would be blurring array values across a discontinuity. Likewise, it would make no sense to downsample the ML training dataset along the "dog axis", for the same reason.

So I see two options for this kind of thing:
Option 1: we use two types of axes (this is actually the status quo, since "channel" is an instance of the second kind of axis).
The first type of axis represents monotonically increasing coordinate values from a continuous domain, and the image data are continuous w.r.t. this domain. I maintain that, for this kind of axis, the unit of measure is sufficient to distinguish time from space, and so "type" conveys nothing here.

The second type of axis has the following properties:

  • requires varied coordinate information (lists of wavelengths, lists of positions, lists of dog names)
  • should never be downsampled
  • requires special code paths for visualization
  • probably needs a "type" field because it's unstructured
  • can basically be used when anyone wants to aggregate a bunch of images that have the same shape and dtype 🤷

In fact, this second type of axis is basically functioning as an arbitrary collection. So... what if we just use a spec for collections of images?

Option 2: we use a higher-order "collection" data structure to represent collections. Personally I think this is a much better solution to this problem, and it is actually what OME-NGFF already does with the HCS layout.

@jni can you explain why your use case couldn't be represented as a collection of separate images? To me that seems much simpler in terms of metadata -- each image just gets its own vanilla spatial metadata -- and it's also how the spec works today (see HCS).

@d-v-b (Contributor, Author) commented May 28, 2024

If we stick with option 1 (the status quo) I still think we can make improvements to the spec. Here are some concrete suggestions that hopefully don't bleed too much into the transformations discussion:

{
"axes": [
  {"name": "x", "type": "dimension", "info": {"unit": "m"}},
  {"name": "ch", "type": "category", "info": {"description": "filter wheel goes brr", "values": ["488", "561", "594"]}}
]
}

"type" is now a discriminator for two structurally distinct types of axes, "dimension" and "category". Because the length of the categorical axis is fixed by this group-level metadata, we would have to expressly forbid downsampling a categorical axis (which is probably fine -- does anyone ever downsample their channel axis? this probably breaks viewers anyway).

An alternative version, which starts to look like a bizarro xarray:

{
"dims": ["x", "ch"],
"coords": {
  "x":  {"type": "dimension", "info": {"unit": "m"}},
  "ch": {"type": "category", "info": {"description": "filter wheel goes brr", "values": ["488", "561", "594"]}}
}
}

@jni (Contributor) commented May 28, 2024

I think the opposite of "categor[ical]" is "continuous", not "dimension".

In a variety of scenarios, not least of which is acquisition, having a contiguous array of data is advantageous, sometimes for performance, sometimes for programmatic convenience, and sometimes both. I'm certainly not super-excited about having to do array unstacking/stacking just to save/read an RGB image. So I'm pro-option-1.

I'm not in love with your proposed improvements. This is only for vague reasons that I can't articulate at 1am, but at any rate I think they muddy the waters re RFC-3. I would rather take a smaller step there (stick with "type": "space" for now) and mess with the type keyword (and the overall structure of axes!) in subsequent RFCs.

@d-v-b (Contributor, Author) commented May 28, 2024

In a variety of scenarios, not least of which is acquisition, having a contiguous array of data is advantageous, sometimes for performance, sometimes for programmatic convenience, and sometimes both. I'm certainly not super-excited about having to do array unstacking/stacking just to save/read an RGB image. So I'm pro-option-1.

If you want your data to be contiguous, then a chunked format like zarr might not be the right substrate 😉

Snark aside, I am talking about a file format, not an in-memory representation / API. I think you are conflating the two. Just because two arrays are in separate OME-NGFF zarr groups doesn't mean users (or a nice library) can't concatenate them in memory.
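For instance (the paths and group layout below are invented for illustration), a client can stack two separately stored images in memory with a few lines:

import numpy as np
import zarr

root = zarr.open_group("acquisition.zarr", mode="r")      # hypothetical store
tiles = [root[f"position_{i}/0"][:] for i in range(2)]    # read each image's full-resolution array
stacked = np.stack(tiles, axis=0)                          # in-memory axes: [s, y, x]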

So if we just focus on the file format side, I really think it would be helpful here if you could articulate how your proposed use case is fundamentally different from the HCS use case, or the broader conversation about image collections, because it really looks like the same problem to me.
