RFC-6: Multiscale #285

184 changes: 184 additions & 0 deletions rfc/6/index.md
@@ -0,0 +1,184 @@
# RFC-6: Multiscale

Turn the `multiscales` array into a single `multiscale` object.


## Status

This proposal is very early. Status: D1

```{list-table} Record
:widths: 8, 20, 20, 20, 15, 10
:header-rows: 1
:stub-columns: 1

* - Role
  - Name
  - GitHub Handle
  - Institution
  - Date
  - Status
* - Author
  - Norman Rzepka
  - @normanrz
  - scalable minds
  - 2024-12-03
  -
* - Author
  - David Stansby
  - @dstansby
  - University College London
  - 2024-12-03
  -
* - Endorser
  - Davis Bennett
  - @d-v-b
  -
  - 2024-12-12
  -
* - Endorser
  - Will Moore
  - @will-moore
  - OME, University of Dundee
  - 2024-12-12
  -
* - Endorser
  - Lachlan Deakin
  - @LDeakin
  - Australian National University
  - 2024-12-17
  -
* - Endorser
  - Joel Lüthi
  - @jluethi
  - BioVisionCenter, University of Zurich
  - 2024-12-18
  -
```

## Overview

This RFC proposes to change the `multiscales` array into a single `multiscale` object.

## Background

The current spec of OME-Zarr (version 0.5) defines that the metadata for a multiscale is stored in a `multiscales` array.

However, there seem to be very few OME-Zarr images with multiple multiscales in the wild. One example comes from IDR: [4995115.zarr](https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0050A/4995115.zarr)

Additionally, most visualization and processing tools today simply process the first multiscale object in the `multiscales` array, ignoring any subsequent entries. Here is a selection of tools that only load the first multiscale object:

- [neuroglancer](https://github.com/google/neuroglancer/blob/master/src/datasource/zarr/ome.ts#L265-L310)
- [vizarr](https://github.com/hms-dbmi/vizarr/blob/main/src/utils.ts#L88)
- [itk-vtk](https://github.com/Kitware/itk-vtk-viewer/blob/master/src/IO/ZarrMultiscaleSpatialImage.js#L173)
- [OMERO](https://github.com/ome/ZarrReader/issues/44)

The current spec seems to acknowledge that this is to be expected to some degree in the following excerpt:

> If only one multiscale is provided, use it. Otherwise, the user can choose by name, using the first multiscale as a fallback:
>
> ```python
> datasets = []
> for named in multiscales:
>     if named["name"] == "3D":
>         datasets = [x["path"] for x in named["datasets"]]
>         break
> if not datasets:
>     # Use the first by default. Or perhaps choose based on chunk size.
>     datasets = [x["path"] for x in multiscales[0]["datasets"]]
> ```


This RFC aims to codify what already seems to be common practice by moving from a `multiscales` array to a single `multiscale` object. This will reduce complexity for implementations.

## Proposal

The OME-Zarr metadata in a `zarr.json` file of a multiscale MUST contain a single `multiscale` object. This replaces the current `multiscales` array.

Adapted example from the current spec:
```json
{
  "zarr_format": 3,
  "node_type": "group",
  "attributes": {
    "ome": {
      "version": "0.5",
      "multiscale": {
        "name": "example",
        "axes": [
          { "name": "t", "type": "time", "unit": "millisecond" },
          { "name": "c", "type": "channel" },
          { "name": "z", "type": "space", "unit": "micrometer" },
          { "name": "y", "type": "space", "unit": "micrometer" },
          { "name": "x", "type": "space", "unit": "micrometer" }
        ],
        "datasets": [
          {
            "path": "0",
            "coordinateTransformations": [
              {
                // the voxel size for the first scale level (0.5 micrometer)
                "type": "scale",
                "scale": [1.0, 1.0, 0.5, 0.5, 0.5]
              }
            ]
          },
          {
            "path": "1",
            "coordinateTransformations": [
              {
                // the voxel size for the second scale level (downscaled by a factor of 2 -> 1 micrometer)
                "type": "scale",
                "scale": [1.0, 1.0, 1.0, 1.0, 1.0]
              }
            ]
          }
        ],
        "coordinateTransformations": [
          {
            // the time unit (0.1 milliseconds), which is the same for each scale level
            "type": "scale",
            "scale": [0.1, 1.0, 1.0, 1.0, 1.0]
          }
        ]
      }
    }
  }
}
```

For data that needs to have multiple multiscales, it is encouraged to store them in separate OME-Zarr datasets with their own OME-Zarr metadata.
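
As a rough illustration of the reduced complexity, a reader under this proposal no longer needs any selection or fallback logic. The following is a minimal sketch only: the helper name `dataset_paths` and the trimmed-down `zarr_json` dictionary are hypothetical and assume the `zarr.json` content has already been parsed into a Python `dict`.

```python
def dataset_paths(zarr_json: dict) -> list[str]:
    """Return the dataset paths of a multiscale group (hypothetical helper).

    Assumes the proposed layout, where `attributes.ome.multiscale`
    is a single object rather than an array.
    """
    multiscale = zarr_json["attributes"]["ome"]["multiscale"]
    return [dataset["path"] for dataset in multiscale["datasets"]]


# Trimmed-down version of the example above (axes and transformations omitted).
zarr_json = {
    "zarr_format": 3,
    "node_type": "group",
    "attributes": {
        "ome": {
            "version": "0.5",
            "multiscale": {
                "name": "example",
                "datasets": [{"path": "0"}, {"path": "1"}],
            },
        }
    },
}

paths = dataset_paths(zarr_json)
```

Compared with the excerpt from the current spec, there is no loop over named entries and no "use the first by default" branch.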


## Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [IETF RFC 2119](https://tools.ietf.org/html/rfc2119).


## Stakeholders

The main stakeholders for this RFC are OME-Zarr tool developers and existing OME-Zarr image providers. Developers will have to update their implementations to account for the breaking change. Because this change is not backwards compatible, it will require a change to existing OME-Zarr images to make them compatible with this RFC.
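
To give a sense of what updating existing metadata could involve, here is a minimal sketch of a metadata-only conversion. The function name `split_multiscales` is hypothetical and not part of any existing library. It assumes the group attributes have been loaded into a `dict`, and it leaves the `version` field untouched because this RFC does not yet name the version in which the change would be released. Moving the arrays themselves into separate groups is out of scope for the sketch.

```python
import copy


def split_multiscales(attributes: dict) -> list[dict]:
    """Turn attributes with an `ome.multiscales` array into one
    attributes dict per entry, each holding a single `ome.multiscale`.

    Hypothetical helper; the input dict is not modified.
    """
    ome = attributes["ome"]
    # Everything under "ome" except the array is shared by all outputs.
    shared = {key: value for key, value in ome.items() if key != "multiscales"}
    other = {key: value for key, value in attributes.items() if key != "ome"}

    converted = []
    for entry in ome["multiscales"]:
        new_attributes = copy.deepcopy(other)
        new_attributes["ome"] = {
            **copy.deepcopy(shared),
            "multiscale": copy.deepcopy(entry),
        }
        converted.append(new_attributes)
    return converted


old_attributes = {
    "ome": {
        "version": "0.5",
        "multiscales": [
            {"name": "3D", "datasets": [{"path": "0"}]},
            {"name": "2D", "datasets": [{"path": "projection"}]},
        ],
    }
}
new_attributes = split_multiscales(old_attributes)
```

For the common case of an image with exactly one entry in `multiscales`, the result is a single attributes dict that can replace the old one in place.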

### Socialization

* OME-NGFF hackathon Zurich 2024
* [GitHub issue](https://github.com/ome/ngff/issues/205)

## Implementation

Many visualization and processing tools already expect only a single multiscale.
This proposal will reduce complexity for implementations.

Examples of implementations that only work with a single multiscale:
- [neuroglancer](https://github.com/google/neuroglancer/blob/master/src/datasource/zarr/ome.ts#L265-L310)
- [vizarr](https://github.com/hms-dbmi/vizarr/blob/main/src/utils.ts#L88)
- [itk-vtk](https://github.com/Kitware/itk-vtk-viewer/blob/master/src/IO/ZarrMultiscaleSpatialImage.js#L173)
- [OMERO](https://github.com/ome/ZarrReader/issues/44)

## Drawbacks, risks, alternatives, and unknowns

This is a breaking change, with the usual drawbacks: tools that read or write OME-Zarr must be updated, and existing images must be changed to remain compatible.

## Compatibility

This proposal is not backwards compatible and should be released in a new version of OME-Zarr.
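
Tools that want to keep reading older images alongside the new layout could branch on which key is present. The sketch below is one possible approach, not a prescribed one: `get_multiscale` is a hypothetical helper, and falling back to the first array entry simply mirrors what the tools listed above already do.

```python
def get_multiscale(ome: dict) -> dict:
    """Return the multiscale metadata from an `ome` attributes dict.

    Hypothetical helper: prefers the proposed single `multiscale` object
    and falls back to the first entry of a legacy `multiscales` array.
    """
    if "multiscale" in ome:
        return ome["multiscale"]
    legacy = ome.get("multiscales")
    if legacy:
        return legacy[0]
    raise KeyError("no 'multiscale' object or 'multiscales' array found")


new_style = {"multiscale": {"name": "example", "datasets": [{"path": "0"}]}}
old_style = {"multiscales": [{"name": "first", "datasets": [{"path": "0"}]},
                             {"name": "second", "datasets": [{"path": "1"}]}]}

new_name = get_multiscale(new_style)["name"]
old_name = get_multiscale(old_style)["name"]
```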