First version of mask metadata #55

Merged
merged 8 commits into from
Jul 17, 2020
Changes from 5 commits
98 changes: 82 additions & 16 deletions spec.md
@@ -5,22 +5,57 @@ scripts provided by this repository will support one or more versions of this
file, but they should all be considered internal investigations, not intended
for public re-use.

## Basic layout
## On-disk (or in-cloud) layout

```

.                             # Root folder, potentially in S3,
├── 123.zarr                  # with a flat list of images by image ID.
└── 456.zarr                  #
    ├── .zattrs               # Group level metadata.
    ├── .zgroup               # Each image is a Zarr group with multiscale metadata.
    └── 0                     # Each multiscale level is stored as a separate Zarr array.
        ├── .zarray           #
        ├── 0.0.0.0.0         # Chunks are stored with the flat directory layout.
        └── t.c.z.y.x         # All image arrays are 5-dimensional
                              # with dimension order (t, c, z, y, x).
│                             # with a flat list of images by image ID.
├── 123.zarr                  # One image (id=123) converted to Zarr.
└── 456.zarr                  # Another image (id=456) converted to Zarr.
    ├── .zgroup               # Each image is a Zarr group, or a folder, of other groups and arrays.
    ├── .zattrs               # Group level attributes are stored in the .zattrs file and include
    │                         # "multiscales" and "omero" (see below).
    ├── 0                     # Each multiscale level is stored as a separate Zarr array,
    │   ...                   # which is a folder containing chunk files which compose the array.
    ├── n                     # The name of the array is arbitrary with the ordering defined
    │   │                     # by the "multiscales" metadata, but is often a sequence starting at 0.
    │   │
    │   ├── .zarray           # All image arrays are 5-dimensional
    │   │                     # with dimension order (t, c, z, y, x).
    │   │
    │   ├── 0.0.0.0.0         # Chunks are stored with the flat directory layout.
    │   │   ...               # Each dotted component of the chunk file represents
    │   └── t.c.z.y.x         # a "chunk coordinate", where the maximum coordinate
    │                         # will be `dimension_size / chunk_size`.
    └── masks
        ├── .zgroup           # The masks group is a container which holds a list
        ├── .zattrs           # of masks to make the objects easily discoverable.
        │                     # All masks will be listed in `.zattrs`, e.g. `{ "masks": [ "original/0" ] }`
        └── original          # Intermediate folders are permitted but not necessary
            │                 # and currently contain no extra metadata.
            └── 0
                ├── .zgroup   # Each mask itself is also a multiscale image with the extra key
                ├── .zattrs   # "color".
                └── t.c.z.y.x # Chunks as above.


```
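
The chunk naming described in the layout can be sketched in a few lines of Python. This is only an illustration: the `shape` and `chunks` values are made up, and the helper names `chunk_grid` and `chunk_keys` are hypothetical. It assumes zero-based chunk coordinates, so the largest index along an axis is `ceil(dimension_size / chunk_size) - 1`.

```python
import math
from itertools import product


def chunk_grid(shape, chunks):
    """Number of chunks along each axis (the last chunk may be partial)."""
    return tuple(math.ceil(size / chunk) for size, chunk in zip(shape, chunks))


def chunk_keys(shape, chunks):
    """Yield flat-layout chunk file names such as "0.0.0.0.0" in (t, c, z, y, x) order."""
    grid = chunk_grid(shape, chunks)
    for coord in product(*(range(n) for n in grid)):
        yield ".".join(str(i) for i in coord)


# Hypothetical 5D image: 1 timepoint, 2 channels, 1 z-section, 512 x 512 pixels.
shape = (1, 2, 1, 512, 512)
chunks = (1, 1, 1, 256, 256)
keys = list(chunk_keys(shape, chunks))
print(keys[0], keys[-1], len(keys))  # 0.0.0.0.0 0.1.0.1.1 8
```

Each key is the name of one chunk file inside the array folder, next to `.zarray`.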

## "multiscales" metadata
## Metadata

The various `.zattrs` files throughout the above array hierarchy may contain metadata
keys as specified below for discovering certain types of data, especially images.

### "multiscales" metadata

Metadata about the multiple resolution representations of the image can be
found under the "multiscales" key in the group-level metadata.
@@ -43,7 +78,7 @@ if not datasets:
The subresolutions in each multiscale are ordered from highest-resolution
to lowest.
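
As a rough sketch of how a reader might use this ordering: the snippet below picks the full-resolution and smallest arrays from the group attributes. The JSON shape used here (a list of multiscale entries, each with a `datasets` list of `path` objects) is an assumption, since the full example is collapsed in this diff, and `resolution_paths` is a hypothetical helper.

```python
import json

# Assumed shape of the group-level .zattrs; the full example is not shown in this diff.
zattrs = json.loads("""
{
  "multiscales": [
    {"datasets": [{"path": "0"}, {"path": "1"}, {"path": "2"}]}
  ]
}
""")


def resolution_paths(attrs):
    """Return array paths ordered from highest to lowest resolution."""
    multiscales = attrs.get("multiscales", [])
    if not multiscales:
        return []
    datasets = multiscales[0].get("datasets", [])
    return [d["path"] for d in datasets]


paths = resolution_paths(zattrs)
full_res, smallest = paths[0], paths[-1]
print(full_res, smallest)  # 0 2
```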

## "omero" metadata
### "omero" metadata

Information specific to the channels of an image and how to render it
can be found under the "omero" key in the group-level metadata:
@@ -74,13 +109,44 @@ can be found under the "omero" key in the group-level metadata:
}
```


See https://docs.openmicroscopy.org/omero/5.6.1/developers/Web/WebGateway.html#imgdata
for more information.

### "masks"

The special group "masks", found under an image Zarr, has a `masks` key in its `.zattrs`
listing the paths to the mask objects which can be found underneath the group:

```
{
"masks": [
"orphaned/0"
]
}
```

Unlisted groups MAY be masks.
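
A minimal sketch of mask discovery, assuming the image is stored in a local directory (not S3) and that the listed paths are relative to the `masks` group. The helper `listed_masks` is hypothetical, and the temporary layout only exists to exercise it.

```python
import json
import tempfile
from pathlib import Path


def listed_masks(image_dir):
    """Return the mask group directories named in <image>/masks/.zattrs."""
    masks_dir = Path(image_dir) / "masks"
    attrs_file = masks_dir / ".zattrs"
    if not attrs_file.exists():
        return []
    attrs = json.loads(attrs_file.read_text())
    return [masks_dir / rel for rel in attrs.get("masks", [])]


# Build a throwaway layout to exercise the function.
root = Path(tempfile.mkdtemp())
image = root / "456.zarr"
(image / "masks" / "orphaned" / "0").mkdir(parents=True)
(image / "masks" / ".zattrs").write_text(json.dumps({"masks": ["orphaned/0"]}))

found = listed_masks(image)
print([p.relative_to(image).as_posix() for p in found])  # ['masks/orphaned/0']
```

Because unlisted groups may also be masks, a reader that needs every mask would have to walk the `masks` folder as well. This sketch only returns the listed ones.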

### "color"

The `color` key defines an image that is "labeled", i.e. every unique value in the image
Member

We also export color with `--style=split`. The mask only has 1 value (not labelled) but it does have a color.

Member Author

Hmmm.... reading it, I'm not sure just having only one label makes something not labeled. The example also only shows one value ;)

Member

Hmm, so both --style=labelled and --style=split produce masks that are labelled. Maybe this is a bit less confusing if we ignore the --style=labelled option (since that is the default). If we also consider removing the non-compliant 6d option then all masks are "labelled". Are any masks NOT "labelled"? So I think you can remove that whole sentence "The color key defines an image that is labeled, i.e. every unique value in the image represents a unique, non-overlapping object within the image." since even without any 'color', that statement is still true.

represents a unique, non-overlapping object within the image. The value associated with
the `color` key is another JSON object in which the key is the pixel value of the image and
the value is an RGBA color (4 byte, `0-255` per channel) for representing the object:

```
{
"color": {
"1": 8388736,
...
```
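
To illustrate how such a value could be read, here is a small sketch that unpacks the integer into four channels. The byte order (R in the most significant byte, A in the least) is an assumption and is not stated in the text above; `unpack_rgba` and `pack_rgba` are hypothetical helpers.

```python
def unpack_rgba(value):
    """Split a 32-bit integer into (r, g, b, a), assuming R is the highest byte."""
    value &= 0xFFFFFFFF  # also accept values stored as signed 32-bit integers
    return (
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )


def pack_rgba(r, g, b, a):
    """Inverse of unpack_rgba under the same byte-order assumption."""
    return (r << 24) | (g << 16) | (b << 8) | a


# The "color" object maps pixel values (as JSON string keys) to packed colors.
color = {"1": 8388736}
color_table = {int(label): unpack_rgba(packed) for label, packed in color.items()}
print(color_table)  # {1: (0, 128, 0, 128)}
```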

Member

@manics manics Jul 7, 2020

Suggested change
### "image"
The `image` key is an optional dictionary which contains information on the image the mask is associated with.
If included, it must include a key `array` whose value is either:
- A relative path to a Zarr image array, for example:
```
{
"image": {
"array": "../../0"
}
}
```
- A URL to a Zarr image array (use this if the mask is stored separately from the image Zarr), for example:
```
{
"image": {
"array": "https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/0"
}
}
```

Member

See also open discussion at ome/omero-cli-zarr#19 (comment) about the specification of the image key




| Revision | Date | Description |
| ---------- | ------------ | ------------------------------------------ |
| - | 2020-05-07 | Add description of "omero" metadata |
| - | 2020-05-06 | Add info on the ordering of resolutions |
| 0.1 | 2020-04-20 | First version for internal demo |
| 0.1.3 | 2020-07-07 | Add mask metadata |
| 0.1.2 | 2020-05-07 | Add description of "omero" metadata |
| 0.1.1 | 2020-05-06 | Add info on the ordering of resolutions |
| 0.1.0 | 2020-04-20 | First version for internal demo |