Enhance axes in 0.5.0 #290

Closed
FynnBe opened this issue Nov 26, 2021 · 7 comments

@FynnBe
Member

FynnBe commented Nov 26, 2021

Finalizing the axes for 0.5.0 (originally planned for 0.4.0):

# example with bug table 
outputs:
 - name: bug_table
   shape: [1, 5, null, 2]
   # test_tensor: <uri>  # idea for next version: move this mandatory test field here (instead of `test_inputs/outputs`)!
   axes:  # now called 'axes'
       - role: batch  # one of ["batch", "index", "time", "channel", "z", "y", "x"]
         # all other fields are invalid for "batch"
       - role: time
         label: "space time" # but may not be a list!  (same for zyx and index)
         step: 10  # optional; default 1
         unit: ms  # optional; default: null; free text, but recommended standard from documented list, e.g. https://www.maplesoft.com/support/help/Maple/view.aspx?path=Units/SI
       - role: x
         step: 200  # default: 1
         unit: nm  # 
       - role: index
         label: bug
         # step: <here invalid>  # forbidden for role "index"
         # unit: <here invalid>  # forbidden for role "index"
       - role: channel
         description: "features for each detected bug, including size and age"  # optional; char limit: 128
         # type: "nominal"  # mandatory non-default (guess for old models: "nominal")  => discuss addition of 'types' separately...
         label: ["size", "age"]  # optional; if list, then needs to match shape (and unit if specified); 
         unit: ["meter", "year"]  # optional; if list, then needs to match shape (and label if specified); we have really big and old bugs!
         # step: <here invalid>  # must not be specified if label or unit is a list
 - name: bug_table2
      ... 

Originally posted by @FynnBe in #70 (comment)
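
The rules stated in the comments of the example above could be checked roughly as follows. This is only a minimal sketch: `validate_axis` and its error messages are hypothetical names made up for illustration, not part of the spec or of any existing library, and it covers only the constraints spelled out in the example (it assumes an axis entry is a plain dict and that the axis size, if known, is passed in separately).

```python
from typing import Any, Dict, List, Optional

# Roles listed in the example above.
ROLES = ("batch", "index", "time", "channel", "z", "y", "x")


def validate_axis(axis: Dict[str, Any], size: Optional[int] = None) -> List[str]:
    """Return a list of rule violations for one axis entry (empty if valid).

    Hypothetical helper; `size` is the length of this axis if it is known.
    """
    errors: List[str] = []
    role = axis.get("role")
    if role not in ROLES:
        return [f"unknown role: {role!r}"]

    if role == "batch":
        # All other fields are invalid for "batch".
        extra = sorted(set(axis) - {"role"})
        if extra:
            errors.append(f"fields {extra} are invalid for role 'batch'")
        return errors

    label, unit = axis.get("label"), axis.get("unit")

    if role == "index":
        # step and unit are forbidden for role "index".
        for key in ("step", "unit"):
            if key in axis:
                errors.append(f"'{key}' is forbidden for role 'index'")

    if role != "channel" and isinstance(label, list):
        # Only the channel axis may carry a list of labels.
        errors.append(f"'label' may not be a list for role {role!r}")

    if role == "channel":
        # List-valued label/unit need to match the axis size.
        for key, value in (("label", label), ("unit", unit)):
            if isinstance(value, list) and size is not None and len(value) != size:
                errors.append(f"'{key}' has {len(value)} entries but the axis has size {size}")
        if "step" in axis and (isinstance(label, list) or isinstance(unit, list)):
            errors.append("'step' must not be specified if 'label' or 'unit' is a list")

    return errors
```

For instance, the channel axis of the example (two labels, two units, size 2) passes, while adding a `step` to it, or a `unit` to the index axis, yields one violation each.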

@FynnBe
Member Author

FynnBe commented Feb 14, 2023

For workflows I implemented this slightly adapted to comply with the NGFF draft (https://ngff.openmicroscopy.org/latest/#axes-md) as @constantinpape suggested:
#478

class Axis:
    type: AxisType = missing
    name: Union[_Missing, str, List[str]] = missing
    description: Union[_Missing, str] = missing
    unit: Union[_Missing, SpaceUnit, TimeUnit, str, List[str]] = missing
    step: Union[_Missing, int] = missing

Of course it would make a lot of sense if we can use the same axes/input definition for models and workflows

@FynnBe
Member Author

FynnBe commented Jun 5, 2023

After recent discussion with @constantinpape it became clear that we also want to support ragged tensors (at least when an 'index' dimension is present).

There is some support in PyTorch for this: torch.nested
And in TensorFlow: ragged tensors

However, xarray/numpy do not support ragged arrays. (see pydata/xarray#1482)
I therefore propose the following:

  • every axis may specify a size (see Settle axes discussion for now #70 (comment)), analogous to the current tensor.shape, either as a fixed integer or as min and step. (I hope this step is not confused with the axis.step that corresponds to axis.unit.)
  • add name for referencing and to comply with the NGFF draft (see comment above)
  • 3 options to solve resizing axes jointly: (1) add an optional axis.size.step_with: [<other axis name>, <other axis name>] to reference other axes and have them reference each other; (2) same as (1), but they do not reference each other; instead we imply bidirectionality from a single reference; (3) we define 'resize groups' separately, e.g. resize_groups: [[<axis name>, <axis_name>], [<axis_name>, <axis_name>, <axis_name>]], next to the axis specification. (I prefer option (1) for being explicit.)
  • the default should be to resize all axes that have a variable size jointly: for options (1) and (2) that would imply some "ALL" default value; for option (3) the default depends on the axis definitions (one group with all variable axes).
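
To make the proposal more concrete, here is a rough sketch of a size given as min and step, combined with option (3) and its default of one group containing all variable axes. All names (`ParametrizedSize`, `default_resize_groups`, `resolve_shape`) and the example sizes are hypothetical and only illustrate the idea; nothing here is an agreed-upon format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class ParametrizedSize:
    """Variable axis size: valid values are min + n * step for n = 0, 1, 2, ..."""

    min: int
    step: int

    def at(self, n: int) -> int:
        return self.min + n * self.step


# An axis size is either a fixed integer or described by min and step.
Size = Union[int, ParametrizedSize]


def default_resize_groups(sizes: Dict[str, Size]) -> List[List[str]]:
    """Default for option (3): one group with all variable axes."""
    variable = [name for name, size in sizes.items() if isinstance(size, ParametrizedSize)]
    return [variable] if variable else []


def resolve_shape(
    sizes: Dict[str, Size],
    n_per_group: List[int],
    resize_groups: Optional[List[List[str]]] = None,
) -> List[int]:
    """Compute a concrete shape; axes in the same group share the same n."""
    groups = default_resize_groups(sizes) if resize_groups is None else resize_groups
    n_for_axis = {name: n for group, n in zip(groups, n_per_group) for name in group}
    shape = []
    for name, size in sizes.items():
        if isinstance(size, ParametrizedSize):
            # Variable axes that are in no group stay at their minimum (n = 0).
            shape.append(size.at(n_for_axis.get(name, 0)))
        else:
            shape.append(size)
    return shape


# Hypothetical example: y and x are variable, batch is fixed.
sizes = {"batch": 1, "y": ParametrizedSize(min=64, step=16), "x": ParametrizedSize(min=64, step=16)}
joint = resolve_shape(sizes, n_per_group=[2])  # default: y and x resize together
separate = resolve_shape(sizes, n_per_group=[1, 3], resize_groups=[["y"], ["x"]])
```

With the default group, y and x always grow by the same number of steps; with explicit groups they can grow independently.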

@esgomezm
Contributor

esgomezm commented Jun 5, 2023

Hi @FynnBe

Thanks for the summary. I have two questions. (1) What do you mean by resizing? (2) how do you establish the relationship between the axes role list and the actual shape? For example, how do you know that batch corresponds to the first dimension (value 1), time to the second (value 5), and so on?

@FynnBe
Member Author

FynnBe commented Jun 6, 2023

What do you mean by resizing?

Changing the shape/size for variable shapes/sizes defined by min and step.

how do you establish the relationship between the axes role list and the actual shape?

The axes field holds a list. To make it easier to map from axes to shape I propose to include size (length of a single dimension) in the axes list instead of a separate shape list.
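
As a small illustration of that proposal: if every entry of the axes list carries its own size, the shape follows directly from the list order and no separate shape list has to be kept in sync. The `size` key, the helper names, and the values below are assumptions made up for this sketch.

```python
from typing import Any, Dict, List, Optional

# Hypothetical axes list where each entry carries its own size
# (None stands for a size that is not fixed).
axes: List[Dict[str, Any]] = [
    {"role": "batch", "size": 1},
    {"role": "time", "size": 5},
    {"role": "index", "size": None},
    {"role": "channel", "size": 2},
]


def shape_from_axes(axes: List[Dict[str, Any]]) -> List[Optional[int]]:
    """The position in the axes list is the dimension index."""
    return [axis["size"] for axis in axes]


def size_of(axes: List[Dict[str, Any]], role: str) -> Optional[int]:
    """Look up the size of the first axis with the given role."""
    for axis in axes:
        if axis["role"] == role:
            return axis["size"]
    raise KeyError(role)


shape = shape_from_axes(axes)
```

Here `shape` has one entry per axis by construction, so an axes list and a shape of different lengths cannot occur.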

@Tomaz-Vieira
Contributor

Tomaz-Vieira commented Jun 6, 2023

From the discussion in the Ai4Life hackathon, it seems it would be very useful to have each output channel labeled with its precise semantics instead of the arbitrary strings in "labels" (e.g. 'size' and 'age' in the sample code above).

As it stands, a consumer of models (e.g. ilastik, knime, apeer) can't really tell which models are compatible with the task that the user is trying to accomplish; even if a model has outputs whose shape and data type match the expectations of the application, there is no guarantee that the semantics of the output match the expectations of the consuming app. For example, if the application is expecting a model that produces a probability map, where each channel is the probability of a pixel being in one particular class, it could be fooled into thinking that a model that produces an RGB image is actually producing a probability map with 3 classes. Alternatively, an application that can execute arbitrary models might not know how to render the outputs if the semantics are not clear.

Therefore, it seems we need a way to precisely specify the semantics of each channel of the output tensor, which would represent something equivalent to the following, with more types added as needed to the ChannelSemantics Union.

from typing import Union


class InstanceSegmentationInstanceMask:
    pass


class InstanceSegmentationObjectId:
    """A channel labeled with this represents an object ID as a uint32 (as opposed to e.g. an integer quantity)."""


class ProbabilityOfBelongingToAClassAddingUpToOne:
    classification_class_name: str  # e.g. "foreground", "background", "cat", "dog"
    group_name: str  # other channels with the same group should add up to one


class ProbabilityOfBelongingToOneOfPotentiallyMultipleClasses:
    classification_class_name: str
    group_name: str  # pixels in this group can belong to multiple classes and add up to more than 1.0


class ProbabilityOfBeingACenter:
    pass


# Defined after the classes so that all names are resolvable.
ChannelSemantics = Union[
    InstanceSegmentationInstanceMask,
    InstanceSegmentationObjectId,
    ProbabilityOfBelongingToAClassAddingUpToOne,
    ProbabilityOfBelongingToOneOfPotentiallyMultipleClasses,
    ProbabilityOfBeingACenter,
]
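
A consuming application could then check the declared semantics instead of guessing from shape and data type. The following is only a sketch under assumptions: it redefines two of the types above as dataclasses so that it runs on its own, and `is_exclusive_probability_map` is a hypothetical helper, not an existing API.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class ProbabilityOfBelongingToAClassAddingUpToOne:
    classification_class_name: str
    group_name: str  # channels with the same group should add up to one


@dataclass(frozen=True)
class InstanceSegmentationObjectId:
    """Channel holding object IDs rather than probabilities."""


# Reduced union, just for this sketch.
ChannelSemantics = Union[
    ProbabilityOfBelongingToAClassAddingUpToOne,
    InstanceSegmentationObjectId,
]


def is_exclusive_probability_map(channels: Sequence[ChannelSemantics]) -> bool:
    """True if all channels are class probabilities of one common group."""
    if not channels:
        return False
    if not all(isinstance(c, ProbabilityOfBelongingToAClassAddingUpToOne) for c in channels):
        return False
    # All channels must belong to the same group to add up to one together.
    return len({c.group_name for c in channels}) == 1


# Hypothetical model outputs.
pixel_classifier = [
    ProbabilityOfBelongingToAClassAddingUpToOne("foreground", group_name="fg_bg"),
    ProbabilityOfBelongingToAClassAddingUpToOne("background", group_name="fg_bg"),
]
instance_model = [InstanceSegmentationObjectId()]
```

An application expecting a probability map would accept `pixel_classifier` and reject `instance_model`, regardless of whether their shapes and data types happen to match.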

FynnBe added this to the Release 0.5 milestone Jun 9, 2023
@FynnBe
Member Author

FynnBe commented Jun 9, 2023

#12 relates to adding semantic labels

@FynnBe
Member Author

FynnBe commented Jul 19, 2023

3 options to solve resizing axes jointly: (1) add an optional axis.size.step_with: [<other axis name>, <other axis name>] to reference other axes and have them reference each other; (2) same as (1), but they do not reference each other; instead we imply bidirectionality from a single reference; (3) we define 'resize groups' separately, e.g. resize_groups: [[<axis name>, <axis_name>], [<axis_name>, <axis_name>, <axis_name>]], next to the axis specification. (I prefer option (1) for being explicit.)

Looking at this again, I thought of another (better :-)) option 4: it is like option 2, but simpler: axis.size.step_with: <one other axis name>. With implied bidirectionality, groups can form from each axis referencing only a single other axis. I think this is conceptually easier to understand, as one thinks of a "main" axis that another axis steps with when resizing.
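
One way to picture how groups form under option 4 is a small union-find over the single references. This is a sketch only: `resize_groups` and the mapping from axis name to its `step_with` target are hypothetical, chosen just to show that one reference per axis is enough to build groups of any size.

```python
from typing import Dict, List, Optional


def resize_groups(step_with: Dict[str, Optional[str]]) -> List[List[str]]:
    """Group axes that resize jointly.

    `step_with` maps each axis name to the one other axis it steps with
    (or None). References are treated as bidirectional and transitive.
    """
    parent = {name: name for name in step_with}

    def find(name: str) -> str:
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for axis, other in step_with.items():
        if other is None:
            continue
        if other not in parent:
            raise ValueError(f"axis {axis!r} references unknown axis {other!r}")
        # Merge the two groups; the direction of the reference does not matter.
        parent[find(axis)] = find(other)

    groups: Dict[str, List[str]] = {}
    for name in step_with:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical example: y steps with x, z steps with y, t is independent.
groups = resize_groups({"t": None, "x": None, "y": "x", "z": "y"})
```

In this example x, y and z end up in one group even though no axis lists more than one other axis, and t forms a group of its own.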

FynnBe closed this as completed Mar 19, 2024

3 participants