Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sliceable MetaData Class #455

Open
wants to merge 67 commits into
base: main
Choose a base branch
from
Open

Sliceable MetaData Class #455

wants to merge 67 commits into from

Conversation

DanRyanIrish
Copy link
Member

@DanRyanIrish DanRyanIrish commented Aug 14, 2021

Description

This PR introduces a sliceable Meta class based on dict, which allows metadata to be associated with data axes and the whole class to be sliced using the normal Python slicing API. Metadata not associated with any axes is not touched by the slicing mechanism.

This PR introduces two types of axis-aware metadata: axis-aligned and grid-aligned. Axis-aligned metadata gives one value for each of its associated axes, and is scalar or a string if only associated with one axis. This type of metadata is only changed if the axis/axes within which it is associated is dropped by slicing. In this case, the values corresponding to the dropped axes are also dropped.

Grid-aligned metadata provides a value for each data array element in its associated axis/axes. This metadata cannot be scalar. Since this metadata mirrors at least part of the data array, it is sliced as if it were data, and its shape is kept consistent with its associated data axes. If all its axes are sliced away, the scalar value at the row/column location at which it was sliced is kept, and its axis-awareness is dropped, i.e. it becomes a piece of normal non-axis-aware scalar metadata.

API Summary

>>> import astropy.units as u
>>> from ndcube.meta import Meta

# Define a Meta object associated with data of shape (3, 4, 5)
>>> header = {"salutation": "hello",
              "name": "world",
              "exposure_time": ([2] * 3) * u.s,  # 1-dimension grid-aligned (pixel-wise) metadata
              "instrument_axes": ["slit step", "slit", "dispersion"],  # axis-aligned (scalar/string-per-axis) metadata
              "pixel_efficiency": np.random.ones(4, 5)  # multi-dimension grid-aligned (pixel-wise) metadata}
>>> comments = {"salutation": "This is polite."}
>>> axes = {"exposure_time": 0,
              "instrument_axes": (0, 1, 2),
              "pixel_efficiency": (1, 2)}
>>> meta = Meta(header, axes=axes, comments=comments, data_shape=(3, 4, 5))

# Use slice item that you would apply to the cube on the Meta object.
>>> sliced_meta = meta[0:2, 2]  # Reduce length of 0th axis and drop 1st axis.
>>> sliced_meta.shape  # Shape has been updated.
(2, 5)
>>> sliced_meta["exposure_time"]  # Exposure time now length of new 0th axis
<Quantity [2., 2.] s >
>>> sliced_meta["instrument_axes"]  # instrument axes has dropped slit entry
["slit step", "dispersion"]
>>> sliced_meta.axes["instrument_axes"]
(0, 1)
>>> sliced_meta["pixel_efficiency"]  # pixel efficiency is now 1-D of length 5
array([1., 1., 1., 1., 1.])
>>> sliced_meta.axes["pixel_efficiency"]
(1,)

# Rebin the Meta object
>>> rebinned_meta = meta.rebin((1, 2, 1))
>>> rebinned_meta["exposure_time"]  # grid-aligned metadata associated with non-rebinned axes remains unchanged...
<Quantity [2., 2., 2.] s >
>>> rebinned_meta.axes["exposure_time"]  # ...and still associated with its axis/axes
(0,)
>>> rebinned_meta["pixel_efficiency"]  # grid-aligned metadata associated with rebinned axes remains...
np.random.ones(4, 5)
>>> rebinned_meta.axes["pixel_efficiency"]  # ...but grid-awareness is removed
None
>>> rebinned_meta["instrument_axes"]  # axis-aligned (scalar/string-per-axis) metadata unchanged...
["slit step", "slit", "dispersion"]
>>> rebinned_meta.axes["instrument_axes"]  # ...and still associated with axes as they haven't been dropped
(0, 1, 2)

To Do

  • Implement __delitem__ and remove remove method.
  • Implement setter for NDMeta.data_shape which verifies that it's compatible with pre-existing axis- and grid-aligned metadata.
  • Improve narrative docs

Refactor involved moving comments and axes out of the core values and
putting them into their own dictionaries.  This removed cascading
consequences for self.values(), self.__setitem__, etc.
@pep8speaks
Copy link

pep8speaks commented Aug 14, 2021

Hello @DanRyanIrish! Thanks for updating this PR.

Line 71:70: W504 line break after binary operator

Line 136:38: E721 do not compare types, use 'isinstance()'
Line 118:67: W504 line break after binary operator

Comment last updated at 2021-11-18 14:32:19 UTC

Also put in overwrite error check to Meta.__setitem__ if axis is not
None.  This way the value shape and axes cannot easily be corrupted
accidentally.
@DanRyanIrish DanRyanIrish added this to the 2.1 milestone Aug 14, 2021
Only store entries in the comments and axes dicts that aren't None.
This saves a lot of space and requires the get method be used when
looking for comment and axes values to avoid errors is that key doesn't
have a comment or axis.
@DanRyanIrish DanRyanIrish added discussion needed Potential changes that merit discussion and may or may not be implemented and removed discussion needed Potential changes that merit discussion and may or may not be implemented labels Aug 23, 2021
@DanRyanIrish
Copy link
Member Author

@nabobalis Could you give this PR a review? It should be very similar if not the same as what's in your sunraster PR.

@DanRyanIrish
Copy link
Member Author

@Cadair this PR is ready for review. To pre-empt possible concerns regarding DKIST tools, the introduction of this new class does not require that metadata be sliceable. Instead it allows for it if desired.

In order for the Meta object to be sliceable, Meta.shape must be set via the data_shape kwarg during intialization. This must be set to the shape of the data array with which the metadata object is associated. If Meta.shape is None, the object is not sliceable.

Not all entries in the Meta object have to be sliceable. Only those who have a corresponding entry in the Meta.axes dict will be. This gives the axes with which the metadata is associated and is therefore used to work out how to slice it. If Meta.shape is set, but there are no entries in Meta.axes, then slicing has no effect on the Meta object.

Therefore, as far as I understand, this new class should not threaten your assumption in the DKIST tools that .meta never changes.

ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
nabobalis
nabobalis previously approved these changes Nov 4, 2021
@DanRyanIrish
Copy link
Member Author

@Cadair, this is ready to merge as far as I can see. Do you still want more time to look at it given your DKIST concerns? As I above, I don't think this causes an issues as the new functionalities are optional and don't have to change previous behaviour.

@DanRyanIrish
Copy link
Member Author

@Cadair, as requested, an explainer on the API/behaviour has been added to the PR description.

ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated
"""
def __init__(self, header=None, comments=None, axes=None, data_shape=None):
self.__ndcube_can_slice__ = True
self.original_header = header
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favour of meta rather than header

ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
DanRyanIrish and others added 3 commits July 2, 2024 11:19
Co-authored-by: Stuart Mumford <[email protected]>
Also fix bugs introduced by changin header to meta as NDMeta's
internal name for the metadata.
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/ndcube.py Outdated Show resolved Hide resolved
Copy link
Member

@Cadair Cadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs some narrative docs to teach people how and why they should use it.

ndcube/conftest.py Outdated Show resolved Hide resolved
@DanRyanIrish
Copy link
Member Author

@Cadair, except for narrative docs, I think this PR now addresses all but one minor comment of yours. Unless you'd like to look at it again, I will merge it once I've added narrative docs.

Copy link
Member

@Cadair Cadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, a few more comments. I am happy to give this a final review when you are ready.

ndcube/meta.py Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
ndcube/meta.py Outdated Show resolved Hide resolved
Comment on lines +362 to +380
new_shape = new_meta.data_shape
for i, axis_item in enumerate(item):
if isinstance(axis_item, numbers.Integral):
dropped_axes[i] = True
elif isinstance(axis_item, slice):
start = axis_item.start
if start is None:
start = 0
if start < 0:
start = data_shape[i] - start
stop = axis_item.stop
if stop is None:
stop = data_shape[i]
if stop < 0:
stop = data_shape[i] - stop
new_shape[i] = stop - start
else:
raise TypeError("Unrecognized slice type. "
"Must be an int, slice and tuple of the same.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might basically be able to replace all of this (other than the calculations of dropped_axes with this:

new_shape = np.broadcast_to(1, new_meta.data_shape)[item].shape

which is equivalent to np.zeros(shape)[view].shape but without allocating the whole array.

Thanks to @astrofrog for this one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in:

Suggested change
new_shape = new_meta.data_shape
for i, axis_item in enumerate(item):
if isinstance(axis_item, numbers.Integral):
dropped_axes[i] = True
elif isinstance(axis_item, slice):
start = axis_item.start
if start is None:
start = 0
if start < 0:
start = data_shape[i] - start
stop = axis_item.stop
if stop is None:
stop = data_shape[i]
if stop < 0:
stop = data_shape[i] - stop
new_shape[i] = stop - start
else:
raise TypeError("Unrecognized slice type. "
"Must be an int, slice and tuple of the same.")
new_shape = np.broadcast_to(1, new_meta.data_shape)[item].shape
dropped_axes = np.array([True if isinstance(axis_item, numbers.Integral) else False for axis_item in item], dtype=bool)

ndcube/meta.py Outdated Show resolved Hide resolved
Comment on lines +417 to +422
# If value cannot be sliced by fancy slicing, convert it
# it to an array, slice it, and then if necessary, convert
# it back to its original type.
new_value = (np.asanyarray(value)[new_item])
if hasattr(new_value, "__len__"):
new_value = type(value)(new_value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we say we were going to nuke this bit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment about this somewhere after our talk. I realised that without this, NDMeta could not support dropping axes as part of slicing if it contained lists or tuples as axis-aligned metadata. So as ugly as it is, I think it's better to keep it.

ndcube/ndcube.py Outdated Show resolved Hide resolved
Comment on lines +403 to +405
# If meta is axis-aware, make it to have same shape as cube.
if isinstance(self.meta, NDMetaABC):
self.meta.data_shape = self.shape
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in the setter for .meta? Do we have a setter for .meta?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't. We outsource that to NDData.

@DanRyanIrish
Copy link
Member Author

Thanks for the review @Cadair. I've addressed your comments with a couple exceptions where I ask for further clarifications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

6 participants