Description
In many applications involving zarr groups, the complete structure of the group is known in advance: the user knows the names, attributes, and properties of all the subgroups and sub-arrays. This means a declarative approach to creating "complete" zarr groups would be much more ergonomic than an imperative approach based on repeated calls to create_group and create_array.
A dict like this:
{
'': GroupMetadata(attributes={'is_root': True}),
'/b': GroupMetadata(attributes={'is_root': False}),
'/b/c': ArrayV3Metadata(attributes={'is_root': False}, shape=(10,10), ...),
}
gives a complete specification of a zarr group. As an alternative to constructing this group imperatively, we could use a declarative approach and have a single function that takes a dict like the one above and creates a zarr group from it, e.g.
root_group = bikeshed_group_from_dict(store, group_dict) # creates all the members of the group concurrently
root_group.members(depth=None) # (('b', Group), ('b/c', Array))
This would cut down on the boilerplate required to construct zarr groups, and make effective use of our concurrent APIs when creating large hierarchies. At a minimum, this expressive API would make a lot of our tests shorter.
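To make the idea concrete, here is a minimal sketch of how a flat dict could be validated and then written concurrently. It does not use zarr: GroupSpec, ArraySpec, validate_flat, write_node and create_hierarchy are hypothetical stand-ins invented for illustration, and a plain dict plays the role of the store. The actual implementation may look quite different.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class GroupSpec:
    """Hypothetical stand-in for GroupMetadata."""
    attributes: dict = field(default_factory=dict)


@dataclass
class ArraySpec:
    """Hypothetical stand-in for ArrayV3Metadata."""
    shape: tuple
    attributes: dict = field(default_factory=dict)


def parent_of(path: str) -> str:
    """Return the parent key of a non-root path ('/b/c' -> '/b', '/b' -> '')."""
    return path.rsplit("/", 1)[0]


def validate_flat(flat: dict) -> None:
    """Check that '' is a group and that every other node has a group as parent."""
    if not isinstance(flat.get(""), GroupSpec):
        raise ValueError("the root key '' must map to a group")
    for path in flat:
        if path == "":
            continue
        if not path.startswith("/"):
            raise ValueError(f"{path!r} must start with '/'")
        parent = parent_of(path)
        if not isinstance(flat.get(parent), GroupSpec):
            raise ValueError(f"{path!r} has no parent group at {parent!r}")


async def write_node(store: dict, path: str, spec) -> None:
    """Pretend to write one metadata document (an assumption: real I/O goes here)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real store write would
    store[path] = spec


async def create_hierarchy(store: dict, flat: dict) -> dict:
    """Validate the whole hierarchy up front, then write all nodes concurrently."""
    validate_flat(flat)
    await asyncio.gather(*(write_node(store, p, s) for p, s in flat.items()))
    return store


group_dict = {
    "": GroupSpec(attributes={"is_root": True}),
    "/b": GroupSpec(attributes={"is_root": False}),
    "/b/c": ArraySpec(shape=(10, 10), attributes={"is_root": False}),
}
store = asyncio.run(create_hierarchy({}, group_dict))
print(sorted(store))  # ['', '/b', '/b/c']
```

Because the whole hierarchy is known before anything is written, structural errors (for example, an array whose parent is missing) can be rejected before any I/O happens, and the writes themselves no longer need to be ordered parent-first.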
Curious to hear people's thoughts. I'm working on an implementation of this over in #2665. My current plan is to have a from_flat function in group.py that creates a group from a flat hierarchy representation. I might also consider adding a Group.from_flat classmethod, in case we ever want to build out support for subclassing groups to express "typed groups" (e.g., a group that only contains arrays).
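As a rough illustration of why a classmethod could help with "typed groups", the sketch below uses a toy Group class, not zarr's. Node, validate and ArrayOnlyGroup are hypothetical names, and the validation hook is an assumption about one possible design rather than anything planned in the PR.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node description: kind is 'group' or 'array'."""
    kind: str


class Group:
    """Toy stand-in for a group class; holds its flat hierarchy in memory."""

    def __init__(self, flat: dict):
        self.flat = flat

    @classmethod
    def validate(cls, flat: dict) -> None:
        """Base groups accept any members; subclasses may restrict them."""

    @classmethod
    def from_flat(cls, flat: dict) -> "Group":
        # Using cls (not Group) means subclasses get their own type back
        # and their own validation applied.
        cls.validate(flat)
        return cls(dict(flat))


class ArrayOnlyGroup(Group):
    """A 'typed group' whose non-root members must all be arrays."""

    @classmethod
    def validate(cls, flat: dict) -> None:
        for path, node in flat.items():
            if path != "" and node.kind != "array":
                raise TypeError(f"{path!r} is a {node.kind}, expected an array")


typed = ArrayOnlyGroup.from_flat({"": Node("group"), "/a": Node("array")})
print(type(typed).__name__)  # ArrayOnlyGroup
```

A module-level from_flat function would always return the base type, whereas the classmethod form lets each subclass enforce its own constraints at construction time.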