creating groups from dicts #2685

Open
@d-v-b

In many applications involving zarr groups, the complete structure of the group is known in advance: the user knows the names, attributes, and properties of every subgroup and sub-array. In those cases, a declarative approach to creating "complete" zarr groups would be much more ergonomic than an imperative approach based on repeated calls to create_group and create_array.

A dict like this:

{
    '': GroupMetadata(attributes={'is_root': True}),
    '/b': GroupMetadata(attributes={'is_root': False}),
    '/b/c': ArrayV3Metadata(attributes={'is_root': False}, shape=(10,10), ...),
}

gives a complete specification of a zarr group. Instead of constructing this group imperatively, we could take a declarative approach: a single function that accepts a dict like the one above and creates the zarr group from it, e.g.

root_group = bikeshed_group_from_dict(store, group_dict) # creates all the members of the group concurrently
root_group.members(depth=None) # (('b', Group), ('b/c', Array))

This would cut down on the boilerplate required to construct zarr groups, and make effective use of our concurrent APIs when creating large hierarchies. At a minimum, this expressive API would make a lot of our tests shorter.
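
To make the idea concrete, here is a minimal, stdlib-only sketch of what such a function might do. It is not the implementation in #2665 and does not use zarr's API: `GroupSpec`, `ArraySpec`, `validate_flat`, `write_node`, `group_from_flat`, and `members` are hypothetical names, the two spec classes are stand-ins for `GroupMetadata` and `ArrayV3Metadata`, and a plain dict plays the role of the store. The sketch assumes the flat dict must contain every intermediate group explicitly.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class GroupSpec:
    """Hypothetical stand-in for GroupMetadata."""
    attributes: dict = field(default_factory=dict)


@dataclass
class ArraySpec:
    """Hypothetical stand-in for ArrayV3Metadata."""
    shape: tuple
    attributes: dict = field(default_factory=dict)


def parent_of(path: str) -> str:
    """Return the parent key of a non-root path: '/b/c' -> '/b', '/b' -> ''."""
    return path.rsplit("/", 1)[0]


def validate_flat(flat: dict) -> None:
    """Require a root group and a group parent for every other node."""
    if not isinstance(flat.get(""), GroupSpec):
        raise ValueError("the root key '' must map to a group")
    for path in flat:
        if path == "":
            continue
        if not path.startswith("/") or path.endswith("/"):
            raise ValueError(f"malformed path: {path!r}")
        if not isinstance(flat.get(parent_of(path)), GroupSpec):
            raise ValueError(f"parent of {path!r} is missing or is not a group")


async def write_node(store: dict, path: str, spec) -> None:
    """Record one node; a real store would perform I/O at the await point."""
    await asyncio.sleep(0)
    store[path] = spec


async def group_from_flat(store: dict, flat: dict) -> dict:
    """Validate the whole hierarchy up front, then write all nodes concurrently."""
    validate_flat(flat)
    await asyncio.gather(*(write_node(store, p, s) for p, s in flat.items()))
    return store


def members(store: dict) -> tuple:
    """List (relative path, kind) for every non-root node, sorted by path."""
    return tuple(
        (path.lstrip("/"), type(store[path]).__name__)
        for path in sorted(store)
        if path != ""
    )
```

Because the whole hierarchy is validated before anything is written, the writes no longer depend on each other and can be issued together, which is where the concurrency benefit would come from.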

Curious to hear people's thoughts. I'm working on an implementation of this over in #2665. My current plan is to have a from_flat function in group.py that creates a group from a flat hierarchy representation. I might also consider adding a Group.from_flat classmethod, in case we ever want to build out support for subclassing groups to express "typed groups" (e.g., a group that only contains arrays).
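
As a rough illustration of the "typed groups" idea, the sketch below shows how a `from_flat` classmethod could let a subclass enforce its own constraints. Everything here is hypothetical and independent of zarr: `FlatGroup`, `ArraysOnlyGroup`, the `validate` hook, and the `GroupSpec` / `ArraySpec` stand-ins are names invented for this example, not part of any existing or planned API.

```python
from dataclasses import dataclass, field


@dataclass
class GroupSpec:
    """Hypothetical stand-in for GroupMetadata."""
    attributes: dict = field(default_factory=dict)


@dataclass
class ArraySpec:
    """Hypothetical stand-in for ArrayV3Metadata."""
    shape: tuple
    attributes: dict = field(default_factory=dict)


class FlatGroup:
    """Hypothetical group wrapper built from a flat {path: spec} dict."""

    def __init__(self, nodes: dict):
        self.nodes = dict(nodes)

    @classmethod
    def validate(cls, flat: dict) -> None:
        """Hook for subclasses; the base class accepts any hierarchy."""

    @classmethod
    def from_flat(cls, flat: dict) -> "FlatGroup":
        """Run the class's own validation, then build an instance of that class."""
        cls.validate(flat)
        return cls(flat)


class ArraysOnlyGroup(FlatGroup):
    """A 'typed group': every member other than the root must be an array."""

    @classmethod
    def validate(cls, flat: dict) -> None:
        bad = [p for p, s in flat.items() if p != "" and not isinstance(s, ArraySpec)]
        if bad:
            raise TypeError(f"non-array members are not allowed: {bad}")
```

With a classmethod, `ArraysOnlyGroup.from_flat(...)` returns an `ArraysOnlyGroup` and rejects hierarchies containing subgroups, whereas a module-level function would need the target class passed in separately.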
