-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to zarr 3 and main kerchunk #406
Conversation
I don't know that this PR can be disentangled from the changes we need to get icechunk working with zarr v3... |
I can take the new in-memory icechunk stuff and leave the place where it could slot in. Or this can just be used as a reference point. I'm not attached 😄 |
So IIUC we can either:
|
I'd slightly prefer if we merged this now because it seems to me like this PR really breaks the stalemate that we were in and would unblock other work. specifically for my work, I'd love to merge it into #407 to get the updates already implemented |
I'm happy to merge this now if @jsignell is. We want to block everyone else as little as possible. |
Ab/fix tests
It looks like @abarciauskas-bgse might have fixed all the tests? 👏🏻 |
@@ -42,7 +42,7 @@ def test_zarr_v3_metadata_conformance(tmpdir, vds_with_manifest_arrays: Dataset) | |||
assert isinstance(metadata["fill_value"], (bool, int, float, str, list)) | |||
assert ( | |||
isinstance(metadata["codecs"], list) | |||
and len(metadata["codecs"]) > 1 | |||
and len(metadata["codecs"]) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abarciauskas-bgse did the expectation change so it is correct that there is one codec now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is because of how serializer
was broken out of _v3_codec_pipeline
so these are 2 separate objects now. Today I will investigate if this is how it should be, but my guess is that this will become a part of the refactor to use the zarr-python array class instead of Virtualizarr's zarr.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I took a look at #175 and I agree that this is all going away. So it is probably not worth doing too much work around it.
@@ -198,6 +191,23 @@ def _v3_codec_pipeline(self) -> Any: | |||
|
|||
return codec_pipeline | |||
|
|||
def serializer(self) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function could stand to get reviewed and tested. It looks like it was meant to be just a sketch of what a serializer could look like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this passes the tests we can perhaps merge and deal with this in the follow-up PR that refactors ZArray
away entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah exactly.
I wasn't sure whether or not to add release notes, but feel free to push to this branch or just merge as you see fit. |
We'll just add release notes in the follow-up that actually changes the kerchunk dependency to a released version |
Pulls commits from #378, #405 and #393
UPDATE: Ideas for #376 removed in 42414f0
docs/releases.rst
api.rst
The remaining failures on upstream can be broken into: