Right now, xarray's `open_zarr` and `open_dataset` are significantly slower when coordinates are chunked, because every coordinate chunk results in a separate request to S3.

Is it possible to either:

1. create a datastore where coordinates are not chunked, or
2. open a dataset which has chunked coordinates without fetching all of the chunks?

Sketches of possible workarounds for both options are below.
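For option 1, one write-side approach is to pass per-variable `chunks` through `encoding` in `to_zarr`, so that each coordinate lands in a single chunk. A minimal sketch, using a synthetic dataset and a hypothetical local store path (the variable names and sizes are made up):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for a real dataset.
ds = xr.Dataset(
    {"tasmin": (("time", "y", "x"), np.zeros((1020, 128, 128), dtype="f4"))},
    coords={"time": pd.date_range("2000-01-01", periods=1020)},
)

# Chunk the data variable along time, but force the coordinate into a single
# chunk via encoding, so reading it back needs only one request.
ds = ds.chunk({"time": 25})
ds.to_zarr(
    "example.zarr",  # hypothetical path; an s3:// URL works the same way
    encoding={"time": {"chunks": (1020,)}},
    mode="w",
)
```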
Note: I tried `decode_coords=False` and the same issue results.
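For option 2, a partial read-side workaround (not a fix) is to skip the chunked coordinates entirely with `drop_variables`, since variables that are never opened are never fetched. A sketch, assuming the hypothetical store written above:

```python
import xarray as xr

# Open the store without the chunked coordinate; no requests are made for it.
ds = xr.open_zarr("example.zarr", drop_variables=["time"])

# If the coordinate values are needed later, they can be read on demand,
# which still pays the many-small-requests cost, but only for the
# coordinates actually used.
```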
Here is a case in which the data are chunked along a dimension but the coordinates are not chunked. This is what we did for the CMIP6-downscaling pyramids, so that the coordinates can be fetched with one request while only specific chunks of the data are fetched, e.g.,
```python
import zarr

# Open the pyramid store read-only; s3:// URLs are handled via fsspec.
store = zarr.open(
    "s3://carbonplan-cmip6/flow-outputs/results/0.1.9/pyramid/01df7816c64b3999/0/",
    mode="r",
)
print(f'tasmin chunks: {store["tasmin"].chunks}')
print(f'time chunks: {store["time"].chunks}')
```

which prints:

```
tasmin chunks: (25, 128, 128)
time chunks: (1020,)
```
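For comparison, the same on-disk layout is visible from the xarray side: the zarr backend records each variable's stored chunk shape in its `encoding`. A sketch, assuming the store above is readable:

```python
import xarray as xr

ds = xr.open_zarr(
    "s3://carbonplan-cmip6/flow-outputs/results/0.1.9/pyramid/01df7816c64b3999/0/"
)
# The chunk layout written to the store is recorded per variable.
print(ds["tasmin"].encoding["chunks"])  # expected: (25, 128, 128)
print(ds["time"].encoding["chunks"])    # expected: (1020,)
```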
Related:
pydata/xarray#6633
pydata/xarray#7368
https://discourse.pangeo.io/t/puzzling-s3-xarray-open-zarr-latency/1074/11
From @maxrjones