Investigate if it is possible to avoid reading all coordinate chunks when opening a dataset with xarray #34

Closed
Description

@abarciauskas-bgse

Right now, xarray's open_zarr and open_dataset are significantly slower when coordinates are chunked, because every coordinate chunk triggers a separate request to S3.

Is it possible to either

  1. create a datastore where coordinates are not chunked, or
  2. open a dataset that has chunked coordinates without fetching all the chunks?

Note: I tried decode_coords=False and the same issue occurs.

Related:
pydata/xarray#6633
pydata/xarray#7368
https://discourse.pangeo.io/t/puzzling-s3-xarray-open-zarr-latency/1074/11

From @maxrjones:

There is a case in which the data are chunked along a dimension but the coordinates are not chunked. This is what we did for the CMIP6-downscaling pyramids, so the coordinates are fetched with one request while only specific chunks of the data are fetched, e.g.,

```python
import zarr

store = zarr.open("s3://carbonplan-cmip6/flow-outputs/results/0.1.9/pyramid/01df7816c64b3999/0/", mode="r")
print(f'tasmin chunks: {store["tasmin"].chunks}')
print(f'time chunks: {store["time"].chunks}')
```

which prints:

```
tasmin chunks: (25, 128, 128)
time chunks: (1020,)
```
