
filter on the time dimension with a large dataset #13

@miniufo

I am interested in the lntime accessor, which can filter along the time dimension. A long-standing problem (at least for me) is how to filter along the time dimension of a large dataset when that dimension is chunked.

For example, I have several years of the daily AVISO sea-surface height dataset, chunked along time (each day is a single file holding a 2D lat-lon field):

```
<xarray.Dataset> Size: 364GB
Dimensions:         (time: 1096, lat: 1440, nv: 2, lon: 2880)
Coordinates:
  * time            (time) datetime64[ns] 9kB 2015-01-01 ... 2017-12-31
  * lat             (lat) float32 6kB -89.94 -89.81 -89.69 ... 89.69 89.81 89.94
  * lon             (lon) float32 12kB -179.9 -179.8 -179.7 ... 179.8 179.9
  * nv              (nv) int32 8B 0 1
Data variables: (12/14)
    crs             (time) int32 4kB -2147483647 -2147483647 ... -2147483647
    lat_bnds        (time, lat, nv) float32 13MB dask.array<chunksize=(1, 1440, 2), meta=np.ndarray>
    lon_bnds        (time, lon, nv) float32 25MB dask.array<chunksize=(1, 2880, 2), meta=np.ndarray>
    sla             (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    err_sla         (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    ugosa           (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    ...              ...
    err_vgosa       (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    adt             (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    ugos            (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    vgos            (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    flag_ice        (time, lat, lon) float64 36GB dask.array<chunksize=(1, 1440, 2880), meta=np.ndarray>
    tpa_correction  (time) float64 9kB dask.array<chunksize=(1,), meta=np.ndarray>
Attributes: (12/42)
    Conventions:                     CF-1.6
    Metadata_Conventions:            Unidata Dataset Discovery v1.0
    cdm_data_type:                   Grid
    comment:                         Sea Surface Height measured by Altimetry...
    contact:                         servicedesk.cmems@mercator-ocean.eu
    creator_email:                   servicedesk.cmems@mercator-ocean.eu
    ...                              ...
    geospatial_vertical_units:       m
    time_coverage_duration:          P1D
    time_coverage_resolution:        P1D
    time_coverage_end:               2015-01-01T12:00:00Z
    time_coverage_start:             2014-12-31T12:00:00Z
    platform:                        Cryosat-2, OSTM/Jason-2, Haiyang-2A, Altika
```

then how can I filter along time (for example, to extract the signal within a certain frequency band) on a machine that cannot load the whole dataset into memory?
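
For context, the best generic recipe I know of is sketched below: rechunk so that time is contiguous and the spatial tiles are small, then apply a SciPy band-pass filter lazily through xr.apply_ufunc. This is only a sketch, not lenapy's API; the file pattern, chunk sizes, and band edges (30-180 days) are made-up assumptions.

```python
# A sketch of the generic recipe, not lenapy's API. The file pattern,
# chunk sizes, and band edges (30-180 days) are made-up assumptions.
import scipy.signal as signal
import xarray as xr

ds = xr.open_mfdataset("aviso_*.nc", parallel=True)

# One chunk along time, small tiles in space: each dask task then sees
# the full 1096-day series for one lat/lon tile (~250 MB of float64)
# instead of a single day, so the filter can run tile by tile.
sla = ds["sla"].chunk({"time": -1, "lat": 120, "lon": 240})

fs = 1.0                     # daily data: 1 sample per day
low, high = 1 / 180, 1 / 30  # keep periods between 30 and 180 days
sos = signal.butter(4, [low, high], btype="bandpass", fs=fs, output="sos")

def bandpass(arr, sos):
    # apply_ufunc moves the core dim ("time") to the last axis
    return signal.sosfiltfilt(sos, arr, axis=-1)

filtered = xr.apply_ufunc(
    bandpass,
    sla,
    kwargs={"sos": sos},
    input_core_dims=[["time"]],
    output_core_dims=[["time"]],
    dask="parallelized",
    output_dtypes=[sla.dtype],
)

# Writing to zarr streams one tile at a time; peak memory stays near
# the chunk size rather than the 36 GB of the full variable.
filtered.to_dataset(name="sla_bp").to_zarr("sla_30_180d.zarr", mode="w")
```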

I just want to know whether lenapy has a better way to do this than the workarounds sketched here.
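
The expensive step in the sketch above is the rechunk itself, which is a full shuffle of every file. The only mitigation I know of is to rewrite the variable on disk first with the rechunker package, again an assumption on my side rather than anything lenapy provides; the store paths and chunk sizes are hypothetical.

```python
# Rewrite `sla` so that `time` is contiguous on disk, with memory per
# worker bounded by max_mem; paths and chunk sizes are hypothetical.
import xarray as xr
from rechunker import rechunk

ds = xr.open_mfdataset("aviso_*.nc", parallel=True)

plan = rechunk(
    ds[["sla"]],                 # one variable at a time keeps the copy small
    target_chunks={"sla": {"time": 1096, "lat": 120, "lon": 240}},
    max_mem="2GB",
    target_store="sla_time_contiguous.zarr",
    temp_store="rechunk_tmp.zarr",
)
plan.execute()

# The rewritten store is now laid out for time filtering and can feed
# the apply_ufunc recipe above without the expensive in-memory shuffle.
sla = xr.open_zarr("sla_time_contiguous.zarr")["sla"]
```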
