
ODC EP 011 Enhance hyperspectral support

Rob Woodcock edited this page Jun 6, 2023 · 12 revisions

Introduction

Use-cases

Access pattern - spectra, time-series, spatial

Spatio-temporal size:

A single temporal observation is already far larger than the multi-spectral products ODC usually handles, though still manageable: 5 m spatial resolution, 300 spectral bands, plus ancillary information.

Temporal rate: revisit every 24 hours
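To put rough numbers on the size question, the following back-of-envelope sketch computes the uncompressed size of one observation. The 30 km x 30 km footprint, the 16-bit sample type, and the 6-band 30 m multi-spectral comparison product are illustrative assumptions, not figures from this EP; only the 5 m resolution and 300 bands come from the text above.

```python
def observation_size_bytes(width_m, height_m, pixel_m, n_bands, bytes_per_sample):
    """Uncompressed size in bytes of one observation over a rectangular footprint."""
    cols = int(width_m // pixel_m)
    rows = int(height_m // pixel_m)
    return rows * cols * n_bands * bytes_per_sample


# Assumed footprint: 30 km x 30 km, 16-bit samples (both hypothetical).
hyper = observation_size_bytes(30_000, 30_000, 5, 300, 2)

# Hypothetical multi-spectral product over the same footprint: 6 bands at 30 m.
multi = observation_size_bytes(30_000, 30_000, 30, 6, 2)

print(f"hyperspectral:  {hyper / 1e9:.1f} GB per observation")
print(f"multi-spectral: {multi / 1e6:.1f} MB per observation")
print(f"ratio: {hyper // multi}x")
```

Under these assumptions a daily revisit adds one such observation per footprint per day, before compression and ancillary layers.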

What expectations should we have for datacube.load() performance at scale (with dask) for different use-cases?

Resampling and projection:

  • spectral (including convolving bands to produce "red like landsat")
  • spatial projection - usual requirements though note impact on scale
  • temporal?
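As one possible reading of the spectral resampling item, the sketch below convolves a hyperspectral cube with a Gaussian spectral response function to synthesise a single broad band. The Gaussian shape, the centre (655 nm) and the width (37 nm FWHM) are assumptions chosen to sit roughly where a Landsat-style red band lies; a real implementation would use the sensor's published response curve. The function names are hypothetical and not part of any ODC API.

```python
import numpy as np


def gaussian_srf(wavelengths_nm, centre_nm, fwhm_nm):
    """Normalised Gaussian weights over the sensor's band centres."""
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    weights = np.exp(-0.5 * ((wavelengths_nm - centre_nm) / sigma) ** 2)
    return weights / weights.sum()


def convolve_to_band(cube, wavelengths_nm, centre_nm, fwhm_nm):
    """Collapse a (wavelength, y, x) cube to one (y, x) band by weighted sum."""
    weights = gaussian_srf(wavelengths_nm, centre_nm, fwhm_nm)
    return np.tensordot(weights, cube, axes=(0, 0))


# Synthetic cube: 300 evenly spaced bands from 400 nm to 2500 nm, 4 x 5 pixels.
wavelengths = np.linspace(400.0, 2500.0, 300)
cube = np.random.default_rng(0).random((300, 4, 5))

# "Red like Landsat" (approximate, assumed response parameters).
red_like = convolve_to_band(cube, wavelengths, centre_nm=655.0, fwhm_nm=37.0)
```

Normalising the weights by their sum is only appropriate when the band centres are roughly evenly spaced; unevenly spaced sensors would need the weights scaled by band width.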

Satellite data to use in this EP:

  • EMIT
  • EnMAP
  • PRISMA
  • DESIS

Hyperspectral analytics libraries

  • spectral (Spectral Python, SPy)
  • Three CSIRO ones for Water, Minerals and Agriculture

Storage Formats

  • COGs
  • NetCDF4 & HDF5
  • Zarr

Metadata

  • eo3
  • STAC HSI

Proposed HS extensions

Can we reduce repetition in the metadata representation?

Xarray representation

  • wavelength as Dimension
  • wavelength as Data Variables
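The two options can be compared with a small synthetic example. This is a sketch using plain xarray, not datacube.load() output; the variable names (`reflectance`, `band_000`, ...) and the `wavelength_nm` attribute are hypothetical.

```python
import numpy as np
import xarray as xr

wavelengths = np.array([450.0, 550.0, 650.0, 850.0])  # nm, synthetic
data = np.random.default_rng(0).random((4, 3, 2)).astype("float32")

# Option 1: wavelength as a dimension of a single DataArray.
as_dim = xr.DataArray(
    data,
    dims=("wavelength", "y", "x"),
    coords={"wavelength": wavelengths},
    name="reflectance",
)
spectrum = as_dim.isel(y=0, x=0)                       # full spectrum of one pixel
visible = as_dim.sel(wavelength=slice(400.0, 700.0))   # range query on wavelength

# Option 2: one data variable per band, wavelength kept as an attribute.
as_vars = xr.Dataset(
    {
        f"band_{i:03d}": (("y", "x"), data[i], {"wavelength_nm": float(w)})
        for i, w in enumerate(wavelengths)
    }
)

# Spectral operations on option 2 first need the variables stacked again.
stacked = as_vars.to_array(dim="band")
```

With wavelength as a dimension, spectral slicing and per-pixel spectra are single indexing operations; with data variables, each band behaves like an existing ODC measurement but spectral queries need a stacking step or a lookup on the attributes.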

Dask Task Graph

  • chunking considerations
  • task graph considerations
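A rough way to see how chunking drives task count is to count the chunks a single observation splits into. The sketch below is plain arithmetic, not a dask API call; the array shape and the three chunking schemes are illustrative assumptions, and real task graphs contain at least this many load tasks plus whatever downstream operations add.

```python
import math


def chunk_count(shape, chunks):
    """Number of chunks an array of `shape` splits into for a given chunk size."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))


# Assumed single observation laid out as (wavelength, y, x).
shape = (300, 6000, 6000)

per_band = chunk_count(shape, (1, 2048, 2048))    # one band per chunk
grouped = chunk_count(shape, (50, 2048, 2048))    # 50 bands per chunk
all_bands = chunk_count(shape, (300, 512, 512))   # full spectrum, small tiles

print(per_band, grouped, all_bands)
```

Chunking one band at a time favours spatial access but multiplies the chunk count by the number of bands, while keeping the full spectrum in each chunk favours per-pixel spectral access at the cost of smaller spatial tiles.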

Can the user parameterise how much work (that could otherwise be parallel) should be done within a single Task, to help manage task count?

Can Task construction be optimized automatically?

Proposed by

  • Robert Woodcock

Dependencies

  • odc-geo
  • eo3 refactor
  • datacube-zarr
  • datacube-core

References
