Expose band resolution metadata at dataset level #1683

robbibt · 2024-12-09T00:38:41Z

Issue

It would be extremely useful to be able to easily obtain information about the resolution of each satellite band/measurement in a datacube dataset, particularly for products like Sentinel-2 which can contain bands with many resolutions (e.g. 10m, 20m, 60m).

However, this information is currently difficult to obtain. To find the resolution of a measurement, a user has to look up the grid name listed for that measurement in dss.measurements, cross-reference it against the dataset's "grids" (dss.metadata_doc["grids"]), and handle the case where a measurement lists no grid and so uses the default grid. This is excessively complex.

Suggested feature

Add an automatically calculated resolution or gsd key to the dictionary returned by dss.measurements. For example, instead of:

>>> dss.measurements

{'oa_fmask': {'grid': 'g20m', 'path': ...},
'nbart_red': {'path': ...}, 
'oa_s2cloudless_prob': {'grid': 'g60m', 'path': ...}}

Do this:

{'oa_fmask': {'resolution': 20, 'grid': 'g20m', 'path': ...},
'nbart_red': {'resolution': 10, 'path': ...},
'oa_s2cloudless_prob': {'resolution': 60, 'grid': 'g60m', 'path': ...}}
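
To make the proposal concrete, here is a rough sketch of how such a key could be derived from plain dictionaries shaped like dss.measurements and dss.metadata_doc["grids"]. The helper name `with_resolution` is hypothetical and not part of datacube, and the grid transforms and file names are made-up placeholder values. It assumes EO3-style grids, where the first coefficient of each grid's transform is the pixel width in CRS units.

```python
def with_resolution(measurements, grids):
    """Return a copy of `measurements` with a 'resolution' key per band.

    Hypothetical helper, not a datacube API. Assumes each grid has an
    affine 'transform' whose first coefficient is the pixel width.
    """
    result = {}
    for band, meas in measurements.items():
        # Measurements without an explicit grid use the "default" grid
        grid_name = meas.get("grid", "default")
        pixel_width = abs(grids[grid_name]["transform"][0])
        result[band] = {"resolution": pixel_width, **meas}
    return result


# Illustrative stand-ins for dss.metadata_doc["grids"] and dss.measurements
grids = {
    "default": {"transform": [10.0, 0.0, 600000.0, 0.0, -10.0, 6100000.0, 0.0, 0.0, 1.0]},
    "g20m": {"transform": [20.0, 0.0, 600000.0, 0.0, -20.0, 6100000.0, 0.0, 0.0, 1.0]},
    "g60m": {"transform": [60.0, 0.0, 600000.0, 0.0, -60.0, 6100000.0, 0.0, 0.0, 1.0]},
}
measurements = {
    "oa_fmask": {"grid": "g20m", "path": "placeholder_fmask.tif"},
    "nbart_red": {"path": "placeholder_red.tif"},
    "oa_s2cloudless_prob": {"grid": "g60m", "path": "placeholder_prob.tif"},
}

enriched = with_resolution(measurements, grids)
```

The sketch returns a new dictionary rather than modifying its input, so the existing measurements structure stays unchanged for callers that do not need the extra key.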


robbibt · 2024-12-09T00:40:57Z

Some example code that might be helpful:

import datacube

# Connect to the datacube and load a single dataset
dc = datacube.Datacube()
dss = dc.find_datasets(product="ga_s2am_ard_3", limit=1)[0]

# Extract grids used across dataset, and resolution from grid transform
# (the first transform coefficient is the pixel width in CRS units)
grid_dict = {k: int(v["transform"][0]) for k, v in dss.metadata_doc["grids"].items()}

# For each band in the dataset, cross-reference to its grid,
# using the "default" grid if the measurement does not list one
band_resolutions = []
for band_name, measurement in dss.measurements.items():
    grid_name = measurement.get("grid", "default")
    band_resolutions.append(grid_dict[grid_name])
band_resolutions

Kirill888 · 2024-12-09T04:27:29Z

Just wanted to add that:

- only EO3 datasets have that information
- this should really be a Product-level concern, but the data model doesn't require all datasets to have the same resolution for the same band, so the best you can get are load hints, and these are not defined per band
- the Python side of the metadata classes is lacking in usability:
  - no html repr for work in the notebook
  - awkward constructor API for Dataset class (product as a first parameter, even though it CAN be auto-detected from the dataset metadata, at least in EO3 format)
  - lack of sanity checks beyond basic json schema
  - lack of reasonable interrogation methods (give me URL for band X for example)

And the thing is - we already HAVE html representation of dataset in the explorer. But to be honest with STAC this becomes less and less useful and less and less likely to be implemented.
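
The first two points above can be illustrated with a small sketch that works on plain dictionaries rather than datacube objects. Both function names are hypothetical, and the sketch assumes that a metadata document without a "grids" section carries no per-band resolution information. It reports None in that case instead of failing, and collects the set of resolutions seen for each band across several datasets, since nothing forces them to agree.

```python
from collections import defaultdict


def band_resolutions(metadata_doc, measurements):
    """Return {band: resolution}, or None where no grid info is available.

    Hypothetical helper, not a datacube API.
    """
    grids = metadata_doc.get("grids")
    if not grids:
        # No "grids" section (e.g. non-EO3 metadata): nothing to report
        return {band: None for band in measurements}
    out = {}
    for band, meas in measurements.items():
        grid = grids.get(meas.get("grid", "default"))
        out[band] = abs(grid["transform"][0]) if grid else None
    return out


def resolutions_across_datasets(docs):
    """Collect every resolution seen per band.

    `docs` is an iterable of (metadata_doc, measurements) pairs.
    """
    seen = defaultdict(set)
    for metadata_doc, measurements in docs:
        for band, res in band_resolutions(metadata_doc, measurements).items():
            seen[band].add(res)
    return dict(seen)
```

A band whose set contains more than one value, or contains None, is one where a single Product-level resolution could not be stated reliably.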

robbibt added the enhancement label Dec 9, 2024
