
ODC EP 012 Standardising EO3 metadata format

Paul Haesler edited this page Jul 4, 2023 · 14 revisions

ODC-EP 12 - Standardise the ODC metadata format (eo3)

Overview

The ODC originally supported an extremely open-ended and flexible family of metadata formats.

The "EO3" family of metadata formats was introduced around v1.8.0 to allow improved performance in indexing and loading, although many non-eo3 formats were still supported. Note that EO3 is still extensible in some ways and is more of a family of metadata formats than a single format.

However, the minimum requirements for a metadata format to be "EO3 compatible" have never been formally defined. In practice they are defined only by Python code distributed across multiple repositories, most notably datacube-core and eodatasets.

This EP proposes the adoption of a formal standard for EO3-compatible metadata, together with extensible tools for validating metadata against that standard.
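As a rough illustration of what an "extensible" validation tool could look like, the sketch below runs a list of pluggable checks over a parsed dataset document, using only the Python standard library. Everything in it is hypothetical: the function names, the `Check` type and the list of required keys are assumptions made for the example and are not part of datacube-core, eodatasets, or any agreed EO3 standard.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical check signature: takes a parsed dataset document and returns
# a list of human-readable problems (an empty list means the check passed).
Check = Callable[[Dict[str, Any]], List[str]]

# ASSUMPTION: illustrative top-level keys only. The formal minimum
# requirements are exactly what this EP proposes to define.
ASSUMED_REQUIRED_KEYS = ("id", "product", "crs", "grids", "measurements", "properties")


def check_required_keys(doc: Dict[str, Any]) -> List[str]:
    """Report every assumed-required top-level key that is absent."""
    return [
        f"missing required key: {key!r}"
        for key in ASSUMED_REQUIRED_KEYS
        if key not in doc
    ]


def check_default_grid(doc: Dict[str, Any]) -> List[str]:
    """Report a 'grids' mapping that lacks a 'default' entry (assumed rule)."""
    grids = doc.get("grids")
    if isinstance(grids, dict) and "default" not in grids:
        return ["grids has no 'default' entry"]
    return []


DEFAULT_CHECKS: Sequence[Check] = (check_required_keys, check_default_grid)


def validate(doc: Dict[str, Any], checks: Sequence[Check] = DEFAULT_CHECKS) -> List[str]:
    """Run each check in order and collect all reported problems."""
    problems: List[str] = []
    for check in checks:
        problems.extend(check(doc))
    return problems


# A made-up document that satisfies the assumed rules above.
example_doc = {
    "id": "00000000-0000-0000-0000-000000000000",
    "product": {"name": "example_product"},
    "crs": "epsg:32655",
    "grids": {"default": {"shape": [10, 10], "transform": [30, 0, 0, 0, -30, 0, 0, 0, 1]}},
    "measurements": {},
    "properties": {},
}

print(validate(example_doc))
print(validate({"id": "x", "grids": {}}))
```

Because the checks are plain functions passed in as a sequence, a downstream project could add its own rules by calling `validate(doc, checks=[*DEFAULT_CHECKS, my_check])` without modifying the core list.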

Proposed By

Paul Haesler (@SpacemanPaul)

State

  • In draft
  • Under Discussion
  • In Progress
  • Completed
  • Rejected
  • Deferred

Motivation

Support for non-EO3 datasets in datacube-core adds unnecessary complexity and makes it hard to introduce new features or modify existing ones, impeding innovation. Determining what constitutes an "eo3-compatible" dataset currently requires manually inspecting schemas and multiple functions across multiple repositories. There are no tools to validate whether a metadata type or product document is capable of working with an eo3-compatible dataset.

The most complete metadata validation toolset currently available is eodatasets, which depends on datacube-core and so cannot be used by datacube-core for validating files.

Some validation code is duplicated across repositories, or sometimes within a single repository, and the duplicated versions of a function sometimes behave inconsistently with each other.

There are undocumented differences between the external metadata documents indexed by the ODC and the metadata documents stored internally within the ODC index.

Together, these problems make future changes or improvements to the ODC index layer, and new features requiring new metadata, much harder than they need to be.

Proposal

Feedback

Voting

Enhancement Proposal Team

  • Paul Haesler (@SpacemanPaul)

Links
