RFC: A common construct to replace raster:bands and eo:bands
#1213
-
It would be available "everywhere". Common metadata fields are applicable in Catalogs, Collections, Item Properties, Links, and all kinds of Assets. It could even be re-used in other extensions. The "summary" behavior of
-
Thank you @m-mohr for this very detailed proposal.
-
A very good and detailed proposal! One small note that might be related: in the
-
I really liked the very detailed proposal, @m-mohr, thanks! I do have one question, though, since this point was not entirely clear to me.
"assets": {
"thumbnail": {
...
},
...
"B01": {
"title": "Band 1 (coastal) BOA reflectance",
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data"
],
"gsd": 60,
"eo:bands": [
{
"name": "B01",
"common_name": "coastal",
"center_wavelength": 0.4439,
"full_width_half_max": 0.027
}
],
"href": "https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/33/S/VB/2021/2/S2B_33SVB_20210221_0_L2A/B01.tif",
"proj:shape": [
1830,
1830
],
"proj:transform": [
60,
0,
399960,
0,
-60,
4200000,
0,
0,
1
],
"raster:bands": [
{
"data_type": "uint16",
"spatial_resolution": 60,
"nodata": 0,
"statistics": {
"minimum": 1,
"maximum": 20567,
"mean": 2339.4759595597,
"stddev": 3026.6973619954,
"valid_percent": 99.83
},
"unit": "W/m²/sr/μm"
}
]
}
}
How would this look under the proposal? Like the following?
"assets": {
"thumbnail": {
...
},
...
"B01": {
"title": "Band 1 (coastal) BOA reflectance",
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data"
],
"gsd": 60,
"subassets": [
{
"name": "B01",
"eo:common_name": "coastal",
"eo:center_wavelength": 0.4439,
"eo:full_width_half_max": 0.027,
"data_type": "uint16",
"spatial_resolution": 60,
"nodata": 0,
"statistics": {
"minimum": 1,
"maximum": 20567,
"mean": 2339.4759595597,
"stddev": 3026.6973619954,
"valid_percent": 99.83
},
"unit": "W/m²/sr/μm"
}
],
"href": "https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/33/S/VB/2021/2/S2B_33SVB_20210221_0_L2A/B01.tif",
"proj:shape": [
1830,
1830
],
"proj:transform": [
60,
0,
399960,
0,
-60,
4200000,
0,
0,
1
]
}
}
If I understood correctly, why would we even need the "subassets" field? Couldn't we put all the properties directly at the "B01" level, next to "title" etc.?
-
Again on the naming topic: if this new "object" can be placed either inside an asset or inside the "properties" object, then the name "subassets" would not fit well. In that case "layers" might be better.
-
I will advocate again for a very generic and abstract term. Indeed, as @chiarch84 noted, this new object can be placed at different levels and defines a "subset" of the object where it is placed. Each of the following proposed terms is actually tied to the data model of the asset:
That is why I find
This term is often used in many IT implementations:
Since group does not carry any information about the data structure, I also proposed a field to define its type (e.g. "band", "layer", "database"), which would also allow hybrid grouping (multi-type assets).
-
In the proposal above I recycled the behavior of eo:bands, which means that eo:bands summarizes (i.e. merges into a set) the bands in Item properties. I think that is not a good idea, as it is then unclear how the array index relates to the band position. We would basically have two different behaviors in one file for the same field. Thus, I'd propose to define it more like, for example, proj:epsg is used:
-
Thanks a lot, @m-mohr, for an excellent RFC! I like the idea of generalising the definitions for the subset of data. Mainly the fact that the object itself is extensible. So, great work!
-
I'm -1 on using a generic object with types, at least for now. We have a very clear use case with raster bands, with lots of data that would benefit. I think the case for the others is much less clear. And I'd much prefer we 'generalize' based on real usage, where we have lots of examples where things are similar and then we decide to combine them. So I'm all for similar 'subasset' type definitions / extensions, and if they all get widely used then we could combine them down the road into a more generic structure. I fear that if we make things too generic at this level then interoperability will suffer. And we have a great extension mechanism, so let's use that, with nice, precise definitions for each. And then we can leverage extension maturity to help guide people on them. The specific ideas of ' I suppose you can use the subasset/group construct at the collection level. But I'd much prefer to have a well-defined 'vector' collection extension, that defines its fields, instead of a super generic 'group'. So please, let's start with this modest improvement of 'bands', so that raster:bands and eo:bands work better together. And then we can continue to evolve to be more generic & meet the proposed use cases.
-
RFC: A common construct to replace raster:bands and eo:bands
Table of contents:
2. Affected specification / extensions
3. Changes to the STAC specification
4. Changes to the raster extension
5. Changes to the EO extension
6. Changes to the extension template
Context
The EO extension in the very early days of STAC added the field eo:bands, which was meant to describe the spectral characteristics of a band. As it came first, it also defines some non-EO-related fields like name and description. On the way to STAC 1.0, @emmanuelmathot proposed the raster extension. It had a construct for raster bands (or layers) in general: raster:bands. The raster extension doesn't define common fields such as name and description. Other fields like nodata, data_type, statistics and unit also don't feel specific to rasters, as they could also be used in the table or Datacube extensions.
Problem
Both extensions have been around for quite a while now, and we see broad adoption of both. Implementations use both extensions side by side or move fields from one extension to the other. Below you can find some asset examples that show issues identified in real-world deployments and implementations.
Example 1
This is fully valid and implements the fields as defined:
The issues encountered here are the following:
[…] raster:bands corresponds to which element in eo:bands. Their relation is implicitly expressed by the array index, but that's not shown in JSON, so you'd need to count the elements to get the index.
[…] raster:bands), it should - strictly speaking - not be in eo:bands as it is not a spectral band. This leads to a mismatch and makes it less trivial to merge the bands. Also, the quality band (e.g. cloud cover probability) has no name or description, which you could solve in various ways, but they lead to other issues:
2. You could add it to all raster band elements, but then you are duplicating data, using fields that are not defined, and tooling may run into conflicts.
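To make the index-based relation between eo:bands and raster:bands concrete, here is a minimal Python sketch. It assumes an asset given as a plain dictionary that carries both arrays; the helper name pair_bands_by_index and the sample values are hypothetical and not part of any STAC library.

```python
from itertools import zip_longest


def pair_bands_by_index(asset):
    """Pair eo:bands and raster:bands entries by array position.

    Hypothetical helper: the array index is the only link between the two
    arrays, so an entry missing on either side shows up as None.
    """
    eo_bands = asset.get("eo:bands", [])
    raster_bands = asset.get("raster:bands", [])
    return [
        {"index": i, "eo": eo, "raster": raster}
        for i, (eo, raster) in enumerate(zip_longest(eo_bands, raster_bands))
    ]


# Illustrative asset (made-up values): two spectral bands plus a quality
# band that exists only in raster:bands.
asset = {
    "eo:bands": [{"name": "B01"}, {"name": "B02"}],
    "raster:bands": [
        {"data_type": "uint16"},
        {"data_type": "uint16"},
        {"data_type": "uint8"},
    ],
}

pairs = pair_bands_by_index(asset)
for pair in pairs:
    print(pair["index"], pair["eo"], pair["raster"])
```

The pairing only works here because the quality band happens to come last. If it sat between the spectral bands, positional pairing would silently attach the wrong raster metadata to B02, which is the kind of mismatch described above.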
Example 2
What I've seen in openEO Platform is that people then merge the bands and use them more freely.
This leads to a couple of issues:
[…] stac_extensions so they won't be validated. As such, the invalid data_type values in this example would not be reported by a validator (they must be lower-cased).
[…] spatial_resolution and data_type. But right now these fields are not defined on the asset level (and gsd doesn't always apply or is not correct).
Example 3
Here's another example of how the eo:bands construct was used (1) to describe non-spectral (i.e. SAR) "bands" and (2) to add additional fields that would've been defined in other extensions, but instead new provider-specific fields (gee:units, gee:polarization) were defined:
Proposed solution
The general idea is to add a new field to the common metadata: an array of objects that can host fields from various extensions.
The objects would be yet another place where STAC can add all kinds of fields as in Item Properties or Assets, e.g. from common metadata itself.
[…] title, description, gsd and datetime.
[…] name field for the identifier again.
[…] nodata, data_type, statistics and unit we could also add to common metadata.
[…] center_wavelength will be eo:center_wavelength) and get defined as "top-level" fields in the respective extension so that they could also be used in other places such as assets or the Item properties.
Naming
The name of this new field is still to be defined and we are looking for feedback here. It could simply be bands, and the object could again be called "Band Object" as it already is in the EO and raster extensions.
On the other hand, a construct such as bands that basically defines "sub datasets" could also be made more general and as such have a more general name such as layers, groups, subdatasets, subassets etc. We'd hope that other domains could also use this construct, such as point clouds, data cubes or tabular data.
Affected specification / extensions
The changes proposed will require collaboration between multiple different specifications and also new versions.
The affected specifications are at least:
In general, other extensions could also be broadened to support adding their fields to the new construct in a backward-compatible way (requires a new minor version), for example:
Some extensions could reuse the newly defined fields in common metadata:
[…] data_type (breaking change, requires v2.0)
[…] statistics and unit (already aligned) in the Dimension and Variable Objects (requires v2.2.0)
[…] statistics (breaking change, requires v2.0)
Changes to the STAC specification
Please keep in mind that in the following I've used the terminology "band" for the new construct, but we'd like to get feedback regarding the terminology. Depending on the feedback we may change "band" to something else.
The following new sections could be added to the common metadata:
Changes to the raster extension
The extension will be restructured:
raster:bands gets deprecated and the fields in the Band Object that have not been migrated to the common metadata will be defined "top-level" with a raster: prefix.
Chapters for fields that have been migrated to common metadata will be moved ("Data Types" and "Statistics Object").
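As a rough illustration of this restructuring, the following Python sketch rewrites a legacy raster:bands array into the new construct. It rests on several assumptions: the new field is called bands (the working term used in this RFC, which may still change), the set of fields moved to common metadata is the one suggested in this RFC and may differ in the final specification, and the function name migrate_raster_bands and the sample asset are hypothetical.

```python
# Fields this RFC suggests moving to common metadata (assumption: the final
# list may differ). They keep their names without a prefix.
COMMON_FIELDS = {"name", "description", "nodata", "data_type", "statistics", "unit"}


def migrate_raster_bands(asset, target_field="bands"):
    """Return a copy of `asset` with raster:bands rewritten to `target_field`.

    Common fields are copied unchanged; every other Band Object field gets
    the `raster:` prefix. The input asset is not modified.
    """
    migrated = {key: value for key, value in asset.items() if key != "raster:bands"}
    new_bands = []
    for band in asset.get("raster:bands", []):
        new_band = {}
        for key, value in band.items():
            if key in COMMON_FIELDS:
                new_band[key] = value
            else:
                new_band[f"raster:{key}"] = value
        new_bands.append(new_band)
    if new_bands:
        migrated[target_field] = new_bands
    return migrated


# Illustrative asset; the href is a placeholder.
asset = {
    "href": "./B01.tif",
    "raster:bands": [
        {"data_type": "uint16", "spatial_resolution": 60, "nodata": 0}
    ],
}

migrated = migrate_raster_bands(asset)
print(migrated)
```

Passing a different target_field makes it easy to try out the alternative names under discussion (layers, groups, and so on) without touching the rest of the logic.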
Changes to the EO extension
The extension will be restructured:
eo:bands gets deprecated and the fields in the Band Object that have not been migrated to the common metadata will be defined "top-level" with an eo: prefix.
The chapter about the "name" will be removed. Most of it has been migrated to the definition in common metadata or is not relevant anymore (e.g., due to the availability of a tile field).
eo:cloud_cover doesn't change its definition, but can be used in a broader scope (before only Item Properties and Assets).
Changes to the extension template
The extension template repository would be updated to allow adding fields to bands in the JSON Schema (by default?) and also list the bands as a potential scope for the fields:
Alternative considered
Keep raster:bands instead of defining a new construct
An alternative that was also discussed in the STAC PSC is that it could be "less breaking" if we keep raster:bands as it is.
The EO extension would be changed as proposed above, but instead of a new construct in common metadata, raster:bands would be extended by other extensions such as EO.
The drawback is that it's specific to raster data although such a concept could also be useful in other domains.
Also, the general definition would be a bit more complex as other extensions would extend an extension.
STAC is meant to be extended, but extensions are not primarily meant to be extended.
We already see this with the "Item Assets Definition" extension where we are in the weird spot that common metadata fields can't be validated directly
in the Item Assets Definition as that would lead to a "circular" schema (Item Assets Definition extends STAC and STAC extends Item Asset Definitions).
The extensions also evolve faster than STAC does and as such more downstream changes in the extensions would be anticipated once the raster extension changes.
Keep everything as it is but define new fields in raster:bands
We could also keep the EO and raster extensions as they are and only add missing fields such as name and description to the Raster Band Object. This would not solve most of the issues explained above, so it was not considered any further.
Implement the proposed changes but directly remove eo:bands and raster:bands
Instead of deprecating eo:bands and raster:bands we could also directly remove them and release new major versions instead of minor versions. While this is very clean, it can lead to downstream issues.
For backward compatibility we assume that many implementations want to keep eo:bands and raster:bands for some time to let people migrate to the new definitions. During that period validators may report these fields as invalid, as the schemas are defined in a way that undefined fields (eo:bands and raster:bands) with the prefix of the extension are reported as invalid.
This could hurt adoption of the new extension versions.
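During such a migration period, a client might have to read both the new construct and the deprecated arrays. The sketch below shows one possible way to do that in Python. It assumes the new field is named bands (still open in this RFC), and the helper read_bands is hypothetical; the fallback is a naive index-based merge that does not rename or prefix any fields.

```python
def read_bands(asset, new_field="bands"):
    """Return band objects from an asset, preferring the new construct.

    Hypothetical helper: if `new_field` is present it is returned as a list.
    Otherwise the deprecated eo:bands and raster:bands arrays are merged by
    array index, with eo:bands values winning on key conflicts.
    """
    if new_field in asset:
        return list(asset[new_field])

    eo_bands = asset.get("eo:bands", [])
    raster_bands = asset.get("raster:bands", [])
    merged = []
    for i in range(max(len(eo_bands), len(raster_bands))):
        band = {}
        if i < len(raster_bands):
            band.update(raster_bands[i])
        if i < len(eo_bands):
            band.update(eo_bands[i])
        merged.append(band)
    return merged


# Legacy asset using the deprecated fields.
legacy_asset = {
    "eo:bands": [{"name": "B01", "common_name": "coastal"}],
    "raster:bands": [{"data_type": "uint16", "nodata": 0}],
}

# Asset that already uses the new construct.
new_asset = {"bands": [{"name": "B01", "data_type": "uint16"}]}

print(read_bands(legacy_asset))
print(read_bands(new_asset))
```

The fallback inherits the index-pairing weakness of the legacy fields, so it is only reasonable for assets where both arrays describe the same bands in the same order.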
Open questions
[…] raster:scale, raster:offset and raster:histogram may also be relevant for non-rasters. This would result in a very slim raster extension, which only contains the fields raster:spatial_resolution, raster:sampling and raster:bits_per_sample.
Procedure
We'd like to get feedback from the STAC community and are open to any feedback. Please provide feedback preferably in this thread or alternatively directly to the STAC PSC. The RFC is open at least until the end of March.
Disclaimer: This RFC was written by me with funding from Planet Labs. An earlier version of this RFC was discussed in the STAC PSC and a STAC community meeting.