Protocol Buffer version of STAC for use with gRPC #575
Comments
This seems like great work, although I'm not very familiar with gRPC. I have a question regarding the Protobuf field names of the extensions: we have an EO and a SAR extension, and both have a field bands with a different definition. The Protobuf field name is bands. Would it be a problem if its name is also bands for SAR? Or would it make more sense to rename the Protobuf field names to include the prefix? For example,
Thanks @m-mohr! After FOSS4G I'll make a PR to add it to the Third Party Vendor Extensions and another PR to add it to the implementations page. With the current protobuf definition I've submitted we have the following:

    c_eo = stac_item.eo.constellation
    c_sar = stac_item.sar.constellation
    b_eo = stac_item.eo.bands
    b_sar = stac_item.sar.bands

If we changed the names to include the underscore prefix, it would look a bit redundant, as you can see below:

    c_eo = stac_item.eo.eo_constellation
    c_sar = stac_item.sar.sar_constellation
    b_eo = stac_item.eo.eo_bands
    b_sar = stac_item.sar.sar_bands
Thank you for clarifying, I didn't catch that there's scoping between the two dots. Now your approach makes a lot of sense.
+1 - great work @davidraleigh. And definitely agree that adding it to Extensions makes good sense, starting as a third party extension that could evolve to be included in the STAC extension repo (we have some work to figure out exactly how we sort out where we put and group extensions).
/sub
Submitting an issue to ask for input on the included Protocol Buffer definitions that attempt to match JSON STAC. If some of you could review the tables below and give input as to whether this is an acceptable STAC-like implementation, that would be great. I'd love to eventually fold Protobuf STAC into the stac-spec or have it be a community accepted project. Please let me know what I can do to make that happen.
Why gRPC and Protobufs?
gRPC is a high-performance microservice RPC framework that allows bi-directional streaming and uses a compact data format. Protobuf is the standard compact message format for gRPC. Protobuf and gRPC are open source Cloud Native Computing Foundation projects. They were originally open sourced by Google, which has used them since 2003. At this time Google executes tens of billions of RPC messages a second with gRPC and Protobuf, so you can rest assured it's stable.
The repo that holds the proto IDL files and their generated code is here:
https://github.com/geo-grpc/api
Some documentation generated from the proto files can be found here:
https://geo-grpc.github.io/api/#epl%2fprotobuf%2fstac.proto
A Python Client can be found here:
https://github.com/nearspacelabs/stac-client-python
There are some limitations in how you define Protocol Buffers that prevent a one-to-one match with STAC. Please look at the tables for differences and the lists of explanations beneath each table. The most significant departure is that Properties would be reserved for user-defined data that is outside of the STAC specification, and the data defined by the STAC specification would exist directly on the StacItem definition. Other differences include the use of a GeometryData protobuf and a preference for enums wherever possible.
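To make that departure a bit more concrete, here is a minimal Python sketch of the rearrangement, using plain dicts rather than the generated protobuf classes. The helper name split_properties, the prefix list, and the sample values are all made up for illustration and are not part of the geo-grpc API; the only things taken from the proposal are the ideas that extension fields are scoped under their extension name, that datetime sits directly on the item, and that everything outside the spec stays in properties.

```python
from typing import Any, Dict

# Assumption: extension fields appear in JSON STAC properties as "<prefix>:<name>".
EXTENSION_PREFIXES = ("eo", "sar")
# Assumption: core fields that would sit directly on the item instead of in properties.
CORE_FIELDS = ("datetime",)


def split_properties(json_item: Dict[str, Any]) -> Dict[str, Any]:
    """Rearrange a JSON STAC item dict into a StacItem-like layout (illustrative only)."""
    item: Dict[str, Any] = {"id": json_item["id"], "properties": {}}
    for key, value in json_item.get("properties", {}).items():
        prefix, sep, name = key.partition(":")
        if sep and prefix in EXTENSION_PREFIXES:
            # "eo:bands" becomes item["eo"]["bands"]; the extension scopes the field name.
            item.setdefault(prefix, {})[name] = value
        elif key in CORE_FIELDS:
            # Spec-defined core fields move up to the item itself.
            item[key] = value
        else:
            # Anything the spec does not define stays in properties.
            item["properties"][key] = value
    return item


# Made-up sample item; the values are not taken from any real catalog.
example = {
    "id": "scene-1",
    "properties": {
        "datetime": "2019-08-01T00:00:00Z",
        "eo:cloud_cover": 12.5,
        "eo:bands": [{"name": "B1"}],
        "sar:bands": [{"name": "HH"}],
        "acme:internal_id": 42,
    },
}
flat = split_properties(example)
```

In this layout flat["eo"]["bands"] and flat["sar"]["bands"] coexist without a prefix on the field name, and only the vendor-specific acme:internal_id key is left in flat["properties"].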
STAC Item Comparison
For Comparison, here is the JSON STAC item field summary and the Protobuf STAC item field summary. Below is a table comparing the two:
List of Item Spec differences and explanations:
- type field isn't implemented as this is GeoJSON specific.
- geometry field uses GeometryData instead of GeoJSON. This choice is about message size and about using other projections besides WGS84.
  - The GeometryData protobuf container for geometry information allows the user to define the geometry vertex information using wkt, wkb, esrishape or geojson['geometry'].
  - GeometryData has a SpatialReferenceData field that allows us to define the projection.
- bbox field is defined using EnvelopeData with xmin, xmax, etc.
  - EnvelopeData has a SpatialReferenceData field that allows us to define the projection.
- properties: This is the trickiest departure from JSON. As described above, properties would be reserved for user-defined data that is outside of the STAC specification. This would be done using google.protobuf.Any for packing non-specification data into the message.
- links: This could be implemented. We just haven't found it useful with protobuf and gRPC at this time (there are no links in gRPC, just Remote Procedure Calls).
- assets: Implemented using the proto3 map; it shows up in the documentation as StacItem.AssetsEntry.
- datetime is defined using the google.protobuf.Timestamp type. This follows the recommendation from Google for matching JSON.
- observed, processed, and updated are all fields outside of the STAC specification, but I've found them enormously useful. In our case, our StacItem objects duplicate the observed Timestamp value in the datetime field in order to stay compliant with STAC. Maybe we need to make an additional issue requesting these be optional reserved fields for the item-spec.
- eo and sar: here is the other effect of Protobuf not being a flexible hash map like JSON. Since properties has to be reserved for user-defined Protobuf definitions, the extensions like eo, sar, datetime_range, etc. must be defined within the Protobuf StacItem spec. An unused message adds almost nothing to the total message size, so including these definitions at the StacItem level doesn't cause any kind of memory bloat. Deciding when an extension should be included might be tricky.
- landsat: there isn't a landsat extension yet, so maybe this should be a separate issue.

Eo Comparison
For Comparison, here is the JSON STAC Electro Optical field summary and the Protobuf STAC Electro Optical field summary. Below is a table comparing the two:
List of Eo Spec differences and explanations:
- gsd, cloud_cover, off_nadir, azimuth, sun_azimuth and sun_elevation (i.e. all fields that have a JSON data type of number) are all set to type google.protobuf.FloatValue (from wrappers.proto) in Protobuf.
- platform, instrument and constellation are all strings in the JSON data type definition, but in Protobuf they're defined as enums. This might be problematic for someone wanting to put their own instrument data in the Eo object. The reason for choosing enums is to avoid the conflicts and confusion that come about from string definitions. It also allows for more compact storage and querying in a database.
- bands: at this time we don't have a working implementation that uses the bands field.

Asset Comparison
- title isn't used in any of our implementations.
- asset_type is an enum alternative to the string-based type field. We prefer enums to strings in all possible cases.
- cloud_platform and bucket_region are useful for minimizing egress costs.
- bucket and object_path are useful for some of our streaming and FUSE cases.
- requester_pays is an important piece of information for managing costs.

Catalogs and some other features of STAC have not been implemented. The query language for gRPC can be seen in the StacRequest overview (https://geo-grpc.github.io/api/#epl.protobuf.StacRequest). Examples of queries can be seen in the Python client: https://github.com/nearspacelabs/stac-client-python#queries
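As a rough illustration of how the asset fields listed above could help with cost management, here is a small stdlib-only Python sketch. The Asset dataclass, the CloudPlatform enum values, and the cheap_assets helper are hypothetical stand-ins written for this example; only the field names cloud_platform, bucket_region, bucket, object_path and requester_pays come from the proposal, and the real definitions live in the proto files.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List


class CloudPlatform(IntEnum):
    # Hypothetical values for illustration; not copied from the proto definitions.
    UNKNOWN_CLOUD_PLATFORM = 0
    AWS = 1
    GCP = 2


@dataclass
class Asset:
    # Field names follow the proposal; the class itself is a stand-in.
    cloud_platform: CloudPlatform
    bucket_region: str
    bucket: str
    object_path: str
    requester_pays: bool = False


def cheap_assets(assets: Dict[str, Asset], platform: CloudPlatform, region: str) -> List[str]:
    """Return the keys of assets on the given platform and region that are not requester-pays."""
    return sorted(
        key
        for key, asset in assets.items()
        if asset.cloud_platform == platform
        and asset.bucket_region == region
        and not asset.requester_pays
    )


# Made-up assets keyed by name, mirroring the map-style assets field.
assets = {
    "thumbnail": Asset(CloudPlatform.AWS, "us-west-2", "example-bucket", "scene-1/thumb.png"),
    "analytic": Asset(CloudPlatform.GCP, "us-central1", "example-bucket", "scene-1/analytic.tif"),
    "raw": Asset(CloudPlatform.AWS, "us-west-2", "example-bucket", "scene-1/raw.tif", requester_pays=True),
}
local = cheap_assets(assets, CloudPlatform.AWS, "us-west-2")
```

Comparing enum members instead of free-form strings means a misspelled platform name fails loudly when the code is written, rather than silently matching nothing at query time.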
Thank you for reading some or all of this!