Create encoding-specific conformance classes #166

Open
jerstlouis opened this issue Apr 27, 2022 · 21 comments

@jerstlouis
Member

jerstlouis commented Apr 27, 2022

This would follow the approach used in OGC API - Features, Tiles, Maps...

Proposed conformance classes:

  • CIS JSON (+RDF to include JSON-LD @context ?)

  • netCDF

  • GeoTIFF

  • CoverageJSON

  • LAS/LAZ?

  • Zarr?

  • JPEG2000 with GML/JP2?

  • JPEG XL?? (eventually: how to add GML/JP2 or JSON in JPEG XL?)

  • PNG? (to populate GeoPackage Gridded Coverage Extension) -- with factor and offset query parameters

  • Cloud native conformance class indicating a cloud optimized profile and support for HTTP Range (+ presence of overviews?) COG #93

@pebau
Contributor

pebau commented Apr 27, 2022

@jerstlouis The right hierarchy would be

  • CIS
    • JSON
    • RDF
    • netCDF
    • GeoTIFF
    • etc.
  • CoverageJSON

@jerstlouis jerstlouis mentioned this issue Apr 27, 2022
@jerstlouis
Member Author

jerstlouis commented Apr 27, 2022

Thanks @pebau, as discussed last week, the native netCDF and GeoTIFF physical encodings and logical models do not conform to the CIS logical model.

The

OGC® GML Application Schema - Coverages - GeoTIFF Coverage Encoding Profile

and

CF-netCDF 3.0 encoding using GML Coverage Application Schema

used together with WCS are a different story, rooted in legacy GMLCOV, which we want to avoid in OGC API - Coverages.

The GML Application Schema - Coverages - GeoTIFF Coverage Encoding Profile specifies a way for GMLCOV to reference external GeoTIFF files transferred separately (e.g., using multi-part encoding).

The CF-netCDF 3.0 encoding using GML Coverage Application Schema specifies how to map a CF-netCDF datastore to a GMLCOV output.

Both involve transferring the coverage primarily as GMLCOV (aka CIS 1.0). They are NOT netCDF and GeoTIFF encodings of the coverage.
So proper netCDF and GeoTIFF encodings do not fall under CIS.

We discussed this extensively last week, and several times before as well.

We discussed RDF again today, and we could potentially include CIS RDF as an additional conformance class (or maybe as part of the same conformance class, since it's just mostly the same with extra context links), but we would need a dedicated media type for it. Since CIS Requirement 38 explicitly calls for JSON-LD, perhaps it could be done together with defining a proper media type for CIS JSON as an additional parameter to the CIS JSON media type? That is the topic of opengeospatial/coverage-implementation-schema#19 .

For now we could potentially use application/ld+json to mean "CIS JSON" + RDF.
If I understand the examples correctly, the encoding is the same as CIS JSON plus the @context links.

Today we discussed the .n3 encoding and there was some confusion as to whether it is the same data encoded in a different way, or whether it provides some context. It seems to me that it might be the latter, since, e.g., I could not find the same rangeset encoded in 20_3D_height.n3 as in 20_3D_height.json.

@joanma747 clarifications would be most welcome.

@m-mohr

m-mohr commented Jul 3, 2023

Shouldn't this be extensible? If you prescribe allowed options, what if someone defines COG 2.0?

@jerstlouis
Member Author

@m-mohr Yes, it is always fully extensible.

As with OGC API - Features, Tiles, EDR and others, even though some conformance classes are defined in the standard itself to facilitate declaring support for common formats, the standard is clear that an HTTP Accept header can always be used to negotiate any other format (and if necessary, a future extension/part to the standard could define a conformance class for those).
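As a rough illustration of the negotiation described above, a client could turn the list of formats it can actually read into an Accept header with decreasing preference weights. This is only a sketch: the function name and the weighting scheme are assumptions made for illustration, and the media type strings are examples rather than values taken from the standard.

```python
def build_accept_header(preferred_media_types):
    """Build an HTTP Accept header value from media types ordered by preference."""
    parts = []
    for rank, media_type in enumerate(preferred_media_types):
        # The first entry keeps the implicit q=1; later entries get
        # decreasing weights, never dropping below 0.1.
        q = max(1.0 - 0.1 * rank, 0.1)
        parts.append(media_type if rank == 0 else f"{media_type};q={q:.1f}")
    return ", ".join(parts)


# Example media types only; a real client would list what it can decode.
header = build_accept_header(["application/netcdf", "image/tiff; application=geotiff"])
```

A server that supports neither format can still answer with 406 or fall back to its default, so the client should check the Content-Type of the response.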

@m-mohr

m-mohr commented Jul 3, 2023

Great to hear. If I can request all file formats through media types (as long as there is one), why do they need conformance classes?

@jerstlouis
Member Author

jerstlouis commented Jul 3, 2023

@m-mohr For the following reasons:

  • Following in the footsteps of Features which had GeoJSON, GML, etc. as conformance classes
  • To provide a list of common formats, of which several implementations will support at least one
  • Potentially as a way to procure (or test from the ETS) an implementation that supports at least one or more of those common formats
  • As an easy way for clients to know beforehand which formats are available (at least for those common formats) -- related issue: How to do links with content negotiation: Adding "types" to "link" ogcapi-common#160

@jerstlouis
Member Author

An initial draft of the encoding-specific requirements classes has been merged with #172.

@m-mohr

m-mohr commented Sep 6, 2023

So how do I detect a unique list of all supported file formats? I'm implementing a client and it must let the user choose the file format they want to load.

As far as I understand:

  1. Get list of conformance classes and filter the file format specific ones out. Q: How do I know which ones are file formats if there can always be non-OGC conformance classes, too?
  2. Get all links with a specific rel type and list the media types. Q1: What happens if I want to expose STAC JSON and Records JSON, which both currently just use the application/json media type? Q2: What is meant to happen if I find a link without a type property?

I guess the relation types in links take preference over the conformance classes in such a case? Or should they be merged? If yes, is there a way to map conformance classes to media types?

@jerstlouis
Member Author

jerstlouis commented Sep 6, 2023

@m-mohr Tough questions ;) But they apply equally to all OGC API standards, so something to also discuss in Common and/or with the Cross-SWG issue discussions. But trying to answer each point here:

Q: How do I know which ones are file formats if there can always be non-OGC conformance classes, too?

Not possible for those non-OGC conformance classes in a generic way. Clients that support specific formats can easily identify conformance classes that they recognize whether part of the standard or coming from somewhere else. The reason to include these conformance classes in the first place is to at least have a standard way to identify common formats expected to be supported by several implementations, so at least all clients can recognize these. Generically recognizing conformance classes outside of the standard is not possible.

Q1: What happens if I want to expose STAC JSON and Records JSON, which both currently just use the application/json media type?

On one hand, this could be the general problem that application/json is really not an encoding implying a particular logical model: like XML, it's a generic transport for arbitrary logical models, and negotiating a particular logical model/physical encoding really needs more information, either using a different media type or with negotiation by profile (opengeospatial/ogcapi-common#8). In this specific case of STAC vs. Records, though, it may be related to the question of whether STAC is a profile of Records (has that been resolved in opengeospatial/ogcapi-records#178?). Is this related to opengeospatial/ogcapi-records#276? I am not clear how any of this relates to Coverages, though. Is this about /items?

Q2: What is meant to happen if I find a link without a type property?

If a link does not include the type property, then the resource likely supports multiple (possibly several) representations, and the client should use an Accept: header with those representations that it supports. If this is /coverage and the implementation lists those encoding requirements classes in /conformance, then the implementation supports all of these. The client should assume that the server does not support those encodings if the requirements classes defined in the standard are not listed.
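A minimal sketch of that client behaviour, assuming links are parsed into dictionaries with an optional "type" key and that the client keeps a list of the media types it can read (the function name and the media type strings are illustrative only):

```python
def accept_for_link(link, client_formats):
    """Return the Accept header value to send for a link, or None if unusable."""
    link_type = link.get("type")
    if link_type is None:
        # No type: the resource may have several representations,
        # so offer everything the client is able to read.
        return ", ".join(client_formats)
    # Typed link: only usable if the client can read that representation.
    return link_type if link_type in client_formats else None


client_formats = ["application/netcdf", "image/tiff"]
untyped = accept_for_link({"href": "/coverage"}, client_formats)
```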

I guess the relation types in links take preference over the conformance classes in such a case? Or should they be merged?

Do you mean the type and not the rel here? The type specifies the type for a particular link (e.g., /coverage?f=coveragejson), but there should either be no type, or a link of each type listed in the conformance classes. The two should be consistent. We also explicitly define the /collections/{collectionId}/coverage resource so a client could just go straight there and use Accept: with one of the encodings listed in conformance classes (or the list of all encodings it supports, hoping one will be supported by the server).

If yes, is there a way to map conformance classes to media types?

There is a table in the latest draft that does exactly this:

https://github.com/opengeospatial/ogcapi-coverages/blob/master/standard/clause_14_encodings.adoc#media-types
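On the client side, such a table could be kept as a simple lookup. The sketch below is hedged accordingly: the conformance class URIs are placeholders and are NOT copied from the draft, and the media type strings are commonly used values that should be checked against the table linked above.

```python
# Placeholder conformance class URIs (hypothetical, not taken from the draft)
# mapped to commonly used media types (to be verified against the draft's table).
ENCODING_CLASSES = {
    "https://example.org/conf/geotiff": "image/tiff; application=geotiff",
    "https://example.org/conf/netcdf": "application/netcdf",
    "https://example.org/conf/coveragejson": "application/prs.coverage+json",
}


def media_types_for(conforms_to, table=ENCODING_CLASSES):
    """Return the media types of the recognized conformance classes, in declared order."""
    return [table[uri] for uri in conforms_to if uri in table]


# "conforms_to" stands for the list of URIs returned by /conformance.
declared = ["https://example.org/conf/netcdf", "https://example.org/conf/core"]
recognized = media_types_for(declared)
```

Conformance classes that are missing from the table are silently skipped, which matches the limitation discussed above: a client can only map the classes it already knows about.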

NOTE: I believe all of this is consistent across all OGC API data access standards, including published Features, EDR and Tiles, as well as upcoming Maps, DGGS, 3D GeoVolumes... Perhaps we should add clarifications regarding all this in Common - Part 2: Geospatial data that these all depend on.

@m-mohr

m-mohr commented Sep 6, 2023

Not possible for those non-OGC conformance classes in a generic way.

Okay, so this is not a reliable way to generate a list of supported file formats. Let's look at the other ones then (i.e. the links).

On one hand this could be the general problem that application/json is really not an encoding implying a particular logical model, like XML [...] I am not clear how any of this relates to Coverages though ? Is this about /items?

These were just meant as examples to illustrate the issue. But in principle this can occur for all media types. If I want to support two different flavors of Zarr or netCDF, basically the same applies. I just think that content negotiation by media types (IANA is in principle relatively strict with "additions" to media types) might get problematic/ambiguous.

In this specific case of STAC vs. Records though it may be related to the question of whether STAC is a profile of records ( has that been resolved in opengeospatial/ogcapi-records#178 ?).

STAC will not be a profile of Records in the first iteration, but might be in the future. It is closely aligned but not fully.

Is this related to opengeospatial/ogcapi-records#276 ?

Yes.

If a link does not include the type property, then the resource likely supports multiple (possibly several) representations, and the client should use an Accept: header with those representations that it supports. If this is /coverage and the implementation lists those encoding requirements classes in /conformance, then the implementation supports all of these. The client should assume that the server does not support those encodings if the requirements classes defined in the standard are not listed.

So if I write a client such as the GDC Web Editor, which leaves the choice to the user and in principle supports all media types, what shall I do? I want the user to choose from all supported file formats.

Should I send a HEAD .../coverage request and send header Accept: */*, but what do I get back then? The Content Type is usually just a single value, not a list, right?

Do you mean the type and not the rel here? The type specifies the type for a particular link (e.g., /coverage?f=coveragejson), but there should either be no type, or a link of each type listed in the conformance classes. The two should be consistent.

I meant if I have a list of supported file formats from the conformance classes and one from the type in the coverage links, which list should I show to users in the GDC Web Editor? I assumed the links have preference, although I can't show a list if there's no type set (see above).

https://github.com/opengeospatial/ogcapi-coverages/blob/master/standard/clause_14_encodings.adoc#media-types

Good to know, thanks.

@m-mohr

m-mohr commented Sep 6, 2023

@jerstlouis Copied from another GitLab issue to have everything in one place:

Regarding the media type when no type is included, it should be considered multiple, with all of the listed coverage encodings supported as listed in /conformance.

Above, you've said I can't reliably read that from the conformance classes. So it looks like there is no definitive way to get a list of supported file formats right now, right? This should be fixed, I think.

@jerstlouis
Member Author

Let's look at the other ones then (i.e. the links).

That's not a reliable way either because several implementations may opt to have a single /coverage link without a type.

@m-mohr

m-mohr commented Sep 6, 2023

I didn't say that it's reliable, I just wanted to link to the discussion of the other alternative below. In fact, I added in the other post:

So it looks like there is no definitive way to get a list of supported file formats right now, right? This should be fixed, I think.

Otherwise, you/I can't write a user-friendly client.

@jerstlouis
Member Author

So if I write a client such as the GDC Web Editor, which leaves the choice to the user and in principle supports all media types, what shall I do? I want the user to choose from all supported file formats.

Should I send a HEAD .../coverage request and send header Accept: */*, but what do I get back then? The Content Type is usually just a single value, not a list, right?

I think this is related to our earlier discussion about at which point you care about which media type to use.
In the end, whether in a particular hop of a workflow, or in the end-user client application, something will need to visualize or analyze that coverage data, and this will support a specific list of formats.

This is where the Accept: ... list should ultimately come from, and would never be */*.
In the case of the GDC editor, is there a need to ask the user to pick one?
It may be the list of the formats that the browser supports, or a predetermined list of formats that a particular GDC profile requires to be implemented.

Otherwise, you/I can't write a user-friendly client.

Just as you said earlier, possibly in a slightly different context:

the user although they probably don't care which file format is used internally to retrieve the data. They just care what comes out at the end... :-)

I don't agree with your point that the user needs to specify the file format for the coverage request.

Ideally, I think a user-friendly client should not require the user to deal with formats :)

I meant if I have a list of supported file formats from the conformance classes and one from the type in the coverage links, which list should I show to users in the GDC Web Editor? I assumed the links have preference, although I can't show a list if there's no type set (see above).

I think the list of supported formats to show would be all of those from the conformance classes, plus any additional conformance classes not in the standard that the client is aware of, plus any type ones that are not already mapped to any of these.

So it looks like there is no definitive way to get a list of supported file formats right now, right? This should be fixed, I think.

Nothing better than the above, but this applies exactly the same to all OGC API data access standards, including Features.

@m-mohr

m-mohr commented Sep 6, 2023

Ideally, I think a user-friendly client should not require the user to deal with formats :)

No! As a user I want to choose whether my end result is a COG or a netCDF.
In between the steps (internally) I don't care, but the end result I want to choose. And to make an informed decision I need to know upfront which file formats are available. There's no reliable way for this in OGC APIs it seems, so it should be added. Where should such a feature be requested? Here? In Commons?

@jerstlouis
Member Author

jerstlouis commented Sep 6, 2023

As a user I want to choose whether my end result is a COG or a netCDF.

In a connected environment the end-user should not need to care and just see results or get an answer to their question, but this is getting philosophical :) Of course the need for intermediate exports of data still persists, so this is a use case to address.

And to make an informed decision I need to know upfront which file formats are available

As stated above the client can get this right now:

  • all of those from the conformance classes,
  • plus any additional conformance classes not in the standard that the client is aware of,
  • plus any type ones that are not already mapped to any of these.

This may not seem ideal, but it's not that bad in practice.
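The three sources listed above can be combined roughly as follows. This is a sketch under stated assumptions: `known_classes` is a hypothetical client-side dictionary covering both the conformance classes from the standard and any extra ones the client knows, `rel` is whatever link relation the client treats as pointing to the coverage, and all URIs and media types in the example are placeholders.

```python
def supported_formats(conforms_to, links, known_classes, rel):
    """Merge formats from recognized conformance classes and typed coverage links."""
    formats = []
    # Sources 1 and 2: conformance classes the client recognizes
    # (standard ones plus any additional ones it is aware of).
    for uri in conforms_to:
        media_type = known_classes.get(uri)
        if media_type and media_type not in formats:
            formats.append(media_type)
    # Source 3: link types not already covered by a conformance class.
    for link in links:
        media_type = link.get("type")
        if link.get("rel") == rel and media_type and media_type not in formats:
            formats.append(media_type)
    return formats


# Placeholder inputs for illustration only.
known = {"urn:example:conf:netcdf": "application/netcdf"}
conforms_to = ["urn:example:conf:netcdf", "urn:example:conf:unknown"]
links = [
    {"rel": "coverage", "type": "application/netcdf"},
    {"rel": "coverage", "type": "image/png"},
    {"rel": "self", "type": "application/json"},
    {"rel": "coverage"},
]
formats = supported_formats(conforms_to, links, known, rel="coverage")
```

The remaining gap is visible in the example: the unknown conformance class contributes nothing, and the untyped link adds no information either.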

If OGC API - Common - Part 1 is supported, and the client supports dynamically parsing the particular API definition language used to describe the deployed implementation (e.g., OpenAPI), then of course the client can also look at the media types listed for the responses of each path.
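For an OpenAPI 3.x definition already loaded into a dictionary, that lookup could be sketched like this. The function name is made up for illustration, and the sketch assumes the responses are defined inline: `$ref` entries would have to be resolved first, which is not handled here.

```python
def response_media_types(openapi_doc, path, method="get"):
    """Collect media types declared for successful (2xx) responses of one operation."""
    operation = openapi_doc.get("paths", {}).get(path, {}).get(method, {})
    media_types = []
    for status, response in operation.get("responses", {}).items():
        # Only successful responses describe the available representations.
        if not str(status).startswith("2"):
            continue
        for media_type in response.get("content", {}):
            if media_type not in media_types:
                media_types.append(media_type)
    return media_types


# Tiny hand-written API definition fragment, for illustration only.
api = {
    "paths": {
        "/collections/{collectionId}/coverage": {
            "get": {
                "responses": {
                    "200": {"content": {"application/netcdf": {}, "image/tiff": {}}},
                    "404": {"content": {"application/json": {}}},
                }
            }
        }
    }
}
found = response_media_types(api, "/collections/{collectionId}/coverage")
```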

Where should such a feature be requested? Here? In Commons?

The issue opengeospatial/ogcapi-common#160 that I linked to on GitLab is the central place for this discussion, I believe. Perhaps optional requirement classes to define requirements for HEAD and/or OPTIONS and/or something similar to the list of media types available in openEO could be a solution? The list of media types would need to be resource type-specific.

@m-mohr

m-mohr commented Sep 6, 2023

This may not seem ideal

It is not ideal, it is not reliable and I can't build a client on top of it.

something similar to the list of media types available in openEO could be a solution? The list of media types would need to be resource type-specific.

openEO defines the file formats globally for input and output respectively. We don't really have different resource types (except for a broad distinction between raster, vector, table and "other").

opengeospatial/ogcapi-common#160

Thanks, added a short comment there, linking back to this discussion.

@jerstlouis
Member Author

jerstlouis commented Sep 6, 2023

@m-mohr

it is not reliable and I can't build a client on top of it.

The above strategy (standard/client-recognized conformance classes + type) is fully reliable (with the only exception of media types not part of the standard conformance classes and of which the client is not aware, when also not using "type"), and I think a client can detect / offer that choice, though it might be a bit of work, so it is not clear what you mean by "I can't build a client on top of it".

Are you referring specifically to that exception, do you mean it is significantly more work than it should be, or is there really a technical limitation that I am missing?

We don't really have different resource types (except for a broad distinction between raster, vector, table and "other").

For OGC APIs, strictly for "data" we at least have coverages, maps, feature collections, individual features, vector tiles, coverage tiles, map tiles.

Then there are all the metadata resources like collection lists / collection descriptions, process lists / process descriptions...

@cnreediii
Contributor

@m-mohr I concur with your statement that the user should be able to 1) get a list of available formats and 2) choose the one they want. Scanning conformance classes to get this information seems overly complex :-)

@jerstlouis
Member Author

Feb 24 code sprint: do we need a separate conformance class with requirements for GML/JP2 ?

@m-mohr

m-mohr commented Nov 12, 2024

Here are some more that eventually may need conformance classes: https://gdal.org/en/latest/drivers/raster/index.html
And that's why this doesn't scale well and there needs to be more than just conformance classes for file format discovery.
