Task: unify odc.stac.transform with odc.stac._eo3 #355
Comments
Unless random UUIDs are requested generate the same UUID for STAC items with the same id from the same collection.
EO3 uses some pre-STAC 1.0 properties to look up some keys; this is particularly important for start/end dates.
Progress so far: UUID generation is now deterministic and sufficiently configurable. Default UUID resolution goes like this
Property name remapping to match still to do
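For illustration, the remaining remapping step could look roughly like the sketch below. The function name `remap_properties` and the mapping table are hypothetical, and the `dtr:`-prefixed target names are an assumption about which older keys EO3 metadata looks for; check them against the EO3 metadata definition actually in use.

```python
from typing import Any, Dict, Mapping

# Assumed mapping from STAC 1.0 property names to the older names that
# EO3 metadata is believed to look up. Not taken from the odc-stac source.
STAC_TO_EO3_RENAMES: Dict[str, str] = {
    "start_datetime": "dtr:start_datetime",
    "end_datetime": "dtr:end_datetime",
}


def remap_properties(
    props: Mapping[str, Any],
    renames: Mapping[str, str] = STAC_TO_EO3_RENAMES,
) -> Dict[str, Any]:
    """Return a copy of ``props`` with keys renamed according to ``renames``.

    Keys that are not listed in ``renames`` are copied unchanged.
    """
    return {renames.get(key, key): value for key, value in props.items()}


stac_props = {
    "datetime": "2021-06-01T00:00:00Z",
    "start_datetime": "2021-06-01T00:00:00Z",
    "end_datetime": "2021-06-01T00:10:00Z",
}
eo3_props = remap_properties(stac_props)
```

Returning a new dictionary rather than renaming in place leaves the original STAC item untouched, which matters if the same item is reused elsewhere.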
Another missing feature in
Having different deterministic UUIDs will break things for me, yeah. If I've indexed Sentinel-2 or Landsat 8 data, which doesn't come with a UUID already, I'm relying on that being consistent to know if it's already in the DB. What's wrong with the current deterministic ID?
It's not generic enough: it has special rules for specific product names, see odc-tools/libs/stac/odc/stac/transform.py, lines 293 to 303 in 6c7a8bf.
Also I don't think using
I guess we can allow a user-supplied UUID generation function, in here: odc-tools/libs/stac/odc/stac/_eo3.py, lines 478 to 497 in 6c7a8bf.
I guess having a user-defined function for UUID generation is a good workaround. It'll need to be added as a parameter for the dc_tools suite. That can happen later, though. And the impact isn't important, because I can keep running old code from old docker images to index with.
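As a rough sketch of what a user-supplied UUID function might look like: the converter takes a callable and falls back to a default when none is given. Everything here (`item_to_dataset_doc`, `default_uuid`, `legacy_uuid`, the `uuid_fn` parameter) is hypothetical and is not the odc-stac or dc_tools API, and `legacy_uuid` is only a stand-in, not the rule used by the existing code.

```python
import uuid
from typing import Any, Callable, Dict

StacItem = Dict[str, Any]
UuidFn = Callable[[StacItem], uuid.UUID]


def default_uuid(item: StacItem) -> uuid.UUID:
    # Hypothetical default: hash the collection id together with the item id.
    key = f"{item.get('collection', '')}/{item['id']}"
    return uuid.uuid5(uuid.NAMESPACE_URL, key)


def item_to_dataset_doc(item: StacItem, uuid_fn: UuidFn = default_uuid) -> Dict[str, Any]:
    # Hypothetical converter: only the UUID handling is shown, the rest of
    # the EO3 document is reduced to a label and a copy of the properties.
    return {
        "id": str(uuid_fn(item)),
        "label": item["id"],
        "properties": dict(item.get("properties", {})),
    }


def legacy_uuid(item: StacItem) -> uuid.UUID:
    # Stand-in for whichever rule produced the UUIDs already in an index.
    return uuid.uuid5(uuid.NAMESPACE_OID, item["id"])


item = {"id": "S2A_EXAMPLE_ITEM", "collection": "example-collection", "properties": {}}
doc_default = item_to_dataset_doc(item)
doc_legacy = item_to_dataset_doc(item, uuid_fn=legacy_uuid)
```

With a hook like this, someone who has already indexed data under an older rule could keep passing that rule in, while new users get the generic default.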
Code has been moved into apps,
Problem Description
Module odc.stac.transform (previously odc.index.stac) was developed before pystac was available and before STAC 1.0 was finalized. Its purpose is to translate a STAC document to an EO3-compatible Dataset definition document suitable for indexing to datacube.
There are some issues with the current implementation that I would like to address
pystac for better robustness)
Module odc.stac._eo3 does a similar thing, except the goal was to produce Dataset objects suitable for calling dc.load with, rather than a YAML document suitable for indexing. As such it's missing some of the capabilities required by odc.stac.transform, such as deterministic UUID generation, lineage extraction, region code and other metadata massaging.
Sub-tasks
odc.stac._eo3 (currently using random)
.id as is if it contains UUID (DEA datasets)
.id + .collection_id + (optional other fields configured by user per collection) to compute deterministic UUID
odc.stac.transform to use odc.stac.stac2ds, possibly with some further metadata tweaking post conversion (region code, product href, lineage)
Note that deterministic UUIDs have the potential to benefit stac_load as well when used with Dask. Non-remapping of properties is probably a bug, as time ranges are probably broken currently: EO3 metadata looks for old names for end_datetime, start_datetime.
CC: @alexgleith @gadomski
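A minimal sketch of the UUID rule described in the sub-tasks, assuming a UUIDv5 hash is an acceptable way to get determinism. The function name `deterministic_uuid`, the choice of `uuid.NAMESPACE_URL`, and the key layout are all assumptions made for illustration, not what odc.stac implements.

```python
import uuid
from typing import Any, Mapping, Optional


def deterministic_uuid(
    item_id: str,
    collection_id: str,
    extras: Optional[Mapping[str, Any]] = None,
) -> uuid.UUID:
    """Derive a stable UUID for a STAC item.

    If ``item_id`` already parses as a UUID it is used as is. Otherwise the
    UUID is a version 5 hash of the collection id, the item id, and any
    extra fields (sorted by name so that ordering does not matter).
    """
    try:
        return uuid.UUID(item_id)
    except ValueError:
        pass  # not a UUID, fall through to hashing

    parts = [collection_id, item_id]
    for name in sorted(extras or {}):
        parts.append(f"{name}={extras[name]}")
    # NAMESPACE_URL is an arbitrary choice for this sketch.
    return uuid.uuid5(uuid.NAMESPACE_URL, "/".join(parts))


a = deterministic_uuid("S2A_EXAMPLE_ITEM", "example-collection")
b = deterministic_uuid("S2A_EXAMPLE_ITEM", "example-collection")
c = deterministic_uuid("S2A_EXAMPLE_ITEM", "other-collection")
```

Because the result depends only on its inputs, separate Dask workers converting the same item would arrive at the same UUID without coordinating.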