Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes to datacube extension helper #51

Merged
merged 5 commits into from
Feb 26, 2024
Merged

Fixes to datacube extension helper #51

merged 5 commits into from
Feb 26, 2024

Conversation

dchandan
Copy link
Collaborator

  1. Some variables (e.g. time_bnds) do not have variable "attributes" in certain model files. As a result the extension code crashes. 8621b6d fixes that.
  2. As per CF convention bounds variables are not required to have attributes and that they should instead be inferred from the corresponding coordinate variables. 6326908 addresses that.

An example of how the data cube extension looks like with these changes is here.

An outstanding issue is that time_bnds variable is still not being inferred as an auxiliary variable. This will require a hack since this variable cannot be determined to be auxiliary using the framework currently in place (because the variable in the netCDF file will not contain any of the "criteria" that could be put into place for the benefit of the is_coordinate).

@dchandan
Copy link
Collaborator Author

@huard your review would be helpful here

Comment on lines 192 to 196
try:
attrs = meta["attributes"]
except KeyError:
# Some variables like "time_bnds" in some model files do not have any attributes.
attrs = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
try:
attrs = meta["attributes"]
except KeyError:
# Some variables like "time_bnds" in some model files do not have any attributes.
attrs = {}
attrs = meta.get("attributes", {})

Comment on lines +210 to +222
def _infer_variable_units_description(self, name, attrs):
"""Try to infer the units and description of some simple coordinate variables."""
if name == "time_bnds":
related_variable = "time"
attrs["description"] = "bounds for the time coordinate"
elif name == "lat_bnds":
related_variable = "lat"
attrs["description"] = "bounds for the latitude coordinate"
elif name == "lon_bnds":
related_variable = "lon"
attrs["description"] = "bounds for the longitude coordinate"
else:
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you'd be able to harden the bound variables inference using cf-xarray. It looks for the bounds attributes of variables. See https://cfconventions.org/cf-conventions/cf-conventions.html#cell-boundaries

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I start to remember. We didn't want to add cf-xarray as a dependency, right? So we just copied some of the logic here. Either we bring more of the cf logic into the repo, or add is as a dependency ? No strong opinion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then we can keep it cf-array free for now. I think this will be the path of least resistance, and time is a bit of an issue at the moment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, if the fix works for your datasets, all good. I'm just saying don't rewrite cf-xarray if you hit corner cases and need to make that logic more comprehensive.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@david now all the bounds variables are shown as auxiliary

Copy link
Collaborator

@fmigneault fmigneault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huard is in better position to answer regarding these attributes. Will let it up to both of you.

@dchandan dchandan merged commit b3d5e0e into master Feb 26, 2024
7 checks passed
@dchandan dchandan deleted the datacube-fixes branch February 26, 2024 22:57
dchandan added a commit that referenced this pull request Feb 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants