-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAS-1891 - Aggregated time dimensions only include input values. #41
Conversation
CHANGELOG.md
Outdated
@@ -1,3 +1,10 @@ | |||
## v1.1.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was on the fence about the semantic version number to use. This feels like a pretty significant change, but it's not changing the input parameters or anything, only the output. Open to thoughts here, for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I have no vote... I would vote for at least a minor level bump to v1.2.0 since someone using the zarr store would possibly need to change their code (and the change to zarr is a change in behavior, not a bug fix like with most patch version updates).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chris beat me to this, we didn't break the api, but we did change the output so that a user has to do something to adapt their code. +1 for at least a minor. I could be convinced of a major.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks both. I think I might opt for a minor version number change because, while this update does change the nature of the values in the temporal dimension, we're not adding/removing dimensions in the final Zarr store. Does that sound like reasonable/sane logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly nits, but need to decide what version it should be. Nice work.
CHANGELOG.md
Outdated
@@ -1,3 +1,10 @@ | |||
## v1.1.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chris beat me to this, we didn't break the api, but we did change the output so that a user has to do something to adapt their code. +1 for at least a minor. I could be convinced of a major.
``` | ||
return self._get_output_dimension(dimension_name, all_input_values, | ||
output_dimension_units) | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good information, is there a better place for it? Is there a ticket to implement User's choice of aggregation methods later? Not actually sure what I'd want to see so I'm just putting down whatever hits my brain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - this is a good question. Honestly, I'm not sure where else to put this. We don't have a ticket for allowing a user to decide for a continuous output dimension vs just using the input values.
There's the commit history, but I assume that only you or I would know to dig back into that.
Description
This PR updates the way the NetCDF-to-Zarr service performs temporal aggregation. Previously it would ensure the aggregated temporal dimension was regularly gridded. However, some collections have 1 granule per month, and that granule is not even taken at the same time of day, so we were getting very sparse outputs. Additionally, the performance for such collections was prohibitively poor. Instead, the service will now produce a temporal dimension in the aggregated Zarr store that only contains values corresponding to those in the input granules.
Jira Issue ID
DAS-1891
Local Test Steps
Note the test cases in the notebook do not use the PO.DAAC collection with reported issues, as we do not have a mirror of that in EEDTEST. But the behaviour can still be confirmed with GPM_3IMERGHH.
PR Acceptance Checklist
version.txt
andCHANGELOG.md
updated (if publishing a new release).