
Add averages with time dimension removed #236

Merged 25 commits into main on May 23, 2022

Conversation

tomvothecoder (Collaborator) commented May 10, 2022

Description

This PR adds the ability to calculate averages weighted by time with the time dimension removed (via `ds.temporal.average()`). It simply wraps xarray's multi-line function calls into a single API call, with an example below for comparison.

# xarray
month_length = ds.time.dt.days_in_month
weights = (
    month_length.groupby("time.month") / month_length.groupby("time.month").sum()
)
pr_avg_xarray = ds.pr.weighted(weights).mean(dim="time")

# xcdat
pr_avg_xcdat = ds.temporal.average("pr", freq="month")
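As a quick sanity check (a sketch, assuming both snippets above were run on the same dataset with proper time bounds), the two results should agree to within floating-point tolerance:

import xarray as xr

# Hypothetical check: the xcdat one-liner should reproduce the manual
# xarray computation (attribute metadata aside).
xr.testing.assert_allclose(pr_avg_xarray, pr_avg_xcdat["pr"])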

Summary of Changes

  • Add unit tests for temporal.average()
  • Rename previous .average() to .group_average()
  • Refactor many private methods in class TemporalAccessor
    • Rename methods and variables, update docstrings
    • Refactor ._validate_weights() for a significant performance increase of about 4-5x (for the monthly Dataset I was testing on)
  • Remove redundant tests for private methods

Related API docs: https://xcdat.readthedocs.io/en/feature-201-temporal-mean/generated/xarray.Dataset.temporal.average.html#

Floating point closeness comparison with CDAT/cdutil: https://github.com/xCDAT/xcdat_test/blob/pr-236-validation/validation/v1.0.0/temporal_average/qa_cdat_closeness.ipynb

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

- Update docstrings of methods in `TemporalAccessor` class
- Add `DEFAULT_SEASON_CONFIG` to reduce code duplication
- Add comments for sections of code that are opaque
- Rename `_averager()` to `_grouped_average()`
@tomvothecoder self-assigned this May 10, 2022
@tomvothecoder added this to In progress in v0.3.0 via automation May 10, 2022
codecov-commenter commented May 10, 2022

Codecov Report

Merging #236 (8447736) into main (1fdc8a9) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #236   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            9         8    -1     
  Lines          742       723   -19     
=========================================
- Hits           742       723   -19     
Impacted Files     Coverage Δ
xcdat/axis.py      100.00% <ø> (ø)
xcdat/utils.py     100.00% <ø> (ø)
xcdat/__init__.py  100.00% <100.00%> (ø)
xcdat/bounds.py    100.00% <100.00%> (ø)
xcdat/dataset.py   100.00% <100.00%> (ø)
xcdat/spatial.py   100.00% <100.00%> (ø)
xcdat/temporal.py  100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

- Remove `data_var` arg from `_get_weights()`
- Update `_convert_df_to_dt()` to use regular dt objects for mean mode and non-monthly frequencies
- Add `TemporalAccessor` attribute `_dim_name`
- Update docstrings
- Rename `.mean()` to `.average()`
- Rename existing `.average()` to `.group_average()`
- Update "mean" to "average" and "average" to "group_average in `Mode` type alias
- Rename `_grouped_average()` to `_group_average()`
- Rename `_time_grouped` to `_grouped_time`
- Update docstring for `_convert_df_to_dt()`
- Update conditional in `_convert_df_to_dt()`
- Update `_validate_weights()` to get `num_groups` from `_get_weights()` instead of `self._time_grouped`
- Rename classes in `test_temporal.py` to reflect private methods
- Add placeholder tests in `test_temporal.py`
tomvothecoder (Collaborator, Author) left a comment

Initial review comments

(Ten review threads on xcdat/temporal.py, all resolved.)
@chengzhuzhang self-requested a review May 12, 2022 20:22
chengzhuzhang (Collaborator) commented May 12, 2022

Hey Tom, just to follow up on the xcdat meeting that we had. Regarding two points:

  • the new average (or mean, its old name) method returns weighted means for a data variable with the time dimension removed
  • potential API naming inconsistencies: differentiate ds.temporal.average() vs. ds.temporal.mean(), with the former for grouped averages and the latter for collapsing the time dimension.

I'm thinking one possibility to resolve both is to make the new method an option of average()/group_average().
i.e.:

Assume that mean only averages based on the original frequency: for instance, if a target frequency is not specified, the frequency should be the original data frequency (based on the length of the time intervals), and it returns means for the data variable with the time dimension removed. If the original data frequency is monthly or yearly, then the mean is weighted; otherwise, unweighted.

example:
ds_month = ds.temporal.average("ts", freq=None, center_times=False)
or
ds_month = ds.temporal.average("ts", center_times=False)

- Update `_group_time_coords()` to `_label_time_coords()`
- Update `_groupby_freq()` to `_group_data()`
- Extract `_get_dt_components()` from `_group_time_coords()`
- Rename `_process_season_dataframe()` to `_process_season_df()`
- Move `_convert_df_to_dt()` further down the class
- Rename `DATETIME_COMPONENTS` to `TIME_GROUPS`
- Update season frequency in `TIME_GROUPS`
- Add logic to `_get_dt_components()` to handle seasonal frequency
tomvothecoder (Collaborator, Author) commented

> I'm thinking one possibility to resolve both is to make the new method an option of average()/group_average().
> i.e.:
>
> Assume that mean only averages based on the original frequency: for instance, if a target frequency is not specified, the frequency should be the original data frequency (based on the length of the time intervals), and it returns means for the data variable with the time dimension removed. If the original data frequency is monthly or yearly, then the mean is weighted; otherwise, unweighted.

Hi Jill, thanks for the suggestion. On the surface it seems like a relatively simple solution. It would be nice to have a single API that can easily solve the naming inconsistency!

After investigating more closely with a prototype implementation, I unfortunately found that combining these two different averaging operations under a single API makes the API less predictable, because each operation produces a different output based on the freq argument.

For example, an end-user might expect a grouped average output, but they get a time-collapsed average because they don't pass a freq argument (and vice versa). They might not understand why until they dig into the docs (which they should probably reference first before using any API anyways haha). For the developer, we'd have to write more conditional logic based on the method arguments, and more documentation.


My proposed (and current) solution is to have two well-defined APIs that serve different purposes, because I think they are easier to use and can be cleanly implemented.

The names of the APIs/methods explain what they do, while their respective docstrings fill in the gap of how to use them. (A usage sketch follows the list below.)

  1. .average() -- returns averages for a data variable and collapses the time dimension
    • Arguments include data_var, weighted, and center_times
    • I think we should include the weighted boolean arg so that the user explicitly declares whether the output is weighted, rather than handling weighting internally based on the time frequency. It makes the output more predictable because the user knows exactly what it reflects.
  2. .group_average() -- returns grouped averages for a data variable
    • Arguments include data_var, freq, weighted, center_times, and season_config
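A minimal usage sketch of the two proposed APIs (a sketch of the proposal above; the final signatures and defaults may differ, and the file name is hypothetical):

import xcdat

ds = xcdat.open_dataset("pr_Amon_historical.nc")  # hypothetical file

# 1. .average(): collapse the time dimension into a single weighted average.
ds_avg = ds.temporal.average("pr", weighted=True)

# 2. .group_average(): group the time coordinates (e.g., by year and month)
# and average within each group.
ds_grouped = ds.temporal.group_average("pr", freq="month", weighted=True)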

As mentioned before, we will write an API translation table for CDAT users to help smooth the transition process to xCDAT.

Let me know what you think! Also tagging @lee1043 to see if he has any thoughts or ideas.

chengzhuzhang (Collaborator) commented

@tomvothecoder Thank you for the effort to explore the different options! Appreciate it. What you suggested about two APIs actually makes sense to me. Tagging @lee1043 and @pochedls to see if they have some input on this implementation, or perhaps better API names?

lee1043 (Collaborator) commented May 13, 2022

@tomvothecoder @chengzhuzhang thank you for the discussion. I was initially leaning toward what @chengzhuzhang suggested earlier -- a single API on top of two capabilities -- but I am leaning back to the two well-defined separate APIs. I think the point @tomvothecoder raised is very important -- make the API predictable, which I think is one of the keys to being a successful community tool. While we expect users to READ docs first, we know it is often not the case. In my personal experience, I often start directly from some sample code and read the docs when I get stuck. When an API's predictability is great and I know exactly what the API does, I can trust it more. I think group_average is fine, and I am also open to any other suggestions.

- Testing private methods introduces coupling to implementation details. We should be testing public methods, which tests behaviors and outputs
- Use custom dataset fixtures with fewer coordinate points and values other than 1 for easier and more robust testing
- Add tests for `season_config` arg
- Update names of tests
@tomvothecoder changed the title from "Add weighted time averages with time dimension removed" to "Add averages weighted by time with time dimension removed" May 17, 2022
@tomvothecoder added the breaking-change (A breaking change, intended for a major release.) label May 17, 2022
tomvothecoder (Collaborator, Author) left a comment

Hi @lee1043 and @chengzhuzhang, this PR is ready for review.

The PR's description should contain all of the relevant information to make the review process smooth. Thanks!

Comment on lines +1427 to 1440
    def _validate_weights(self, weights: xr.DataArray, num_groups: int):
        """Validates that the sum of the weights for each group equals 1.0.

        Parameters
        ----------
        weights : xr.DataArray
            The weights for the time coordinates.
        num_groups : int
            The number of groups.
        """
        actual_sum = self._group_data(weights).sum().values  # type: ignore
        expected_sum = np.ones(num_groups)
        np.testing.assert_allclose(actual_sum, expected_sum)
tomvothecoder (Collaborator, Author) commented

This refactored implementation of _validate_weights() speeds up weighted averages significantly.
The bottleneck was the call freq_groups = self._groupby_freq(data_var).count() for getting the count of unique grouping labels. Now we just pass num_groups directly to the method from _get_weights().

For the dataset I was testing on with a frequency of "month", _get_weights() is now about 4-5x faster (~10.8-11.6 secs down to 2.8-3.2 secs).
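For intuition, here is a self-contained sketch of the invariant being checked -- each group's weights summing to 1.0 -- using plain xarray on hypothetical monthly data (not xCDAT internals):

import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical two years of monthly data.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
da = xr.DataArray(np.ones(24), coords={"time": time}, dims="time")

# Weight each month by its day count, normalized within its month group.
days = da.time.dt.days_in_month
weights = days.groupby("time.month") / days.groupby("time.month").sum()

# The validation: all 12 month groups should sum to exactly 1.0.
actual_sum = weights.groupby("time.month").sum().values
np.testing.assert_allclose(actual_sum, np.ones(12))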

@@ -1,8 +1,5 @@
from datetime import datetime
tomvothecoder (Collaborator, Author) commented

I removed a lot of the unit tests for private methods in test_temporal.py. I learned that testing private methods is unnecessary and introduces coupling to implementation details, which breaks encapsulation. It makes refactoring significantly more tedious because the tests for the private methods also need to be updated (with the risk of duplicating tests for public methods). Instead, tests should revolve around public methods and their behaviors/outputs.

More info on why testing private methods is considered "bad" practice:
https://softwareengineering.stackexchange.com/a/380296
https://stackoverflow.com/questions/105007/should-i-test-private-methods-or-only-public-ones
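As a contrived example of the principle (plain xarray, hypothetical test name): assert on public outputs rather than on private helpers like _get_weights().

import numpy as np
import xarray as xr

def test_average_collapses_time_and_returns_mean():
    # Exercise public behavior: the mean over time, with the dim removed.
    da = xr.DataArray([1.0, 3.0], coords={"time": [0, 1]}, dims="time")
    result = da.mean(dim="time")

    np.testing.assert_allclose(result.values, 2.0)
    assert "time" not in result.dims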

Comment on lines 161 to 259
    def average(
        self,
        data_var: str,
        freq: Frequency,
        center_times: bool = False,
        season_config: SeasonConfigInput = DEFAULT_SEASON_CONFIG,
    ):
        """
        Returns a Dataset with the time weighted average of a data variable
        and the time dimension removed.

        This method is particularly useful for monthly or yearly time series
        data since the number of days per month can vary based on the calendar
        type, which can affect weighting. For unweighted time averages or other
        frequencies, call xarray's native ``.mean()`` method on the data
        variable instead.

        Weights are calculated by first determining the length of time for
        each coordinate point using the difference of its upper and lower
        bounds. The time lengths are grouped, then each time length is
        divided by the total sum of the time lengths to get the weight of
        each coordinate point.

        Parameters
        ----------
        data_var : str
            The key of the data variable for calculating averages.
        freq : Frequency
            The time frequency for calculating weights.

            * "year": weights by year
            * "season": weights by season
            * "month": weights by month
            * "day": weights by day
            * "hour": weights by hour

        center_times : bool, optional
            If True, center time coordinates using the midpoint between their
            upper and lower bounds. Otherwise, use the provided time
            coordinates. By default False.
        season_config : SeasonConfigInput, optional
            A dictionary for "season" frequency configurations. If configs for
            predefined seasons are passed, configs for custom seasons are
            ignored and vice versa.

            Configs for predefined seasons:

            * "dec_mode" (Literal["DJF", "JFD"], by default "DJF")
               The mode for the season that includes December.

                * "DJF": season includes the previous year's December.
                * "JFD": season includes the same year's December.
                  Xarray labels the season with December as "DJF", but it is
                  actually "JFD".

            * "drop_incomplete_djf" (bool, by default False)
               If the "dec_mode" is "DJF", this flag drops (True) or keeps
               (False) time coordinates that fall under incomplete DJF seasons.
               Incomplete DJF seasons include the start year's Jan/Feb and the
               end year's Dec.

            Configs for custom seasons:

            * "custom_seasons" ([List[List[str]]], by default None)
               List of sublists containing month strings, with each sublist
               representing a custom season.

                * Month strings must be in the three letter format (e.g., 'Jan')
                * Each month must be included once in a custom season
                * Order of the months in each custom season does not matter
                * Custom seasons can vary in length

               >>> # Example of custom seasons in a three month format:
               >>> custom_seasons = [
               >>>     ["Jan", "Feb", "Mar"],  # "JanFebMar"
               >>>     ["Apr", "May", "Jun"],  # "AprMayJun"
               >>>     ["Jul", "Aug", "Sep"],  # "JulAugSep"
               >>>     ["Oct", "Nov", "Dec"],  # "OctNovDec"
               >>> ]

        Returns
        -------
        xr.Dataset
            Dataset with the time weighted average of the data variable and the
            time dimension removed.

        Examples
        --------

        Get weighted averages for a monthly time series data variable:

        >>> ds_month = ds.temporal.average("ts", freq="month", center_times=False)
        >>> ds_month.ts
        """
        return self._averager(
            data_var, "average", freq, True, center_times, season_config
        )

tomvothecoder (Collaborator, Author) commented

The new average() method.

A collaborator commented

I wonder if freq can be assigned automatically with a new parameter if weighted=True. Having freq here seems like it could be confused with group_average().

tomvothecoder (Collaborator, Author) commented
@chengzhuzhang won't be able to review this PR this week. @lee1043, can you give this a shot whenever you get a chance? Thanks!

lee1043 (Collaborator) commented May 18, 2022

I am testing this PR in this notebook. While I can give an overview at our regular meeting, or we can have a separate chat, I am just sharing in advance for your interest. Basically, what I am doing in the notebook is trying every option, even if it does not make sense, so I can learn how the function works and responds.

While I think I understand average(), which returns the temporal average with the time coordinate collapsed to one time step, and I know the intention of having freq there is for weighting adjustment, I think there is a chance that a user can confuse it with the group average by seeing the freq option. I wonder if the function could automatically recognize the temporal interval of the given dataset and its data array, so the user only decides weighted=True/False without dealing with the freq parameter, and the function automatically applies the proper freq if weighted=True.

In the notebook, I also tried some silly things, such as getting the temporal average of a monthly time series with freq="day" or freq="hour" (which does not make sense, but I was just curious what would return) and compared their differences in Section 1.6. Theoretically they might have to return the same value as the original monthly data because there is no higher-frequency input, but it is interesting to see how different they are and that there are some patterns in the differences.

Regarding the group average, in the notebook I am using a monthly time series as input (shape: (1872, Y, X)), so when I was doing ds_gra_month = ds.temporal.group_average("ts", freq="month") in Section 2.1, my logical sense was expecting a shape of (12, Y, X) (group the same months together and get the average of each group, like getting an annual cycle). However, what returned was shape (1872, Y, X), the same as the original.

For year and season, it seems the right numbers were returned for the time dimension.

So far I have mostly checked the numbers in the time dimensions and have yet to cross-validate the returned values with CDAT. I will continue to work on that.

tomvothecoder (Collaborator, Author) commented May 19, 2022

Thanks for reviewing @lee1043! You have a great validation process going on.

> While I think I understand average(), which returns the temporal average with the time coordinate collapsed to one time step, and I know the intention of having freq there is for weighting adjustment, I think there is a chance that a user can confuse it with the group average by seeing the freq option. I wonder if the function could automatically recognize the temporal interval of the given dataset and its data array, so the user only decides weighted=True/False without dealing with the freq parameter, and the function automatically applies the proper freq if weighted=True.

I agree, and Jill mentioned a similar solution to what you outlined.

I think it might confuse users if there is a freq argument for specifying the frequency used to calculate weights. Users might unintentionally pass a frequency that is not aligned with the actual frequency of the time intervals, resulting in incorrect weighting.

I will try to figure out how cdutil recognizes the time interval. If cdutil's implementation isn't clear and/or robust, we can figure out our own implementation.

> In the notebook, I also tried some silly things, such as getting the temporal average of a monthly time series with freq="day" or freq="hour" (which does not make sense, but I was just curious what would return) and compared their differences in Section 1.6. Theoretically they might have to return the same value as the original monthly data because there is no higher-frequency input, but it is interesting to see how different they are and that there are some patterns in the differences.

I noticed this too. This isn't necessarily an implementation bug, but if the user passes a frequency that isn't aligned with the actual frequency of the time intervals (e.g., freq="day" with monthly data), it can produce unintended odd behaviors.

For example, with the monthly time coordinates ['1850-01-16T12:00:00.000000000', '1850-02-15T00:00:00.000000000']:

  • If the user passes freq="day": weights are calculated using the groups day 16 and day 15
  • If the user passes freq="hour": weights are calculated using the groups hour 12 and hour 0

This is possible if the user wants to calculate weights like that, but it doesn't really make sense. Removing the freq argument will eliminate the possibility of passing incorrect/unintended frequencies.
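A small sketch of this odd grouping using plain xarray's groupby labels (hypothetical data):

import pandas as pd
import xarray as xr

# Two monthly coordinates that happen to fall on different days-of-month.
time = pd.to_datetime(["1850-01-16T12:00:00", "1850-02-15T00:00:00"])
da = xr.DataArray([1.0, 2.0], coords={"time": time}, dims="time")

# Grouping monthly data by "day" splits it into day-15 and day-16 groups,
# which is rarely what a user intends.
print(da.groupby("time.day").groups)  # {15: [1], 16: [0]}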

> Regarding the group average, in the notebook I am using a monthly time series as input (shape: (1872, Y, X)), so when I was doing ds_gra_month = ds.temporal.group_average("ts", freq="month") in Section 2.1, my logical sense was expecting a shape of (12, Y, X) (group the same months together and get the average of each group, like getting an annual cycle). However, what returned was shape (1872, Y, X), the same as the original.

The grouping behaviors for group averages and climatologies are based on CDAT.

In the case of the monthly time series input, grouping on month will return data of the original length because freq="month" groups on both year and month (see the TIME_GROUPS mapping below).

Here is a code snippet to validate this behavior against CDAT:

import cdms2
import cdutil
import numpy as np
import xarray as xr

import xcdat

filepath = "/home/vo13/XCDAT/xcdat/input/demo_data/CMIP5_demo_data/pr_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc"

ds = xcdat.open_dataset(filepath)
ds_cdat = cdms2.openDataset(filepath)

pr_avg_cdat = cdutil.ANNUALCYCLE(ds_cdat["pr"])
# pr_avg_cdat.shape  == (1872, 145, 192)

pr_avg_xcdat = ds.temporal.group_average("pr", freq="month")["pr"]
# pr_avg_xcdat.shape == (1872, 145, 192)

The grouping behavior that produces (12, Y, X) is a result of calculating the climatology, not the group average:

pr_climo_cdat = cdutil.ANNUALCYCLE.climatology(ds_cdat["pr"])
# pr_climo_cdat.shape == (12, 145, 192)

pr_climo_xcdat = ds.temporal.climatology("pr", freq="month")["pr"]
# pr_climo_xcdat.shape == (12, 145, 192)

The code lines below show how grouping is done in xCDAT and CDAT, which was also validated in this notebook
https://github.com/xCDAT/xcdat_test/blob/master/validation/temporal_average/floating_point_comparisons.ipynb.

xcdat/xcdat/temporal.py

Lines 33 to 59 in da2cf9e

#: A dictionary mapping temporal averaging mode and frequency to the time groups.
TIME_GROUPS: Dict[Mode, Dict[Frequency, Tuple[DateTimeComponent, ...]]] = {
    "average": {
        "year": ("year",),
        "season": ("season",),
        "month": ("month",),
        "day": ("day",),
        "hour": ("hour",),
    },
    "group_average": {
        "year": ("year",),
        "season": ("year", "season"),
        "month": ("year", "month"),
        "day": ("year", "month", "day"),
        "hour": ("year", "month", "day", "hour"),
    },
    "climatology": {
        "season": ("season",),
        "month": ("month",),
        "day": ("month", "day"),
    },
    "departures": {
        "season": ("season",),
        "month": ("month",),
        "day": ("month", "day"),
    },
}
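To make the distinction concrete, here is a sketch (hypothetical labels, not xCDAT internals) of why group_average with freq="month" preserves a 1872-step monthly series while climatology collapses it to 12 groups:

import pandas as pd

# 1872 monthly time steps (156 years).
time = pd.date_range("1850-01-01", periods=1872, freq="MS")

# group_average groups on (year, month): every label is unique -> 1872 groups.
group_average_labels = set(zip(time.year, time.month))
assert len(group_average_labels) == 1872

# climatology groups on month only -> 12 groups.
climatology_labels = set(time.month)
assert len(climatology_labels) == 12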

- Update `TestAverage` unit tests to test different time frequencies
Comment on lines +213 to +239
    def _infer_freq(self) -> Frequency:
        """Infers the time frequency from the coordinates.

        This method infers the time frequency from the coordinates by
        calculating the minimum delta and comparing it against a set of
        conditionals.

        The native ``xr.infer_freq()`` method does not work for all cases
        because the frequency can be irregular (e.g., different hour
        measurements), which ends up returning None.

        Returns
        -------
        Frequency
            The time frequency.
        """
        time_coords = self._dataset[self._dim_name]
        min_delta = pd.to_timedelta(np.diff(time_coords).min(), unit="ns")

        if min_delta < pd.Timedelta(days=1):
            return "hour"
        elif min_delta >= pd.Timedelta(days=1) and min_delta < pd.Timedelta(days=28):
            return "day"
        elif min_delta >= pd.Timedelta(days=28) and min_delta < pd.Timedelta(days=365):
            return "month"
        else:
            return "year"
tomvothecoder (Collaborator, Author) commented May 19, 2022

@lee1043 I added this new method, _infer_freq(), for inferring the freq of the coordinates. The frequency is used for calculating weights if weighted=True.

Let me know how the algorithm looks, thanks.

lee1043 (Collaborator) commented May 20, 2022

@tomvothecoder I think this is great and very straightforward, thank you for revising it. I am playing with the new function in the notebook now.

Comment on lines 942 to 969
    def _average(self, data_var: xr.DataArray) -> xr.DataArray:
        """
        Returns the weighted averages for a data variable with the time
        dimension removed.

        Parameters
        ----------
        data_var : xr.DataArray
            The data variable.

        Returns
        -------
        xr.DataArray
            The weighted averages for a data variable with the time dimension
            removed.
        """
        dv = data_var.copy()

        with xr.set_options(keep_attrs=True):
            if self._weighted:
                self._weights = self._get_weights()
                dv = dv.weighted(self._weights).mean(dim=self._dim_name)
            else:
                dv = dv.mean(dim=self._dim_name)

        dv = self._add_operation_attrs(dv)

        return dv
tomvothecoder (Collaborator, Author) commented

This is the main method that performs the work for ds.temporal.average().

lee1043 (Collaborator) commented May 20, 2022

@tomvothecoder thank you for the revision. I have tested the function in the notebooks added in this PR, and I think the function is working as expected!

@tomvothecoder changed the title from "Add averages weighted by time with time dimension removed" to "Add averages with time dimension removed" May 23, 2022
tomvothecoder (Collaborator, Author) commented

@lee1043 Great, thanks for adding those notebooks! I will do a final review and then merge.

@tomvothecoder merged commit 112eb58 into main May 23, 2022
v0.3.0 automation moved this from In progress to Done May 23, 2022
@tomvothecoder deleted the feature/201-temporal-mean branch May 23, 2022 17:53
chengzhuzhang (Collaborator) commented

@lee1043 thank you for reviewing and testing this PR; it's also great to get more validation notebooks for it! At some point, we can also do a similar test to Steve's, looping over all CMIP models and comparing results between CDAT and xCDAT.

tomvothecoder (Collaborator, Author) commented

@chengzhuzhang I agree, I'll ask Steve to post his scripts so we can reuse them for validation.

Labels
breaking-change (A breaking change, intended for a major release.), type: enhancement (New enhancement request)
Development

Successfully merging this pull request may close these issues.

[FEATURE]: Add averages with time dimension removed