Possible improvements to new xr_percentile function #209

robbibt · 2021-06-03T01:47:45Z

Hey @kieranricardo and @Kirill888, I've just had a chance to test out the new xr_percentile function. It seems to run far faster than the built-in .quantile functionality in xarray, which is really exciting! I did have a few pieces of feedback though, which might make it easier to use and to fit into existing workflows.

Issue 1: The function currently returns an xr.Dataset with a data variable for each percentile the user requests. E.g. below, I've requested a 0.01 and a 0.999 percentile. These are used to label the new variables by appending the percentiles onto the original variable name.

This approach feels a bit clunky to me, as the user can't anticipate the naming convention used for the new band names, especially as the input 0.01 value is converted to an integer 1. It also causes problems when the user requests finer than 0.01 precision: the 0.999 above is truncated to 0.99 in the band name, so if a user requests both 0.995 and 0.999, they get combined into one variable in the output.
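To illustrate the collision, here is a minimal sketch in plain Python. The `band_name` helper and its `_pc_` suffix are hypothetical; the real naming rule inside xr_percentile may well differ. The sketch only assumes that the suffix is built by truncating the percentile to a whole number, which matches the behaviour described above.

```python
def band_name(var: str, q: float) -> str:
    """Hypothetical naming rule: append the quantile as a truncated integer percent."""
    return f"{var}_pc_{int(q * 100)}"


requested = [0.01, 0.995, 0.999]

# Build a dict keyed by band name, the way a Dataset is keyed by variable name.
bands = {}
for q in requested:
    # Two quantiles that map to the same name overwrite each other here.
    bands[band_name("red", q)] = q

print(bands)
print(f"{len(requested)} quantiles requested, {len(bands)} bands produced")
```

Under that assumption, any two quantiles that differ only beyond the second decimal place end up sharing a single key.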

The native xarray solution is to instead produce an xr.Dataset with a new "quantile" dimension, which is then labelled with the requested quantiles. This feels a bit more elegant to me:
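As a rough sketch of what the native behaviour looks like, the snippet below runs xarray's built-in `Dataset.quantile` on a small, made-up in-memory dataset (the `red` variable and its shape are just illustrative). Each requested quantile becomes a label on a new "quantile" dimension, so 0.995 and 0.999 stay distinct and can be selected by value.

```python
import numpy as np
import xarray as xr

# Small synthetic dataset: 10 time steps over a 2 x 2 grid (illustrative only).
rng = np.random.default_rng(0)
ds = xr.Dataset({"red": (("time", "y", "x"), rng.random((10, 2, 2)))})

quantiles = [0.01, 0.995, 0.999]

# Reduce over time; the result gains a "quantile" dimension labelled with
# the requested values instead of one renamed variable per quantile.
result = ds.quantile(quantiles, dim="time")

print(result["red"].dims)
print(result["quantile"].values)

# Select a single quantile by its label.
p999 = result.sel(quantile=0.999)
```

The variable keeps its original name (`red`), so downstream code doesn't need to guess at any naming convention.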

Issue 2: It would be nice if this function also supported non-dask input data too, as for a lot of smaller-scale science team applications we don't always need to use dask throughout an entire workflow. If I try running it on a xr.Dataset in memory, I get this error:
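One possible way to handle both cases, sketched under some assumptions: check whether the input is dask-backed and fall back to an in-memory path if it isn't. The `is_dask_backed` and `percentile_any` helpers and the `dask_impl` argument are hypothetical names for illustration, not part of the existing function. The in-memory branch here simply uses xarray's own `.quantile`, which may not be the fastest option.

```python
import numpy as np
import xarray as xr


def is_dask_backed(ds: xr.Dataset) -> bool:
    """Return True if any data variable is stored as a chunked (dask) array."""
    # DataArray.chunks is None when the underlying data is not a dask array.
    return any(var.chunks is not None for var in ds.data_vars.values())


def percentile_any(ds, quantiles, dim, dask_impl=None):
    """Hypothetical wrapper that accepts both dask and in-memory input."""
    if is_dask_backed(ds):
        if dask_impl is None:
            raise ValueError("dask-backed input needs a dask implementation")
        # Assumption: dask_impl has the signature (ds, quantiles, dim).
        return dask_impl(ds, quantiles, dim)
    # In-memory path: xarray's built-in quantile, computed with NumPy.
    return ds.quantile(quantiles, dim=dim)


# In-memory example: 4 time steps, 3 pixels.
ds = xr.Dataset({"red": (("time", "x"), np.arange(12.0).reshape(4, 3))})
out = percentile_any(ds, [0.25, 0.75], dim="time")
print(out["red"].sizes)
```

With this kind of dispatch, a dataset already loaded into memory wouldn't need to be chunked first just to compute percentiles.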


Kirill888 · 2021-06-03T01:49:50Z

Agree on both counts.

RichardScottOZ · 2021-06-23T22:54:04Z

This sounds great given quantile doesn't work on dask I think?

RichardScottOZ · 2021-06-24T00:22:28Z

and a few hundred times faster than this would be great:-
3 variables, 44 bands, 100 million pixels each approx

kieranricardo mentioned this issue Jul 9, 2021

Xarray percentile - support non-dask input and percentiles in a new #248

Merged
