Hey @kieranricardo and @Kirill888, I've just had a chance to test out the new xr_percentile function. It seems to run far faster than the built-in .quantile method in xarray, which is really exciting! I did have a few pieces of feedback though, which might make it easier to use and to fit into existing workflows.
Issue 1: The function currently returns an xr.Dataset with a data variable for each percentile the user requests. E.g. below, I've requested a 0.01 and a 0.999 percentile. These are used to label the new variables by appending the percentiles onto the original variable name.
This approach feels a bit clunky to me, as the user can't anticipate the naming convention used for the new band names, especially as the input 0.01 value is converted to an integer 1. It also causes problems when the user requests higher than 0.01 precision: the 0.999 above is clipped to 0.99 in the band name, so if a user requests both 0.995 and 0.999, they get combined into one in the output.
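To make the collision concrete, here is a minimal sketch. It assumes the suffix is built by truncating the quantile to a whole percent; `band_name` and the `_pc_` separator are hypothetical stand-ins for illustration, and the actual naming code inside xr_percentile may well differ.

```python
def band_name(var: str, q: float) -> str:
    # Hypothetical naming rule (an assumption, not the library's code):
    # scale the quantile to a percentage and truncate to an integer.
    return f"{var}_pc_{int(q * 100)}"


requested = [0.01, 0.995, 0.999]
names = [band_name("red", q) for q in requested]

# Both 0.995 and 0.999 truncate to 99, so they map to the same name
# and one would overwrite the other in the output Dataset.
unique_names = set(names)
print(names)
print(len(unique_names), "unique names for", len(requested), "requested quantiles")
```

Under that assumption, three requested quantiles yield only two distinct variable names, which matches the merging behaviour described above.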
The native xarray solution is to instead produce an xr.Dataset with a new "quantile" dimension, which is then labelled with the requested quantiles. This feels a bit more elegant to me:
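As a rough sketch of that behaviour (using small synthetic in-memory data rather than the data from my test, with a made-up `red` variable), the built-in `Dataset.quantile` keeps the original variable names and labels the new dimension with the exact values requested:

```python
import numpy as np
import xarray as xr

# Small synthetic dataset; the variable name and shape are illustrative only.
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"red": (("time", "y", "x"), rng.random((10, 4, 5)))},
    coords={"time": np.arange(10)},
)

# Reduce over time; the result gains a "quantile" dimension.
result = ds.quantile([0.01, 0.999], dim="time")

print(result["red"].dims)
print(result["quantile"].values)

# Individual quantiles can then be selected by their requested value.
high = result["red"].sel(quantile=0.999)
```

Because the coordinate stores the full floating-point value, requests like 0.995 and 0.999 stay distinct, and the variable names are the same going in and coming out.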
Issue 2: It would be nice if this function also supported non-dask input data, as for a lot of smaller-scale science team applications we don't always need to use dask throughout an entire workflow. If I try running it on an xr.Dataset in memory, I get this error: