When incorporating active storage into Dask, the Dask graph needs to be modified non-lazily to account for the fact that some of the work is being done externally to Dask, i.e. on the server where the data is [*].
If cf-python thinks that active storage operations are not possible, then it simply doesn't modify the graph. There are many reasons why active storage might not be deemed OK, such as the data having already been operated on (f += 2) or the chunks not pointing to files on disk, but the relevant one here is that the file resides at a location that doesn't support active reductions.
So, it would be great to have a method of Active that can tell us if a given file can be reduced actively, something like (notional API):
>>> a = Active('/path/to/file.nc')
>>> a.isactive()
True  # or False
I imagine that this could entail some try ... except ... approach whereby we assume the file is active and send off some modified URI that returns True iff active reduction is possible.
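To make the try ... except ... idea concrete, here is a minimal sketch. It is not the real `Active` class: the `probe` argument is a hypothetical callable standing in for whatever request would actually be sent to the storage, and both the constructor signature and the rule of treating any failure as "not active" are assumptions for illustration.

```python
class Active:
    """Hypothetical stand-in for the real Active class (illustration only)."""

    def __init__(self, uri, probe):
        # `probe` is an assumed callable: it takes the URI, asks the storage
        # whether active reductions are possible, and returns a truthy value
        # if so. It may raise if the location cannot be queried at all.
        self.uri = uri
        self._probe = probe

    def isactive(self):
        """Return True if the file can be reduced actively, else False."""
        try:
            return bool(self._probe(self.uri))
        except Exception:
            # Any failure is treated as "not active", so the caller can
            # simply leave the Dask graph unmodified.
            return False


def unsupported(uri):
    # Simulates a location with no active storage support.
    raise ConnectionError(f"no active storage endpoint for {uri}")


a = Active('/path/to/file.nc', probe=lambda uri: True)
b = Active('/path/to/file.nc', probe=unsupported)
```

With something like this, cf-python could call `isactive()` once per file before deciding whether to rewrite the graph, and fall back to the ordinary reduction whenever it returns False.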
[*] (The detail of this is that the chunk reduction function used by dask.array.reduction needs to be changed from the usual function that expects to do some work (e.g. np.max) to the identity function that does no work (e.g. lambda x: x). This has to be done prior to the compute().)
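The footnote can be illustrated without Dask itself. The following plain-Python sketch only mimics the two-stage pattern (a per-chunk function followed by an aggregation); `two_stage_reduce`, `identity` and the sample data are hypothetical names invented here, and the "server" is simulated by reducing each chunk in advance.

```python
def two_stage_reduce(chunks, chunk_func, aggregate):
    """Apply chunk_func to each chunk, then aggregate the partial results."""
    partials = [chunk_func(chunk) for chunk in chunks]
    return aggregate(partials)


def identity(x):
    """Chunk function that does no work."""
    return x


# Ordinary case: each chunk arrives as raw data, so the chunk function
# has to do the reduction itself.
raw_chunks = [[3, 9, 1], [7, 2], [8, 4, 6]]
normal_result = two_stage_reduce(raw_chunks, chunk_func=max, aggregate=max)

# Active storage case (simulated): each chunk has already been reduced
# where the data lives, so only the reduced values arrive and the chunk
# function must be swapped for the identity before computing.
reduced_on_server = [max(chunk) for chunk in raw_chunks]
active_result = two_stage_reduce(
    reduced_on_server, chunk_func=identity, aggregate=max
)
```

Both paths give the same answer, but in the second one the chunk function would fail or double-count if it were left as the usual reduction, which is why the swap has to happen before the compute.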
davidhassell
changed the title
Method query if active storage reductions are available.
Method to query if active storage reductions are available.
Mar 2, 2023