cost_distance leaks dask task name as output .name on dask backends #3344

@brendancol

Describe the bug

cost_distance returns a DataArray whose .name depends on which backend ran. The numpy and cupy backends return name=None, but the dask+numpy and dask+cupy backends leak an internal dask graph name as the user-visible .name (e.g. _trim-052c0535... from map_overlap, or asarray-09cc... from the dask+cupy path). The leaked name is a random per-run token, so it isn't even stable between runs.

The public cost_distance() ends with:

return xr.DataArray(
    result_data,
    coords=raster.coords,
    dims=raster.dims,
    attrs=raster.attrs,
)

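No name= is passed. On the numpy and cupy paths result_data is a plain array with no .name attribute, so xarray sets .name to None. On the dask paths result_data is a dask array, whose .name attribute holds the graph's internal key, and the DataArray constructor falls back to the data object's .name when no name is given. That is how the task name ends up as the user-visible .name.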
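The leak can be reproduced without cost_distance at all. The following is a minimal sketch, assuming xarray and dask are installed; the array shape and chunking are arbitrary and only stand in for the real result_data:

```python
import dask.array as da
import numpy as np
import xarray as xr

np_data = np.zeros((4, 4))
dask_data = da.from_array(np_data, chunks=(2, 2))

# A numpy array has no .name attribute, so the DataArray stays unnamed.
from_numpy = xr.DataArray(np_data, dims=["y", "x"])

# A dask array exposes its graph key as .name; with no name= given,
# the DataArray ends up carrying that string.
from_dask = xr.DataArray(dask_data, dims=["y", "x"])

print(from_numpy.name)                    # None
print(from_dask.name == dask_data.name)   # True
```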

Expected behavior

.name should be the same across all four backends. The closest peer function, proximity, returns name=None on every backend. cost_distance should match the input raster's name (which is None for an unnamed input) regardless of backend.

Reproduce

import dask.array as da
import numpy as np
import xarray as xr
from xrspatial.cost_distance import cost_distance

# 6x6 source raster with a single target cell, plus a uniform friction surface
data = np.zeros((6, 6))
data[2, 2] = 1
fric = np.ones((6, 6))

coords = {'y': np.arange(6.0), 'x': np.arange(6.0)}
r = xr.DataArray(data, dims=['y', 'x'], coords=coords, attrs={'res': (1.0, 1.0)})
f = xr.DataArray(fric, dims=['y', 'x'], coords=coords)

# Swap in dask-backed data; both inputs stay unnamed (name=None)
r.data = da.from_array(r.data, chunks=(3, 3))
f.data = da.from_array(f.data, chunks=(3, 3))

out = cost_distance(r, f, max_cost=3.0)
print(out.name)   # '_trim-052c0535...' instead of None

Impact

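Pipelines that key on .name behave differently depending on the backend. For example, .to_dataset() uses .name as the variable key: on the numpy and cupy backends the unnamed result raises unless a name is supplied, while on the dask backends the call silently succeeds under a nondeterministic key. That breaks reproducibility and any name-based logic downstream.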
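A small illustration of the .to_dataset() divergence, assuming xarray and dask are installed. The two arrays here are stand-ins for cost_distance outputs, not actual results from it:

```python
import dask.array as da
import numpy as np
import xarray as xr

unnamed = xr.DataArray(np.zeros((2, 2)), dims=["y", "x"])
leaky = xr.DataArray(da.zeros((2, 2), chunks=(1, 1)), dims=["y", "x"])

# Unnamed result (numpy/cupy-style): conversion fails without an explicit name.
try:
    unnamed.to_dataset()
    numpy_outcome = "converted"
except ValueError:
    numpy_outcome = "ValueError"

# Dask-backed result: conversion succeeds, keyed by the leaked graph name.
dask_keys = list(leaky.to_dataset().data_vars)

print(numpy_outcome)               # ValueError
print(dask_keys == [leaky.name])   # True
```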

Fix

Force the result name to match the input by calling .rename(raster.name) on the returned DataArray. Passing name=raster.name to the constructor isn't enough: xarray ignores name=None and lets the dask token leak through, whereas .rename(None) does force the name to None.
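A sketch of the proposed fix, assuming xarray and dask are installed. wrap_result is a hypothetical helper name used only for illustration (the real change would go at the end of cost_distance()), and the dask array below is a stand-in for the actual result:

```python
import dask.array as da
import numpy as np
import xarray as xr


def wrap_result(result_data, raster):
    """Hypothetical helper: wrap result_data and force the input's name."""
    out = xr.DataArray(
        result_data,
        coords=raster.coords,
        dims=raster.dims,
        attrs=raster.attrs,
    )
    # rename() overrides whatever name the constructor inferred,
    # including when raster.name is None.
    return out.rename(raster.name)


raster = xr.DataArray(np.zeros((4, 4)), dims=["y", "x"])   # unnamed input
result_data = da.ones((4, 4), chunks=(2, 2))               # stand-in dask result

# name=None in the constructor does not suppress the dask name...
via_ctor = xr.DataArray(result_data, dims=raster.dims, name=raster.name)
# ...but rename(None) does.
via_rename = wrap_result(result_data, raster)

print(via_ctor.name == result_data.name)   # True
print(via_rename.name)                     # None
```

The same helper also carries a real name through: for an input named "elev", the wrapped result is named "elev" on every backend.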

Found by a metadata-propagation sweep (Cat 5: backend-inconsistent metadata).

Labels: bug
