Reason or Problem
contours() transforms segment coordinates from pixel-space to DataArray coordinate-space at lines 792-795. This creates a second copy of every coordinate pair after the merge/dedup step already produced final arrays.
Proposal
contours() at xrspatial/contour.py:792-795:
out = np.empty_like(coords)
out[:, 0] = np.interp(coords[:, 0], y_idx, y_coords)
out[:, 1] = np.interp(coords[:, 1], x_idx, x_coords)
np.empty_like allocates a fresh array identical in shape/dtype to coords, then np.interp fills it. For large rasters with many contour segments this doubles peak memory during the transform phase. Several options:
- Write the interpolated values in place into coords. np.interp has no out= parameter, but each result can be assigned back into the matching column of coords, so the existing array is reused.
- Build the result directly from the np.interp outputs with np.array(), without the intermediate np.empty_like. np.interp already returns a new array, so np.empty_like is a redundant allocation whose result is immediately overwritten.
Value: Reduces peak memory for large contour outputs by ~half during the coordinate-transform phase, keeping more headroom for the merge step.
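A minimal sketch of the first option, assuming coords is a writeable float array of (row, col) pairs. The function name transform_in_place is hypothetical and not part of xrspatial; the argument names follow the snippet quoted above.

```python
import numpy as np


def transform_in_place(coords, y_idx, y_coords, x_idx, x_coords):
    """Overwrite pixel-space (row, col) pairs with coordinate-space values.

    Hypothetical helper for illustration; assumes coords has a float dtype
    (an integer dtype would silently truncate the interpolated values).
    """
    # np.interp returns a new 1-D array; assigning it to the column slice
    # copies the values into coords, so no second (n, 2) buffer is created.
    coords[:, 0] = np.interp(coords[:, 0], y_idx, y_coords)
    coords[:, 1] = np.interp(coords[:, 1], x_idx, x_coords)
    return coords


# Tiny example: 3 rows and 4 columns of pixels mapped to made-up coordinates.
y_idx = np.arange(3.0)
y_coords = np.array([10.0, 20.0, 30.0])
x_idx = np.arange(4.0)
x_coords = np.array([100.0, 101.0, 102.0, 103.0])

segments = np.array([[0.5, 1.5], [2.0, 3.0]])
result = transform_in_place(segments, y_idx, y_coords, x_idx, x_coords)
```

The trade-off is that the caller's array is mutated, so this only fits if nothing downstream still needs the pixel-space values.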
Stakeholders and Impacts
Performance-sensitive users processing large DEMs with return_type="geopandas" would see lower peak memory. No API change; drop-in.
Drawbacks
Tight coupling to numpy internals (relies on np.interp always returning a new array). Trivial code clarity cost.
Alternatives
- Accept the copy as-is (current state).
- Benchmark whether the copy is materialized on a 30TB raster before changing.
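One way the benchmark could be set up on a small synthetic input is sketched below. It uses tracemalloc, which NumPy reports its array allocations to. The helpers peak_bytes, with_copy and in_place are hypothetical names, and the array sizes are arbitrary rather than taken from a real raster.

```python
import tracemalloc

import numpy as np


def peak_bytes(fn, *args):
    """Return the peak traced allocation in bytes while fn(*args) runs."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def with_copy(coords, y_idx, y_coords, x_idx, x_coords):
    # Mirrors the snippet quoted in the proposal.
    out = np.empty_like(coords)
    out[:, 0] = np.interp(coords[:, 0], y_idx, y_coords)
    out[:, 1] = np.interp(coords[:, 1], x_idx, x_coords)
    return out


def in_place(coords, y_idx, y_coords, x_idx, x_coords):
    # Reuses coords instead of allocating a second (n, 2) array.
    coords[:, 0] = np.interp(coords[:, 0], y_idx, y_coords)
    coords[:, 1] = np.interp(coords[:, 1], x_idx, x_coords)
    return coords


# Synthetic input: 200,000 coordinate pairs on a 100 x 100 pixel grid.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 99.0, size=(200_000, 2))
idx = np.arange(100.0)
y_coords = np.linspace(50.0, 40.0, 100)
x_coords = np.linspace(-120.0, -110.0, 100)

# The inputs are allocated before tracing starts, so only allocations made
# inside each transform are counted.
peak_copy = peak_bytes(with_copy, coords.copy(), idx, y_coords, idx, x_coords)
peak_inplace = peak_bytes(in_place, coords.copy(), idx, y_coords, idx, x_coords)
print(f"copy: {peak_copy} bytes, in-place: {peak_inplace} bytes")
```

The measured peaks include temporaries that np.interp creates internally, so the ratio between the two paths should be read from the measurement rather than assumed.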
Unresolved Questions
Should the in-place path test for np.may_share_memory before mutating, or always allocate fresh on the dask path where the segment buffer might be reused downstream?
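To make the question concrete, here is a sketch of what a guarded in-place path might look like. The function transform_guarded and its protected parameter are hypothetical; the sketch assumes the caller can name the buffers that must stay untouched, which may not hold on the dask path.

```python
import numpy as np


def transform_guarded(coords, y_idx, y_coords, x_idx, x_coords, protected=()):
    """Transform in place unless coords may alias a protected buffer.

    Hypothetical helper: `protected` is an iterable of arrays that must not
    be modified. np.may_share_memory is a conservative bounds check, so it
    can report True for arrays that do not actually overlap.
    """
    aliased = any(np.may_share_memory(coords, buf) for buf in protected)
    if aliased or not coords.flags.writeable:
        # Fall back to a private copy before mutating.
        coords = coords.copy()
    coords[:, 0] = np.interp(coords[:, 0], y_idx, y_coords)
    coords[:, 1] = np.interp(coords[:, 1], x_idx, x_coords)
    return coords


idx = np.arange(3.0)
y_coords = np.array([10.0, 20.0, 30.0])
x_coords = np.array([100.0, 200.0, 300.0])

# A view into a larger buffer that something downstream still reads.
buffer = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
view = buffer[:2]
safe = transform_guarded(view, idx, y_coords, idx, x_coords, protected=(buffer,))

# An array nobody else references is transformed in place.
own = np.array([[0.5, 0.5]])
fast = transform_guarded(own, idx, y_coords, idx, x_coords)
```

Always allocating fresh on the dask path would be the simpler rule; the guard only pays off if the aliasing check is cheap and the set of protected buffers is known.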