
Fancy isel on LazilyIndexedArray allocates memory of about the size of backing dataset #10311


Description

@jder
Contributor

What happened?

I ran into a large memory allocation when doing an isel on a zarr dataset with dask turned off (i.e. opened with open_dataset("mydataset.zarr", chunks=None)), which caused the Linux OOM killer to kill my process.

I used some memory profilers (memray + scalene) to dig in a little; it appears that a fancy isel on a LazilyIndexedArray ultimately causes indexing._combine_indexers to run np.broadcast_arrays(...) on a tuple of index arrays, which returns numpy arrays of the same shape as the underlying dataset. See the MCVE for an example isel which I think tries to allocate exabytes of RAM. (Not when loading the data; during the lazy isel itself.)

The VectorizedIndexer created at the end of _combine_indexers also ends up being made of ndarrays the size of the data-to-be-selected, which could also be large.
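To see why the broadcast blows up, here is a small-scale numpy sketch (my own illustration, with dimension sizes 3, 4 and 5 standing in for 1_000_000): the arrayized version of BasicIndexer(slice(None), slice(None), slice(None)) is a set of arange arrays with mutually orthogonal shapes, so broadcasting them against each other yields index arrays whose common shape is the full backing-array shape, not the selection shape.

```python
import numpy as np

# Arrayized full slices for a (3, 4, 5) backing array: each slice(None)
# becomes an arange reshaped orthogonally to the others.
x = np.arange(3).reshape(3, 1, 1)
y = np.arange(4).reshape(1, 4, 1)
z = np.arange(5).reshape(1, 1, 5)

# Broadcasting them yields the full backing shape. With three
# 1_000_000-sized dims this would be 1e18 elements per index array
# (~8 exabytes each, if materialized).
bx, by, bz = np.broadcast_arrays(x, y, z)
print(bx.shape)  # (3, 4, 5)
```

Note that np.broadcast_arrays itself returns zero-stride views, so the allocation is deferred until the broadcast arrays are fancy-indexed (as _combine_indexers does next), which copies them into real arrays.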

What did you expect to happen?

I was hoping that I could index into a large dataset cheaply before loading it (e.g. by tracking ranges to be selected as slices). The original example and the one below both try to select a relatively small amount of the total data.

Minimal Complete Verifiable Example

import xarray as xr
from xarray.core import indexing
import numpy as np

def test_dataset_indexing_memory() -> None:
    # Mimics open_dataset with a large backing array and no dask.
    big_backing_array = indexing.LazilyIndexedArray(
        np.broadcast_to([1], (1_000_000, 1_000_000, 1_000_000))
    )
    data_array = xr.DataArray(
        big_backing_array,
        coords={
            "t": np.arange(1_000_000),
            "y": np.arange(1_000_000),
            "z": np.arange(1_000_000),
        },
    )
    print("running isel")
    # I think any fancy isel will do
    data_array.isel(t=xr.Variable(("new", "t"), [[0]]))

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

Anything else we need to know?

Happy to share memory profiles or other debugging info if that would be helpful. Thank you for taking a look!

Environment

INSTALLED VERSIONS

commit: c8affb3
python: 3.10.17 | packaged by conda-forge | (main, Apr 10 2025, 22:23:34) [Clang 18.1.8 ]
python-bits: 64
OS: Darwin
OS-release: 24.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.4
libnetcdf: 4.9.2

xarray: 2025.4.1.dev1+g070af11f
pandas: 2.2.3
numpy: 2.2.5
scipy: 1.15.2
netCDF4: 1.7.2
pydap: 3.5.5
h5netcdf: 1.6.1
h5py: 3.12.1
zarr: 2.18.3
cftime: 1.6.4
nc_time_axis: 1.4.1
iris: 3.11.1
bottleneck: 1.4.2
dask: 2024.8.2
distributed: 2024.8.2
matplotlib: 3.10.1
cartopy: 0.24.0
seaborn: 0.13.2
numbagg: 0.9.0
fsspec: 2025.3.2
cupy: None
pint: None
sparse: 0.16.0
flox: 0.10.1
numpy_groupies: 0.11.2
setuptools: 80.1.0
pip: 25.1
conda: 24.9.2
pytest: 8.3.5
mypy: 1.15.0
IPython: None
sphinx: None

Activity

The "needs triage" label (issue that has not been reviewed by an xarray team member) was added on May 12, 2025 and removed on May 27, 2025.
dcherian commented on May 28, 2025

@jder is this fixed now?

jder (Author) commented on May 28, 2025

@dcherian sorry, no, the other fix I submitted was related (and made this worse) but doesn't solve the underlying broadcasting issue here. I haven't had a chance yet to dig into how to fix this one.

dcherian commented on May 28, 2025

After some exploring, this will take some major reworking:

  1. data_array.isel(t=xr.Variable(("new", "t"), [[0]])) is translated to indexing with VectorizedIndexer([[0]], slice(None), slice(None)).
  2. This is then "arrayized" to VectorizedIndexer(array([[[[0]]]]), array([[[[0], [1], ..., [9999]]]], shape=(1, 1, 10000, 1)), array([[[[0, 1, 2, ..., 9998, 9999]]]], shape=(1, 1, 1, 10000))).
  3. This is "combined" with the arrayized version of BasicIndexer(slice(None), slice(None), slice(None)) in

         return VectorizedIndexer(
             tuple(o[new_key.tuple] for o in np.broadcast_arrays(*old_key.tuple))
         )

     at which point we realize large indexer arrays in memory.
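These steps can be replayed at a small scale (my own sketch, with a 4 x 5 backing shape instead of 1_000_000 cubed) to show where the arrays get realized: the broadcast old-key views span the whole backing shape, and fancy-indexing them with the new key copies out selection-shaped arrays.

```python
import numpy as np

# old_key: arrayized BasicIndexer(slice(None), slice(None)) for a 4 x 5 array
old_key = (np.arange(4).reshape(4, 1), np.arange(5).reshape(1, 5))
# new_key: arrayized VectorizedIndexer([[0]], slice(None))
new_key = (np.array([[[0]]]), np.arange(5).reshape(1, 1, 5))

# The combine step: broadcasting old_key spans the full 4 x 5 backing
# shape (as views), and the fancy indexing then copies out arrays of
# the selection shape (1, 1, 5).
combined = tuple(o[new_key] for o in np.broadcast_arrays(*old_key))
print([c.shape for c in combined])  # [(1, 1, 5), (1, 1, 5)]
```

With the real dimension sizes, the broadcast spans 1_000_000**3 elements, which is where the huge allocation is triggered.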

We could preserve any slice objects when creating a VectorizedIndexer instead of array-izing them (IIUC this is the fundamental problem). I think this would help in this case.

This will be pretty involved. For example, the logic for this loop would become more complicated to handle the slicing case, because old_key.tuple will contain slices:

    return VectorizedIndexer(
        tuple(o[new_key.tuple] for o in np.broadcast_arrays(*old_key.tuple))
    )
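A rough sketch of the slice-preserving idea (compose_component is a hypothetical helper, not xarray's API, and stop handling is omitted for brevity): composing a new index onto a slice is just arithmetic, so no full-length arange ever needs to be allocated.

```python
import numpy as np

def compose_component(old, new):
    # Hypothetical helper, not part of xarray: compose one indexer
    # component lazily instead of arrayizing slices up front.
    if isinstance(old, slice):
        start = old.start or 0
        step = old.step or 1
        if isinstance(new, slice):
            # A slice of a slice is still a slice (stop handling elided
            # here; real code would have to clip against old.stop).
            new_start = start + (new.start or 0) * step
            return slice(new_start, None, step * (new.step or 1))
        # An integer array applied to a slice is plain arithmetic:
        # no dimension-length index array is ever allocated.
        return start + step * np.asarray(new)
    # Both array-like: ordinary fancy indexing of the (already small) old.
    return np.asarray(old)[new]
```

For instance, compose_component(slice(None), np.array([[0]])) returns the [[0]] array unchanged, and compose_component(slice(2, None, 3), np.array([0, 1])) returns array([2, 5]), in both cases without touching the backing dimension sizes.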

