map_over_datasets throws error on nodes without datasets

`map_over_datasets` -- a way to compute over datatrees -- currently seems to try an operate even on nodes which contain no datasets, and consequently raises an error. 
*This seems to be a new issue, and was not a problem when this function was called `map_over_subtree`, which was part of the experimental datatree versions.*

An example to reproduce this problem is below: 
```
## Generate datatree, using example from documentation
def time_stamps(n_samples, T):
    """Create an array of evenly-spaced time stamps"""
    return xr.DataArray(
        data=np.linspace(0, 2 * np.pi * T, n_samples), dims=["time"]
    )


def signal_generator(t, f, A, phase):
    """Generate an example electrical-like waveform"""
    return A * np.sin(f * t.data + phase)


time_stamps1 = time_stamps(n_samples=15, T=1.5)

time_stamps2 = time_stamps(n_samples=10, T=1.0)

voltages = xr.DataTree.from_dict(
    {
        "/oscilloscope1": xr.Dataset(
            {
                "potential": (
                    "time",
                    signal_generator(time_stamps1, f=2, A=1.2, phase=0.5),
                ),
                "current": (
                    "time",
                    signal_generator(time_stamps1, f=2, A=1.2, phase=1),
                ),
            },
            coords={"time": time_stamps1},
        ),
        "/oscilloscope2": xr.Dataset(
            {
                "potential": (
                    "time",
                    signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.2),
                ),
                "current": (
                    "time",
                    signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.7),
                ),
            },
            coords={"time": time_stamps2},
        ),
    }
)

## Write some function to add resistance
def add_resistance_only_do(dtree): 
    def calculate_resistance(ds):
        ds_new = ds.copy()
        
        ds_new['resistance'] = ds_new['potential']/ds_new['current']
        return ds_new 
        
    dtree = dtree.map_over_datasets(calculate_resistance)
    
    return dtree
    
def add_resistance_try(dtree): 
    def calculate_resistance(ds):
        ds_new = ds.copy()
        try:
            ds_new['resistance'] = ds_new['potential']/ds_new['current']
            return ds_new 
        except:
            return ds_new

    dtree = dtree.map_over_datasets(calculate_resistance)
    
    return dtree
```

Calling `voltages = add_resistance_only_do(voltages)` raises the error:
```
KeyError: "No variable named 'potential'. Variables on the dataset include []"
Raised whilst mapping function over node with path '.'
```
This can be easily resolved by putting try statements in (e.g. `voltages = add_resistance_try(voltages)`), but we know that Yoda would not recommend try (right @TomNicholas). 

Can this be built in as a default feature of `map_over_datasets`? as many examples of datatree will have nodes without datasets. 



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

map_over_datasets throws error on nodes without datasets #9693

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

map_over_datasets throws error on nodes without datasets #9693

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions