Update schism.data to correctly handle DataBlob #159

Open
wants to merge 3 commits into
base: main
2 changes: 1 addition & 1 deletion rompy/core/data.py
@@ -105,7 +105,7 @@ def get(self, destdir: Union[str, Path], name: str = None, *args, **kwargs) -> P
 SOURCE_TYPES_TS = load_entry_points("rompy.source", etype="timeseries")


-class DataPoint(DataBlob):
+class DataPoint(RompyBaseModel):
     """Data object for timeseries source data.

     Generic data object for xarray datasets that only have time as a dimension and do
27 changes: 21 additions & 6 deletions rompy/schism/data.py
@@ -261,18 +261,29 @@ def get(

         """
         ret = {}
-        destdir = Path(destdir) / "sflux"
+        destdir = Path(destdir)
         destdir.mkdir(parents=True, exist_ok=True)
         namelistargs = {}
+        anydatablobs = False
         for variable in ["air_1", "air_2", "rad_1", "rad_2", "prc_1", "prc_2"]:
             data = getattr(self, variable)
             if data is None:
                 continue
             data.id = variable
             logger.info(f"Fetching {variable}")
-            namelistargs.update(data.namelist)
-            ret[variable] = data.get(destdir, grid, time)
-        ret["nml"] = Sflux_Inputs(**namelistargs).write_nml(destdir)
+            if isinstance(data, DataBlob):
+                anydatablobs = True
+                ret[variable] = data.get(destdir, name='sflux')
+                existing_nml = ret[variable] / 'sflux_inputs.txt'
Collaborator

On line 276, data.get(destdir, name='sflux') on a DataBlob returns either a file or a directory, depending on data.source. On line 277, however, ret[variable] is expected to be a directory, because existing_nml is defined as the sflux_inputs.txt inside it. That may not be a problem if the DataBlob is always a directory in this case.

Member Author

Yeah, there has to be an sflux_inputs.txt, which describes to some extent what the sflux netcdf files contain. If the files are already generated, it is assumed this file exists alongside them.
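
One way to make that assumption explicit is to check the path a DataBlob returns before building existing_nml from it. The following is only a sketch: locate_existing_nml is a hypothetical helper that does not exist in rompy, and it assumes the fetched path should be a directory holding sflux_inputs.txt.

```python
from pathlib import Path


def locate_existing_nml(blob_path, nml_name="sflux_inputs.txt"):
    """Return the namelist inside a fetched DataBlob directory.

    Hypothetical helper, not part of rompy. It fails early when the blob
    resolved to a single file or when the namelist is missing.
    """
    blob_path = Path(blob_path)
    if not blob_path.is_dir():
        raise ValueError(
            f"Expected {blob_path} to be a directory of pre-generated sflux files"
        )
    nml = blob_path / nml_name
    if not nml.is_file():
        raise FileNotFoundError(
            f"{nml} not found; it must accompany the sflux netcdf files"
        )
    return nml
```

With a helper like this, line 277 could become existing_nml = locate_existing_nml(ret[variable]), so a file-based DataBlob produces a clear error instead of a bad path.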

+            else:
+                dd = destdir / "sflux"
+                dd.mkdir(parents=True, exist_ok=True)
+                ret[variable] = data.get(dd, grid, time)
+                namelistargs.update(data.namelist)
+        if anydatablobs:
+            ret["nml"] = existing_nml
Collaborator

Lines 276 and 277 also mean you can't mix and match DataBlobs with other data objects, for example some sfluxair DataBlobs together with sfluxrad data objects. That might be fine, but it would be nice to tell the user somehow that using DataBlobs in sflux is an all-or-nothing choice.
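
A rough sketch of how the user could be informed: a check that raises as soon as DataBlobs and other sources are combined. Everything here is illustrative. The DataBlob class is a stand-in so the snippet runs on its own, and check_all_or_nothing is a hypothetical name; in rompy the same logic could presumably live in the existing model validator.

```python
SFLUX_VARIABLES = ["air_1", "air_2", "rad_1", "rad_2", "prc_1", "prc_2"]


class DataBlob:
    """Stand-in for rompy's DataBlob, only here to keep the sketch self-contained."""


def check_all_or_nothing(sources):
    """Raise if DataBlobs are mixed with other sflux sources.

    `sources` maps variable names (air_1, rad_1, ...) to a source object or None.
    """
    active = {k: sources[k] for k in SFLUX_VARIABLES if sources.get(k) is not None}
    blobs = [k for k, v in active.items() if isinstance(v, DataBlob)]
    others = [k for k in active if k not in blobs]
    if blobs and others:
        raise ValueError(
            "sflux inputs must be either all DataBlobs or none: "
            f"DataBlobs={blobs}, other sources={others}"
        )
```

Raising is the strict option; a logged warning would be a softer alternative if mixing should stay possible.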

Collaborator

Alternatively I wonder whether ret["nml"] = Sflux_Inputs(**namelistargs).write_nml(destdir) would work even if some of the inputs are datablobs?

Member Author

Yeah, this is a bit poor, but without interrogating the sflux_inputs.txt file you don't know how to populate https://github.com/rom-py/rompy/blob/main/rompy/schism/namelists/sflux.py#L8
i.e. **namelistargs is empty for DataBlobs. We could write a parser, but parsing model-specific input files back into memory isn't something done elsewhere.

+        else:
+            ret["nml"] = Sflux_Inputs(**namelistargs).write_nml(destdir)
         return ret

     @model_validator(mode="after")
@@ -287,11 +298,15 @@ def check_weights(v):
             ValueError: If the relative weights for any variable do not add up to 1.0.

         """
+
         for variable in ["air", "rad", "prc"]:
             weight = 0
             active = False
             for i in [1, 2]:
                 data = getattr(v, f"{variable}_{i}")
+                # Check if DataBlob is used
+                if isinstance(data, DataBlob):
+                    continue
                 if data is None:
                     continue
                 if data.fail_if_missing:
@@ -759,8 +774,8 @@ def get(
         for datatype in ["atmos", "ocean", "wave", "tides"]:
             data = getattr(self, datatype)
             if data is None:
-                continue
-            if type(data) is DataBlob:
+                output = None
+            elif type(data) is DataBlob:
                 output = data.get(destdir)
             else:
                 output = data.get(destdir, grid, time)