Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: CDAT migration: AttributeError: 'CoreParameter' object has no attribute 'test_start_yr' #877

Closed
chengzhuzhang opened this issue Oct 25, 2024 · 6 comments
Labels
bug Bug fix (will increment patch version)

Comments

@chengzhuzhang
Copy link
Contributor

What happened?

When testing a model vs obs run when I tried to specify the start and end year for reference data, I ran into AttributeError for all parameter objects, example: AttributeError: 'CoreParameter' object has no attribute 'test_start_yr'

The same script works okay for main branch.

What did you expect to happen? Are there are possible answers you came across?

No response

Minimal Complete Verifiable Example (MVCE)

import os
from e3sm_diags.parameter.core_parameter import CoreParameter
from e3sm_diags.run import runner

param = CoreParameter()

param.reference_data_path = '/global/cfs/cdirs/e3sm/diagnostics/observations/Atm/time-series'
param.test_data_path = '/global/cfs/cdirs/e3sm/chengzhu/eamxx/post/data/rgr'
param.test_name = 'eamxx_decadal'
param.seasons = ["ANN"]
#param.save_netcdf = True

param.ref_timeseries_input = True
# Years to slice the ref data, base this off the years in the filenames.
param.ref_start_yr = "1996"
param.ref_end_yr = "1996"

prefix = '/global/cfs/cdirs/e3sm/www/zhang40/tests/eamxx'
param.results_dir = os.path.join(prefix, 'eamxx_decadal_1996_1024_edv2')

runner.sets_to_run = ["lat_lon",
        "zonal_mean_xy",
        "zonal_mean_2d",
        "zonal_mean_2d_stratosphere",
        "polar",
        "cosp_histogram",
        "meridional_mean_2d",
        "annual_cycle_zonal_mean",]

runner.run_diags([param])

Relevant log output

(/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy) chengzhu@perlmutter:login35:~/eamxx_diags/run_script> python run_e3sm_diags_1996.py 
2024-10-25 14:41:09,034 [INFO]: e3sm_diags_driver.py(_save_env_yml:58) >> Saved environment yml file to: /global/cfs/cdirs/e3sm/www/zhang40/tests/eamxx/eamxx_decadal_1996_1024_edv3/prov/environment.yml
2024-10-25 14:41:13,375 [INFO]: e3sm_diags_driver.py(_save_parameter_files:69) >> Saved command used to: /global/cfs/cdirs/e3sm/www/zhang40/tests/eamxx/eamxx_decadal_1996_1024_edv3/prov/cmd_used.txt
2024-10-25 14:41:13,384 [INFO]: e3sm_diags_driver.py(_save_python_script:133) >> Saved Python script to: /global/cfs/cdirs/e3sm/www/zhang40/tests/eamxx/eamxx_decadal_1996_1024_edv3/prov/run_e3sm_diags_1996.py
2024-10-25 14:41:15,655 [ERROR]: core_parameter.py(_run_diag:343) >> Error in e3sm_diags.driver.lat_lon_driver
Traceback (most recent call last):
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/parameter/core_parameter.py", line 340, in _run_diag
    single_result = module.run_diag(self)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/lat_lon_driver.py", line 65, in run_diag
    test_ds = Dataset(parameter, data_type="test")
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/utils/dataset_xr.py", line 113, in __init__
    self.start_yr = self.parameter.test_start_yr  # type: ignore
AttributeError: 'CoreParameter' object has no attribute 'test_start_yr'
2024-10-25 14:41:15,696 [ERROR]: core_parameter.py(_run_diag:343) >> Error in e3sm_diags.driver.lat_lon_driver
Traceback (most recent call last):
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/parameter/core_parameter.py", line 340, in _run_diag
    single_result = module.run_diag(self)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/lat_lon_driver.py", line 65, in run_diag
    test_ds = Dataset(parameter, data_type="test")
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/utils/dataset_xr.py", line 113, in __init__
    self.start_yr = self.parameter.test_start_yr  # type: ignore
AttributeError: 'CoreParameter' object has no attribute 'test_start_yr'

Anything else we need to know?

No response

Environment

The latest cdat-migration-fy24 branch.

@chengzhuzhang chengzhuzhang added the bug Bug fix (will increment patch version) label Oct 25, 2024
@tomvothecoder
Copy link
Collaborator

Thanks for reporting this. I will take a look today.

@tomvothecoder
Copy link
Collaborator

I think the issue is that on cdat-migration-fy24, the self.start_yr and self.end_yr are attributes that always expect valid values:

# If the underlying data is a time series, set the `start_yr` and
# `end_yr` attrs based on the data type (ref or test). Note, these attrs
# are different for the `area_mean_time_series` parameter.
if self.is_time_series:
# FIXME: This conditional should not assume the first set is
# area_mean_time_series. If area_mean_time_series is at another
# index, this conditional is not False.
if self.parameter.sets[0] in ["area_mean_time_series"]:
self.start_yr = self.parameter.start_yr # type: ignore
self.end_yr = self.parameter.end_yr # type: ignore
elif self.data_type == "ref":
self.start_yr = self.parameter.ref_start_yr # type: ignore
self.end_yr = self.parameter.ref_end_yr # type: ignore
elif self.data_type == "test":
self.start_yr = self.parameter.test_start_yr # type: ignore
self.end_yr = self.parameter.test_end_yr # type: ignore

While on main, they are not class attributes in dataset.py but rather only returned in the function below. It also uses getattr() which returns None as a default value.

def get_start_and_end_years(self):
"""
Get the user-defined start and end years.
"""
sub_monthly = False
if self.parameters.sets[0] in ["area_mean_time_series"]:
start_yr = getattr(self.parameters, "start_yr")
end_yr = getattr(self.parameters, "end_yr")
else:
if self.ref:
start_yr = getattr(self.parameters, "ref_start_yr")
end_yr = getattr(self.parameters, "ref_end_yr")
else:
start_yr = getattr(self.parameters, "test_start_yr")
end_yr = getattr(self.parameters, "test_end_yr")
if self.parameters.sets[0] in ["diurnal_cycle", "arm_diags"]:
sub_monthly = True
return start_yr, end_yr, sub_monthly

Possible Workaround

Use getattr() for CoreParameter attributes in the code block below:

# If the underlying data is a time series, set the `start_yr` and
# `end_yr` attrs based on the data type (ref or test). Note, these attrs
# are different for the `area_mean_time_series` parameter.
if self.is_time_series:
# FIXME: This conditional should not assume the first set is
# area_mean_time_series. If area_mean_time_series is at another
# index, this conditional is not False.
if self.parameter.sets[0] in ["area_mean_time_series"]:
self.start_yr = self.parameter.start_yr # type: ignore
self.end_yr = self.parameter.end_yr # type: ignore
elif self.data_type == "ref":
self.start_yr = self.parameter.ref_start_yr # type: ignore
self.end_yr = self.parameter.ref_end_yr # type: ignore
elif self.data_type == "test":
self.start_yr = self.parameter.test_start_yr # type: ignore
self.end_yr = self.parameter.test_end_yr # type: ignore

@tomvothecoder
Copy link
Collaborator

It also uses getattr() which returns None as a default value.

Nevermind, getattr() does not automatically return None as a default value (has to be explicitly set).

I think on main, self.parameter.test_start_yr and self.parameter.test_end_yr are being set earlier before self.start_yr and self.end_yr are set so it works. More debugging needed.

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Oct 28, 2024

Findings:

  • On main:
    • The test_data object is considered is_climo because it is self.test and test_timeseries_input=False (logic below). Later on, the ref climo variable can't be found so it is set to the test climo variable (model-only run).
    • def is_timeseries(self):
      """
      Return True if this dataset is for timeseries data.
      """
      if self.ref:
      return getattr(self.parameters, "ref_timeseries_input", False)
      else:
      return getattr(self.parameters, "test_timeseries_input", False)
  • On cdat-migration-fy24:
    • The test_ds object is considered is_time_series because it is ref_timeseries_input=True in the run script:
    • @property
      def is_time_series(self):
      if self.parameter.ref_timeseries_input or self.parameter.test_timeseries_input:
      return True
      else:
      return False

Question:

  1. Should test_data/test_ds be considered is_time_series? And if yes, should test_timeseries_input=True in the run script?
  2. Or should we update the code on cdat-migration-fy24 to align with main so that the test_ds is considered is_climo?

@chengzhuzhang
Copy link
Contributor Author

chengzhuzhang commented Oct 28, 2024

@tomvothecoder thank you for troubleshooting. I think the logic on main is correct.
The test/ref_timeseries_input are default to False.

if test/ref_timeseries_input is set to True, then the corresponding test/ref is considered is_time_series.

if a dataset is considered to is_time_series or not should be independent of the other dataset.

@tomvothecoder
Copy link
Collaborator

Closed by #881

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug fix (will increment patch version)
Projects
None yet
Development

No branches or pull requests

2 participants