
[Refactor]: CDAT Migration Phase 3: testing and documentation update #846

Merged -- 24 commits, Sep 27, 2024

Conversation

tomvothecoder
Collaborator

@tomvothecoder tomvothecoder commented Sep 5, 2024

Description

1. Examples

In the root examples/ directory, there are example run scripts that use the legacy way of configuring and running E3SM diagnostics. They should all be refactored to reflect the latest changes.

Tasks

  • Make sure example scripts work and produce results close to or the same as main
    • ex1
    • ex2
    • ex3
    • ex4
    • ex5 -- a few variables failed due to bug FIXED
    • ex6 -- a few variables failed due to bug FIXED
    • ex7
  • Fix incorrect time coordinate for obs dataset affecting ex5 and ex6 variables (comment)
  • Copy fixed obs dataset from NERSC to LCRC -- in progress (@chengzhuzhang)
    • /global/cfs/cdirs/e3sm/e3sm_diags/obs_for_e3sm_diags/climatology/ISCCPCOSP/ISCCPCOSP_ANN_climo.nc

2. Run script with model_vs_model

Tasks

3. Documentation

Tasks

  • Update /tutorials as needed -- none needed
  • Update /tutorials/2024 as needed -- none needed
  • Delete all run_...py scripts (only keep run_v2_9_0_all_sets_E3SM_machines.py)
  • Rename run_v2_9_0_all_sets_E3SM_machines.py to run_all_sets_E3SM_machines.py
  • Update documentation to reflect any changes in the refactored codebase
    • CDP references

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have added tests that prove my fix is effective or that my feature works
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

- Delete outdated run scripts
@tomvothecoder tomvothecoder changed the base branch from main to cdat-migration-fy24 September 5, 2024 22:50
@tomvothecoder
Collaborator Author

@chengzhuzhang The following variables failed in ex5.py and ex6.py:

['/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex5_model_to_obs/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_9.4_ISCCP-ANN-global_ref.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex5_model_to_obs/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_9.4_ISCCP-ANN-global_test.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex5_model_to_obs/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_ISCCP-ANN-global_ref.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex5_model_to_obs/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_ISCCP-ANN-global_test.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex5_model_to_obs/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU9.4_ISCCP-ANN-global_ref.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex5_model_to_obs/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU9.4_ISCCP-ANN-global_test.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex6_zonal_mean_2d_and_lat_lon_demo/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_9.4_ISCCP-ANN-global_ref.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex6_zonal_mean_2d_and_lat_lon_demo/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_9.4_ISCCP-ANN-global_test.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex6_zonal_mean_2d_and_lat_lon_demo/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_ISCCP-ANN-global_ref.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex6_zonal_mean_2d_and_lat_lon_demo/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU1.3_ISCCP-ANN-global_test.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex6_zonal_mean_2d_and_lat_lon_demo/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU9.4_ISCCP-ANN-global_ref.nc',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/examples-dev/ex6_zonal_mean_2d_and_lat_lon_demo/lat_lon/Cloud ISCCP/ISCCPCOSP-CLDTOT_TAU9.4_ISCCP-ANN-global_test.nc']

Root Cause

I found that the time attributes for this file are bad, which breaks decoding (decode_times=True) with xCDAT/Xarray:
/global/cfs/cdirs/e3sm/e3sm_diags/obs_for_e3sm_diags/climatology/ISCCPCOSP/ISCCPCOSP_ANN_climo.nc

Checking time attributes (with decode_times=False)

<xarray.DataArray 'time' (time: 1)> Size: 4B
array([150.5], dtype=float32)
Coordinates:
  * time     (time) float32 4B 150.5
Attributes:
    long_name:  Time
    units:      months since 1983-06

Breaks with decode_times=True

import xarray as xr
import xcdat as xc

filepath = "/global/cfs/cdirs/e3sm/e3sm_diags/obs_for_e3sm_diags/climatology/ISCCPCOSP/ISCCPCOSP_ANN_climo.nc"

ds_xc = xc.open_dataset(filepath)
# ValueError: Non-integer years and months are ambiguous and not currently supported.

ds_xr = xr.open_dataset(filepath)
# ValueError: Failed to decode variable 'time': unable to decode time units
# 'months since 1983-06' with 'the default calendar'. Try opening your dataset
# with decode_times=False or installing cftime if it is not installed.

Workaround

This file can be opened with decode_times=False (which is why it worked in the CDAT codebase), but that means adding logic to our Xarray-based code to accommodate this one bad dataset. I'm not a fan of special-casing bad data.

Unless we can get the time coordinates fixed, this might be the only option. Happy to hear your thoughts.

  • ds = self._open_climo_dataset(filepath)
  • Arguments used when opening climatology datasets:

    # Time coordinates are decoded because there might be cases where
    # a multi-file climatology dataset has different units between files
    # but raw encoded time values overlap. Decoding with Xarray allows
    # concatenation of datasets with this issue (e.g., `area_cycle_zonal_mean`
    # set with the MERRA2_Aerosols climatology datasets).
    # NOTE: This GitHub issue explains why the "coords" and "compat" args
    # are defined as they are below: https://github.com/xCDAT/xcdat/issues/641
    args = {
        "paths": filepath,
        "decode_times": True,
        "add_bounds": ["X", "Y"],
        "coords": "minimal",
        "compat": "override",
    }
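The failure and the decode_times=False workaround can be reproduced without the NERSC file by building an in-memory stand-in with the same time metadata. This is a hedged sketch, not e3sm_diags code; the synthetic dataset below substitutes for the real file:

```python
import numpy as np
import xarray as xr

# In-memory stand-in for the ISCCPCOSP time coordinate (illustrative only;
# the real file also contains data variables).
ds_raw = xr.Dataset(
    coords={
        "time": (
            "time",
            np.array([150.5], dtype=np.float32),
            {"long_name": "Time", "units": "months since 1983-06"},
        )
    }
)

# Decoding fails: fractional "months since" offsets are ambiguous.
try:
    xr.decode_cf(ds_raw)
except ValueError as err:
    print(f"decode failed: {err}")

# Left undecoded (the decode_times=False path), the raw offset survives and
# any code that ignores the time coordinate keeps working, as it did in CDAT.
raw_offset = float(ds_raw["time"].values[0])
print(raw_offset)
```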

@tomvothecoder
Collaborator Author

tomvothecoder commented Sep 9, 2024

To correct myself, xCDAT can handle a missing "calendar" attribute by defaulting it to standard (Example: 2024-09-09 11:55:22,114 [WARNING]: dataset.py(decode_time:360) >> 'time' does not have a calendar attribute set. Defaulting to CF 'standard' calendar.)

# ValueError: Non-integer years and months are ambiguous and not currently supported.

Root cause of this issue:

Related dataset: /global/cfs/cdirs/e3sm/e3sm_diags/obs_for_e3sm_diags/climatology/ISCCPCOSP/ISCCPCOSP_ANN_climo.nc

The issue is that this dataset uses a scalar float time value of 150.5, which represents the middle of the month. xCDAT's decoding logic uses the relativedelta module, which expects whole-month offsets placed at the start of each month (e.g., 150, 151). As a result, relativedelta raises ValueError: Non-integer years and months are ambiguous and not currently supported.

MVCE based on xCDAT logic

from datetime import datetime
from dateutil import parser
from dateutil import relativedelta as rd

import numpy as np

flat_offsets = np.array([150.5])
ref_date = "1983-06"
units_type = "months"

ref_datetime: datetime = parser.parse(ref_date, default=datetime(2000, 1, 1))

# ValueError: Non-integer years and months are ambiguous and not currently supported.
times = np.array(
    [
        ref_datetime + rd.relativedelta(**{units_type: offset})
        for offset in flat_offsets
    ],
    dtype="object",
)

Possible workarounds

  1. Update dataset to replace scalar value 150.5 (representing 12/15/1995 or 12/16/1995 I think) with 150.0 (representing 12/01/1995)
    • Maybe low risk because it is a single time coordinate and it is still within the same month of the climatology. I don't think this time coordinate is used downstream?
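As a sanity check, the proposed snap from 150.5 to 150.0 can be verified with the same relativedelta logic the MVCE uses. A sketch; only the offsets and reference date come from the discussion above:

```python
from datetime import datetime

from dateutil import relativedelta as rd

ref_datetime = datetime(1983, 6, 1)  # "months since 1983-06"

# The original fractional offset is rejected outright by relativedelta.
try:
    rd.relativedelta(months=150.5)
    fractional_supported = True
except ValueError:
    fractional_supported = False

# Snapping 150.5 -> 150.0 stays within the same month and decodes cleanly.
fixed_time = ref_datetime + rd.relativedelta(months=150.0)
print(fixed_time)  # 1995-12-01 00:00:00
```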

@chengzhuzhang
Contributor

Possible workarounds

1. Update dataset to replace scalar value 150.5 (representing 12/15/1995 or 12/16/1995 I think) with 150.0 (representing 12/01/1995)
   
   * Maybe low risk because it is a single time coordinate and it is still within the same month of the climatology. I don't think this time coordinate is used downstream?

Hi Tom, I think a fractional month could be ambiguous, which is why relativedelta does not support it. The workaround you proposed should be fine in this case. I don't think the time coordinate is used later.

@tomvothecoder tomvothecoder self-assigned this Sep 10, 2024
@tomvothecoder tomvothecoder added the cdat-migration-fy24 CDAT Migration FY24 Task label Sep 10, 2024
@tomvothecoder
Collaborator Author

Hey @chengzhuzhang, the file with the fixed time coordinate that needs to be copied from NERSC to LCRC is /global/cfs/cdirs/e3sm/e3sm_diags/obs_for_e3sm_diags/climatology/ISCCPCOSP/ISCCPCOSP_ANN_climo.nc.

I created a backup of this file in the same directory on NERSC just in case. The backup has .bak appended to the filename.

@tomvothecoder
Collaborator Author

tomvothecoder commented Sep 24, 2024

Findings for the "Test run script with model_vs_model run" task

main

  • Viewer link
  • 128 .nc files (not including diffs)
  • 143 .png files (not including diffs)
  • Sets: area_mean_time_series, arm_diags, lat_lon, mp_partition

dev

  • Viewer link
  • 1040 .nc files (not including diffs)
  • 635 .png files (not including diffs)
  • Sets: annual_cycle_zonal_mean, area_mean_time_series, arm_diags, cosp_histogram, lat_lon, meridional_mean_2d, mp_partition, polar, zonal_mean_2d, zonal_mean_2d_stratosphere, zonal_mean_xy

Why is main producing fewer files?

In the main log file, the specific errors that I think are preventing the generation of the remaining files are:

OSError: No file found for  and JJA in /global/cfs/cdirs/e3sm/diagnostics/observations/Atm/climatology
OSError: No file found for  and SON in /global/cfs/cdirs/e3sm/diagnostics/observations/Atm/climatology
OSError: No file found for  and MAM in /global/cfs/cdirs/e3sm/diagnostics/observations/Atm/climatology
OSError: No file found for  and DJF in /global/cfs/cdirs/e3sm/diagnostics/observations/Atm/climatology

What I think is happening

  • Dev branch -- dataset_xr.py has a get_ref_climo_dataset() method that is used by most (or all) sets. This method has a try/except statement that catches the IOError raised by _get_climo_filepath(), allowing sets to continue as a model-only run by substituting the test data for the reference data.
    • dataset_xr.py code:

      def get_ref_climo_dataset(
          self, var_key: str, season: ClimoFreq, ds_test: xr.Dataset
      ):
          """Get the reference climatology dataset for the variable and season.

          If the reference climatology does not exist or could not be found, it
          will be considered a model-only run. For this case the test dataset
          is returned as a default value and subsequent metrics calculations
          will only be performed on the original test dataset.

          Parameters
          ----------
          var_key : str
              The key of the variable.
          season : CLIMO_FREQ
              The climatology frequency.
          ds_test : xr.Dataset
              The test dataset, which is returned if the reference climatology
              does not exist or could not be found.

          Returns
          -------
          xr.Dataset
              The reference climatology if it exists or a copy of the test
              dataset if it does not exist.

          Raises
          ------
          RuntimeError
              If `self.data_type` is not "ref".
          """
          # TODO: This logic was carried over from the legacy implementation.
          # It can probably be improved on by setting `ds_ref = None` and not
          # performing unnecessary operations on `ds_ref` for model-only runs,
          # since it is the same as `ds_test`. In addition, returning ds_test
          # makes debugging difficult.
          if self.data_type == "ref":
              try:
                  ds_ref = self.get_climo_dataset(var_key, season)
                  self.model_only = False
              except (RuntimeError, IOError):
                  ds_ref = ds_test.copy()
                  self.model_only = True

                  logger.info(
                      "Cannot process reference data, analyzing test data only."
                  )
          else:
              raise RuntimeError(
                  "`Dataset._get_ref_dataset` only works with "
                  f"`self.data_type == 'ref'`, not {self.data_type}."
              )

          return ds_ref
  • Main branch -- dataset.py has get_climo_variable(), and only the lat_lon set wraps it in a try/except for model-only runs. This means all other sets fail outright, producing fewer files than dev.
    • dataset.py code:

      def get_climo_variable(self, var, season, extra_vars=[], *args, **kwargs):
          """
          For a given season, get the variable and any extra variables and run
          the climatology on them.

          These variables can either be from the test data or reference data.
          """
          self.var = var
          self.extra_vars = extra_vars

          if not self.var:
              raise RuntimeError("Variable is invalid.")
          if not season:
              raise RuntimeError("Season is invalid.")

          # We need to make two decisions:
          # 1) Are the files being used reference or test data?
          #    - This is done with self.ref and self.test.
          # 2) Are the files being used climo or timeseries files?
          #    - This is done with the ref_timeseries_input and
          #      test_timeseries_input parameters.
          if self.ref and self.is_timeseries():
              # Get the reference variable from timeseries files.
              data_path = self.parameters.reference_data_path
              timeseries_vars = self._get_timeseries_var(data_path, *args, **kwargs)

              # Run climo on the variables.
              variables = [self.climo_fcn(v, season) for v in timeseries_vars]
          elif self.test and self.is_timeseries():
              # Get the test variable from timeseries files.
              data_path = self.parameters.test_data_path
              timeseries_vars = self._get_timeseries_var(data_path, *args, **kwargs)

              # Run climo on the variables.
              variables = [self.climo_fcn(v, season) for v in timeseries_vars]
          elif self.ref:
              # Get the reference variable from climo files.
              filename = self.get_ref_filename_climo(season)
              variables = self._get_climo_var(filename, *args, **kwargs)
          elif self.test:
              # Get the test variable from climo files.
              filename = self.get_test_filename_climo(season)
              variables = self._get_climo_var(filename, *args, **kwargs)
          else:
              msg = "Error when determining what kind (ref or test) "
              msg += "of variable to get and where to get it from "
              msg += "(climo or timeseries files)."
              raise RuntimeError(msg)

          # Needed so we can do:
          #   v1 = Dataset.get_variable('v1', season)
          # and also:
          #   v1, v2, v3 = Dataset.get_variable('v1', season, extra_vars=['v2', 'v3'])
          return variables[0] if len(variables) == 1 else variables

    • lat_lon code:

      mv1 = test_data.get_climo_variable(var, season)
      try:
          mv2 = ref_data.get_climo_variable(var, season)
      except (RuntimeError, IOError):
          mv2 = mv1

          logger.info("Can not process reference data, analyse test data only")
          parameter.model_only = True

Options

  1. Update all sets to use get_climo_dataset() and only add try and except to lat_lon, then remove get_ref_climo_dataset() -- cleanest solution
  2. Update logic on get_ref_climo_dataset() to perform model-only run for lat_lon if reference climo file is not found -- easiest solution
  3. Keep things as is if we want to perform model-only runs -- note, this might lead to slower overall runtime compared to main since more sets and variables are being processed

@tomvothecoder
Collaborator Author

Hey @chengzhuzhang, can you provide your thoughts to my above comment?

@chengzhuzhang
Contributor

Hey @chengzhuzhang, can you provide your thoughts to my above comment?

Hi @tomvothecoder, I'm voting for option 1 as the best. It could also be a good opportunity to address the related issue #823.

- Refactor lat_lon driver to split up functions used for model-only run
- remove `_get_ref_climo_dataset` from `dataset_xr.py`
@chengzhuzhang
Contributor

chengzhuzhang commented Sep 25, 2024

Hey @chengzhuzhang, the file with the fixed time coordinate that needs to be copied from NERSC to LCRC is /global/cfs/cdirs/e3sm/e3sm_diags/obs_for_e3sm_diags/climatology/ISCCPCOSP/ISCCPCOSP_ANN_climo.nc.

I created a backup of this file in the same directory on NERSC just in case. The backup has .bak appended to the filename.

@tomvothecoder I have copied the data to LCRC input data server. We can run mache to sync data to all machines once new e3sm_unified version is deployed.

@tomvothecoder
Collaborator Author

Hey @chengzhuzhang, the file with the fixed time coordinate that needs to be copied from NERSC to LCRC is /global/cfs/cdirs/e3sm/e3sm_diags/obs_for_e3sm_diags/climatology/ISCCPCOSP/ISCCPCOSP_ANN_climo.nc.
I created a backup of this file in the same directory on NERSC just in case. The backup has .bak appended to the filename.

@tomvothecoder I have copied the data to LCRC input data server. We can run mache to sync data to all machines once new e3sm_unified version is deployed.

I found a bug with that file where the variable values were replaced with 0.0 for some reason. I recreated the file on NERSC.

Can you point me to the path and I will copy it over again?

@chengzhuzhang
Contributor

Can you point me to the path and I will copy it over again?

The LCRC path is /lcrc/group/e3sm/public_html/diagnostics/observations/Atm/climatology/ISCCPCOSP. Thank you for taking care of this @tomvothecoder

@tomvothecoder
Collaborator Author

Can you point me to the path and I will copy it over again?

The LCRC path is /lcrc/group/e3sm/public_html/diagnostics/observations/Atm/climatology/ISCCPCOSP. Thank you for taking care of this @tomvothecoder

I can't overwrite the file due to permissions and different group access.

-rw-r--r--  1 ac.zhang40 cels    2238458 Sep  9 17:12 ISCCPCOSP_ANN_climo.nc
-rwxr-xr-x  1 ac.zhang40 E3SM    2162972 Nov  5  2019 ISCCPCOSP_ANN_climo.nc.bak
-rwxr-xr-x  1 ac.zhang40 E3SM    2162972 Nov  5  2019 ISCCPCOSP_DJF_climo.nc
-rwxr-xr-x  1 ac.zhang40 E3SM    2162972 Nov  5  2019 ISCCPCOSP_JJA_climo.nc
-rwxr-xr-x  1 ac.zhang40 E3SM    2162972 Nov  5  2019 ISCCPCOSP_MAM_climo.nc
-rwxr-xr-x  1 ac.zhang40 E3SM    2162972 Nov  5  2019 ISCCPCOSP_SON_climo.nc

Whenever you have time, can you move the latest fixed version of this file from NERSC? Thanks!

Collaborator Author

@tomvothecoder tomvothecoder left a comment


Regression test results

  • example scripts -- all run, within rtol
  • model_vs_model -- 124/128 variables within rtol; 4/128 not, but acceptable due to a bug on main (refer to the PR description)
  • model_vs_obs -- 1213/1215 variables within rtol; 2/1215 outside rtol due to <=8/64800 elements mismatching (0.04% and 1.2% max relative diff) -- not a concern
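For context, the per-variable check amounts to an element-wise comparison within a relative tolerance. This is an illustrative sketch with synthetic arrays, not the actual regression-test code; the rtol value is assumed:

```python
import numpy as np

# Assumed tolerance for illustration; not necessarily the value used in the
# regression scripts.
rtol = 1e-5

# Stand-ins for a variable read from the main and dev .nc outputs
# (64800 elements matches a 180x360 lat-lon grid).
var_main = np.linspace(0.0, 300.0, 64800)
var_dev = var_main * (1.0 + 1e-7)  # tiny numerical drift in the dev run

matches = np.isclose(var_dev, var_main, rtol=rtol)
print(f"{int(matches.sum())}/{matches.size} elements within rtol")
```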

Summary of Changes

e3sm_diags/driver/

  • annual_cycle_zonal_mean_driver.py, cosp_histogram_driver.py, meridional_mean_2d_driver.py, polar_driver.py, zonal_mean_2d_driver.py, zonal_mean_xy_driver.py -- replaced get_ref_climo_dataset() with get_climo_dataset()
  • lat_lon_driver.py
    • Add _run_diags_2d_model_only() and _run_diags_3d_model_only() functions (used when the ds_ref reference dataset is None)
    • Add _get_ref_climo_dataset() -- now returns None if the reference dataset is not found, instead of setting ds_ref = ds_test
    • Refactor _create_metrics_dict()
      • Use DEFAULT_METRICS_VALUE (999.99) as the default value for missing metrics to support the viewer, which otherwise breaks on None
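A minimal sketch of that fallback; the sentinel value comes from the PR, but the helper function and metric names below are hypothetical:

```python
# Only the 999.99 sentinel comes from the PR; the function and dict below
# are illustrative, not the actual e3sm_diags API.
DEFAULT_METRICS_VALUE = 999.99


def metric_or_default(value):
    """Return the metric value, substituting the sentinel when missing."""
    return DEFAULT_METRICS_VALUE if value is None else value


metrics = {"rmse": None, "corr": 0.98}
safe_metrics = {key: metric_or_default(val) for key, val in metrics.items()}
print(safe_metrics)  # {'rmse': 999.99, 'corr': 0.98}
```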

driver/utils/dataset_xr.py

  • Remove get_ref_climo_dataset() since it is only used by lat_lon_driver.py
  • Fix _subset_vars_and_load() to keep "area" variable

Run scripts

  • Delete all outdated run_....py scripts
  • Rename run_v2_9_0_all_sets_E3SM_machines.py to run_all_sets_E3SM_machines.py -- the filename can be version-agnostic and updated based on the latest version

Update complete_run.py

  • Port over functions from run_v2_6_0_all_sets.py needed in this module

@@ -1,12 +1,15 @@
from __future__ import annotations
Collaborator Author


A file of interest for review

@tomvothecoder tomvothecoder marked this pull request as ready for review September 27, 2024 19:07
Collaborator Author

@tomvothecoder tomvothecoder left a comment


Misc. findings

Outdated review threads (resolved):

  • e3sm_diags/driver/zonal_mean_xy_driver.py
  • e3sm_diags/driver/lat_lon_driver.py (two threads)
@chengzhuzhang
Contributor

Whenever you have time, can you move the latest fixed version of this file from NERSC? Thanks!

This one is done!

@tomvothecoder tomvothecoder changed the title [Refactor]: CDAT Migration Phase 3: testing, documentation update, prepare for new release [Refactor]: CDAT Migration Phase 3: testing, performance benchmark, documentation update Sep 27, 2024
Outdated review threads (resolved):

  • docs/source/dev_guide/using-output-viewer.rst (two threads)
Comment on lines -54 to -56
- interacts effectively with the PCMDI's metrics package and the ARM
diagnostics package through a unifying framework: `Community
Diagnostics Package (CDP) <https://github.com/CDAT/cdp>`_.
Collaborator Author


e3sm_diags no longer uses CDP

@tomvothecoder tomvothecoder changed the title [Refactor]: CDAT Migration Phase 3: testing, performance benchmark, documentation update [Refactor]: CDAT Migration Phase 3: testing and documentation update Sep 27, 2024
@tomvothecoder
Collaborator Author

Merging this PR based on my review comment above.

Labels
cdat-migration-fy24 CDAT Migration FY24 Task

Successfully merging this pull request may close these issues.

[Refactor]: CDAT Migration Phase 3: testing, documentation update
2 participants