Explore replacement of cdscan #80

Open
chengzhuzhang opened this issue Jun 21, 2021 · 4 comments · Fixed by #519

@chengzhuzhang
Collaborator

chengzhuzhang commented Jun 21, 2021

Recent e3sm-unified testing suggested that an additional patch is needed on conda-forge for cdscan (which is a component of cdat). It would be nice to come up with a replacement. cdscan is used to concatenate files before running e3sm-diags and the global mean time series. One possibility is to use xarray's IO to read in multiple files.

@chengzhuzhang
Collaborator Author

chengzhuzhang commented Jun 21, 2021

@tomvothecoder Here is some data from a chrysalis run for testing: /lcrc/group/e3sm/ac.forsyth2/E3SM_simulations/20210528.v2rc3e.piControl.ne30pg2_EC30to60E2r2.chrysalis. It would be a useful test for the cdat -> xcdat transition to see whether we can drop support for cdscan and use xarray instead.

@tomvothecoder
Collaborator

Just some notes about the task; feedback is welcome.

We want to replace cdscan, which is called through e3sm-unified, with another library that can convert/concatenate .nc/.txt files to .xml.

Steps:

  • 1) Check if xarray's IO supports .txt inputs and .xml outputs
    • If xarray does not meet our requirements, pandas might be a possible alternative.
    • pandas can read .txt files into DataFrames, which can be written to .xml using DataFrame.to_xml().
    • xarray has cross-compatibility with pandas
  • 2) Compare xarray/pandas xml output vs. cdscan
    • If cdscan performs some unique/complex operations, we might need to write our own concatenation function using xarray/pandas.
  • 3) Adopt new tool/function with zppy bash scripts
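As a starting point for step 1, here is a minimal sketch of reading a cdscan-style file list and opening the listed files with xarray. It rests on a few assumptions: the list file holds one NetCDF path per line (as `${v}_files.txt` does in the zppy templates), `xarray.open_mfdataset` is given a Python list of paths because it does not read a .txt file list on its own, and dask is installed (which `open_mfdataset` requires). The function names `read_file_list` and `open_time_series` are hypothetical, and this does not produce a cdscan-compatible .xml file.

```python
from pathlib import Path


def read_file_list(list_path):
    """Return the non-empty lines of a cdscan-style file list (one path per line)."""
    lines = Path(list_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def open_time_series(list_path):
    """Lazily open every listed file as one dataset, combined by coordinates."""
    # Imported here so read_file_list stays usable without xarray installed.
    import xarray as xr

    paths = read_file_list(list_path)
    # combine="by_coords" orders the files using their coordinate values (e.g. time).
    return xr.open_mfdataset(paths, combine="by_coords")
```

If this works for the e3sm-diags and global time series use cases, downstream code could take the dataset directly instead of an .xml path, so the .xml output requirement might not need to be met at all. That is a guess to be checked in step 2.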

Lines where cdscan is called:
zppy/templates/global_time_series.bash

zppy/templates/e3sm_diags.bash

  • variables="{{ vars }}"
    for v in ${variables//,/ }
    do
        # Go through the time series files for between year1 and year2, using a step size equal to the number of years per time series file
        for (( year=${y1}; year<=${y2}; year+={{ ts_num_years }} ))
        do
            YYYY=`printf "%04d" ${year}`
            for file in ${ts_dir}/${v}_${YYYY}*.nc
            do
                # Add this time series file to the list of files for cdscan to use
                echo ${file} >> ${v}_files.txt
            done
        done
        # xml file will cover the whole period from year1 to year2
        xml_name=${v}_${Y1}01_${Y2}12.xml
        cdscan -x ${xml_name} -f ${v}_files.txt
        if [ $? != 0 ]; then
            cd ../..
            echo 'ERROR (4)' > {{ prefix }}.status
            exit 1
        fi
    done
    cd ..
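For comparison, the file-gathering part of the loop above could be written in Python roughly as follows. This is only a sketch: `collect_time_series_files` is a hypothetical helper, and its arguments stand in for the template values `ts_dir`, `v`, `y1`, `y2` and `ts_num_years`. It covers building the file list only, not the cdscan call or the status-file error handling.

```python
import glob
import os


def collect_time_series_files(ts_dir, var, y1, y2, ts_num_years):
    """Return paths matching {var}_{YYYY}*.nc for each start year from y1 to y2."""
    files = []
    # Step through the start years, one step per time series file.
    for year in range(y1, y2 + 1, ts_num_years):
        pattern = os.path.join(ts_dir, f"{var}_{year:04d}*.nc")
        # Sort each match set so the result is in a stable, chronological order.
        files.extend(sorted(glob.glob(pattern)))
    return files
```

One difference from the bash version: when a pattern matches nothing, `glob.glob` returns an empty list, whereas the bash loop would append the unexpanded pattern to the list file. The resulting list could be passed straight to `xarray.open_mfdataset` without writing an intermediate .txt file.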

@tomvothecoder
Collaborator

@forsyth2 I tagged you as the primary issue assignee. I'll provide support if needed.

@forsyth2
Collaborator

Reopening since #519 was a partial resolution.
