Initial commit of modified hopper workflow files. #6
base: develop
@ytangnoaa @zmoon @bbakernoaa With a new workaround I put in for Hopper, plus some changes to the machine file for Hopper's default sbatch partition and queue/account, the steps above and the following command now successfully generate AND launch workflow tasks on Hopper, and add the launch command to your crontab:
|
The launch_FV3LAM_wflow.sh workaround on Hopper (resetting modules) does not really work once the tasks run, because the correct modules still need to be loaded: Loading modules for task "get_extrn_ics" ...
We need a better way to get the correct modules loaded, for example as on Hera: Currently Loaded Modules:
We need the regional_workflow environment to load successfully during launch_FV3LAM_wflow.sh, as it does on Hera (e.g., "miniconda_regional_workflow"). |
Patrick, jinja2 comes from the Python package, which is currently included in python/3.9.9-jh. However, that Python package is still missing "f90nml". Building another miniconda version of Python may help. |
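A quick way to see which of these packages a given interpreter can import is sketched below. This is only an illustration and not part of the PR: the function name find_missing_py_modules is hypothetical, and it assumes the interpreter passed in accepts -c, as python3 does.

```shell
# Print the names of Python modules that the given interpreter cannot import.
# $1: interpreter command (for example python3); remaining args: module names.
find_missing_py_modules() {
    py=$1
    shift
    for mod in "$@"; do
        # "import <name>" exits non-zero when the module is not installed.
        if ! "$py" -c "import $mod" >/dev/null 2>&1; then
            printf '%s\n' "$mod"
        fi
    done
}

# Example (the two modules named in this thread):
# find_missing_py_modules python3 jinja2 f90nml
```

Running it once on a login node and once inside a task's job script, with the same interpreter name, would show whether the Python environment differs between the two.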
Thanks Youhua. Maybe I'm wrong, but I've created and activated the
'regional_workflow' environment (from the included Hopper environment YAML
file), which contains jinja2 and the other packages needed to successfully
generate the workflow. The question, I think, is how to get these loaded
when launching the workflow as well.
Can you help test?
On Fri, Mar 24, 2023, 2:02 PM Youhua Tang ***@***.***> wrote:
The launch_FV3LAM_wflow.sh workaround on Hopper (resetting modules) does
not really work once the tasks run, because the correct modules still need
to be loaded:
Loading modules for task "get_extrn_ics" ...
Currently Loaded Modules:
  1) use.own     4) gnu9/9.3.0         7) hwloc/2.1.0
  2) autotools   5) ucx/1.8.0          8) openmpi4/4.0.4
  3) prun/2.0    6) libfabric/1.10.1   9) hosts/hopper
...
ModuleNotFoundError: No module named 'jinja2'
We need a better way to get the correct modules loaded, for example as on
Hera:
Loading modules for task "get_extrn_ics" ...
Currently Loaded Modules:
  1) hpss/hpss           3) *miniconda_regional_workflow*
  2) miniconda3/4.12.0   4) get_extrn_ics.local
We need the regional_workflow environment to load successfully during
launch_FV3LAM_wflow.sh, as it does on Hera (e.g., "miniconda_regional_workflow").
Patrick, jinja2 comes from the Python package, which is currently included
in python/3.9.9-jh.
However, that Python package is still missing "f90nml". Building another
miniconda version of Python may help.
|
We can also just use rocoto to launch the tasks directly instead of going through the bash launch script.
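A minimal sketch of advancing the workflow with rocoto directly. It assumes rocotorun is on PATH and that the experiment directory holds the workflow definition and database under the names FV3LAM_wflow.xml and FV3LAM_wflow.db. Those file names and the advance_workflow helper are assumptions for illustration, not taken from this PR.

```shell
# Advance a rocoto workflow by one step from inside its experiment directory.
# $1: experiment directory (assumed to contain the XML and database files).
advance_workflow() {
    expt_dir=$1
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$expt_dir" || exit 1
        rocotorun -w FV3LAM_wflow.xml -d FV3LAM_wflow.db -v 10
    )
}

# Example, using the experiment directory mentioned in this PR:
# advance_workflow /scratch/pcampbe8/expt_dirs/aqm_community_aqmna13
```

Each rocotorun call only submits the tasks that are ready at that moment, so it has to be repeated (by hand or from cron) until the workflow completes.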
|
The regional_workflow environment's Python has "jinja2". However, when the tasks are launched via rocoto, the modules actually loaded include
miniconda3/22.11.1-gy, which has no "jinja2" module. |
Exactly, there needs to be a regional workflow module, or the compute nodes
need access to the "regional_workflow" conda environment used in generating
the workflow (after modules are reset to avoid conflicts in the current method).
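One crude way to give a task access to that environment's Python without a dedicated module is to put the environment's bin directory first on PATH in the task's job script. The sketch below is an assumption-laden illustration, not the PR's method: prepend_path is a hypothetical helper, and putting bin on PATH is not a full substitute for conda activate (it does not set the variables that activation sets).

```shell
# Prepend a directory to PATH unless it is already present (idempotent).
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already on PATH, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Example, assuming the env prefix given in environment_hopper_wflow.yml:
# prepend_path /groups/ESS/pcampbe8/anaconda3/envs/regional_workflow/bin
```

Because the helper is idempotent, it can be sourced from both .bashrc and a task script without growing PATH on every call.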
…On Fri, Apr 14, 2023, 3:37 PM Youhua Tang ***@***.***> wrote:
The regional_workflow's python has "jinja2". However, when launching the
tasks via rocoto, the actually loaded modules are
  1) ucx/1.8.0          6) prun/2.0          11) rocoto/1.3.5
  2) libfabric/1.10.1   7) hosts/hopper      12) miniconda3/22.11.1-gy
  3) hwloc/2.1.0        8) gnu10/10.3.0-ya   13) wflow_hopper
  4) use.own            9) zlib/1.2.11-2y
  5) autotools         10) ruby/3.1.0-4e
miniconda3/22.11.1-gy has no "jinja2" module
|
We should just continue to follow the Hera example, but I haven't had time to
get back to this.
On Fri, Apr 14, 2023, 4:09 PM Patrick Campbell ***@***.***>
wrote:
… Exactly, there needs to be a regional workflow module, or the compute
nodes need access to the "regional_workflow" conda environment used in
generating the workflow (after modules are reset to avoid conflicts in
current method).
|
This draft PR is a work in progress containing the initial files needed to create a working conda environment and generate a workflow for the UFS-SRW-App [develop] branch for an ATMAQ build on Hopper.
Again, to build the UFS-SRW-App on Hopper (using our develop branches of UWM, UFS_UTILS, and AQM-utils), use the steps from the previously closed PR #5:
After following the UFS-SRW-App build procedure on Hopper, the following will successfully create an environment and generate a workflow on Hopper:
At the bottom of the environment YAML file, environment_hopper_wflow.yml, be sure to change the prefix path, prefix: /groups/ESS/pcampbe8/anaconda3/envs/regional_workflow, to the local directory where you store conda environments. After you initially create your new environment, you only need to reactivate it to generate new workflows: conda activate regional_workflow.
There are some conflicts between the activated conda environment and the loaded modules, hence the module reset before generating the workflow. Thus, when subsequently launching the workflow, e.g., cd /scratch/pcampbe8/expt_dirs/aqm_community_aqmna13 && ./launch_FV3LAM_wflow.sh called_from_cron="TRUE", there is a similar issue with reloading modules vs. the conda environment. I also added the rocoto bin directory to PATH in my .bashrc so that the necessary rocotorun commands can be found: export PATH="/opt/sw/other/apps/rocoto/bin/:$PATH". Further modifications are needed in this draft to avoid these errors.
@ytangnoaa @zmoon @bbakernoaa I would appreciate it if you could run other tests on your end with this method.
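The prefix change described above can be scripted rather than edited by hand. The following is a minimal sketch under stated assumptions: set_env_prefix is a hypothetical helper, the file is assumed to contain a single top-level line starting with prefix:, and target paths containing | or & are not handled.

```shell
# Rewrite the "prefix:" line of a conda environment file in place.
# $1: environment YAML file; $2: new prefix directory.
set_env_prefix() {
    env_file=$1
    new_prefix=$2
    tmp=$(mktemp) || return 1
    # Use "|" as the sed delimiter so slashes in the path need no escaping.
    sed "s|^prefix:.*|prefix: $new_prefix|" "$env_file" > "$tmp" &&
        cat "$tmp" > "$env_file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (the target directory is a placeholder for your own location):
# set_env_prefix environment_hopper_wflow.yml "$HOME/conda_envs/regional_workflow"
```

After the prefix points at a directory you can write to, the environment would typically be created once from the file (for example with conda env create -f environment_hopper_wflow.yml, which is an assumption here since the PR's exact command is not shown) and then only reactivated.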
DESCRIPTION OF CHANGES:
This PR is a draft work in progress containing the initial files needed to create a working conda environment and generate a workflow for the UFS-SRW-App [develop] branch for an ATMAQ build on Hopper.
Type of change
TESTS CONDUCTED:
DEPENDENCIES:
DOCUMENTATION:
ISSUE:
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):