add GPU AMIP scaling runs #673
See here:
- SLURM_GPU_BIND: none # https://github.com/open-mpi/ompi/issues/11949#issuecomment-1737712291
+ SLURM_GRES_FLAGS: "allow-task-sharing"
@@ -507,6 +507,9 @@ function update_surface_fractions!(cs::CoupledSimulation)
    cs.surface_fractions.ice .= max.(min.(ice_d, FT(1) .- land_s), FT(0))
    cs.surface_fractions.ocean .= max.(FT(1) .- (cs.surface_fractions.ice .+ land_s), FT(0))

    comms_ctx = axes(land_s).grid.topology.context
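The two broadcast assignments in the hunk above can be checked on plain arrays. This is only a sketch: the arrays and their values are made-up stand-ins for the coupler fields, and it assumes the land fraction already lies in [0, 1].

```julia
# Plain-array stand-ins for the coupler fields; the values are invented for illustration.
FT = Float32
land_s = FT[0.0, 0.3, 1.0, 0.6]    # static land fraction, assumed to lie in [0, 1]
ice_d = FT[0.5, 0.9, 0.2, -0.1]    # incoming ice fraction, possibly out of range

# Ice may only fill what land leaves free, and is never negative.
ice = max.(min.(ice_d, FT(1) .- land_s), FT(0))
# Ocean takes whatever remains after ice and land, also clamped at zero.
ocean = max.(FT(1) .- (ice .+ land_s), FT(0))

# Under the assumption on land_s, the three fractions sum to one in every cell.
@assert all(ice .+ ocean .+ land_s .≈ FT(1))
@assert all(FT(0) .<= ice .<= FT(1))
```

In the second cell the requested ice fraction (0.9) exceeds the space left by land, so it is capped and the ocean fraction comes out as approximately zero.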
- comms_ctx = axes(land_s).grid.topology.context
+ comms_ctx = ClimaComms.context(land_s)
I think we have a helper for this, so that you don't need to use internals.
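The reasoning behind the suggestion above can be illustrated with a toy model. Every type and function below is a hypothetical stand-in, not a ClimaCore or ClimaComms definition. The sketch only shows why call sites that go through an accessor are less fragile than call sites that walk nested fields.

```julia
# Hypothetical stand-ins; none of these are ClimaCore or ClimaComms types.
struct ToyContext
    device::Symbol
end
struct ToyTopology
    context::ToyContext
end
struct ToyGrid
    topology::ToyTopology
end
struct ToyField
    grid::ToyGrid
end

# Accessor: the single place that knows how the context is stored.
context(f::ToyField) = f.grid.topology.context

field = ToyField(ToyGrid(ToyTopology(ToyContext(:gpu))))

# A call site written like this breaks if the nesting ever changes.
ctx_internal = field.grid.topology.context
# A call site written like this depends only on the accessor.
ctx_helper = context(field)

@assert ctx_internal === ctx_helper
```

If the nesting of the toy types changed, only the `context` method would need updating, which is the benefit the reviewer is pointing at.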
key: "gpu_amip_dyamond_ws"
command:
  - >
    julia --threads=3 --color=yes --project=experiments/AMIP experiments/AMIP/coupler_driver.jl
Can we add an nsys profile to the non-MPI jobs?
CHAP is no longer our target; we want to set up scaling runs using the benchmarks pipeline and setups in the future.
Purpose
closes #663
note: we want to use the slurm_exclusive: flag to get a GPU that is used only by the performance job, but this does not seem to work when used in Atmos.
Need to use the ClimaAtmos#fdf1df4 commit to include the PR containing the atmos config files. Can't use main because of dependency incompatibilities.
Content
Status
A 2-GPU error is preventing us from getting scaling results; see #687. This needs to be addressed before we can set up reliable scaling runs.