Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make GC deterministic in distributed #821

Merged
merged 4 commits into from
Oct 3, 2022
Merged

make GC deterministic in distributed #821

merged 4 commits into from
Oct 3, 2022

Conversation

simonbyrne
Copy link
Member

@simonbyrne simonbyrne commented Sep 7, 2022

PULL REQUEST

Purpose and Content

This should reduce MPI Waitall time by manually triggering the GC across all processes at the same time.

Benefits and Risks

The number of steps will require some tuning to avoid out-of-memory errors

Linked Issues

PR Checklist

  • This PR has a corresponding issue OR is linked to an SDI.
  • I have followed CliMA's codebase contribution and style guidelines OR N/A.
  • I have followed CliMA's documentation policy.
  • I have checked all issues and PRs and I certify that this PR does not duplicate an open PR.
  • I linted my code on my local machine prior to submission OR N/A.
  • Unit tests are included OR N/A.
  • Code used in an integration test OR N/A.
  • All tests ran successfully on my local machine OR N/A.
  • All classes, modules, and function contain docstrings OR N/A.
  • Documentation has been added/updated OR N/A.

@simonbyrne simonbyrne marked this pull request as draft September 7, 2022 22:33
@simonbyrne
Copy link
Member Author

@simonbyrne simonbyrne force-pushed the sb/gc branch 2 times, most recently from 75610ea to 8dac62c Compare September 13, 2022 17:56
@simonbyrne simonbyrne force-pushed the sb/gc branch 2 times, most recently from 4ae134e to 8fd22df Compare September 21, 2022 23:23
@simonbyrne
Copy link
Member Author

I've added a small check so that GC is only called if free memory falls below a certain threshold

@simonbyrne simonbyrne marked this pull request as ready for review October 2, 2022 21:49
@simonbyrne
Copy link
Member Author

It appears to boost scaling considerably, I think it's ready to go. Here are the low-res scaling benchmarks

Before:
low-resolution_sypd-old

After:
low-resolution_sypd-new

Copy link
Member

@charleskawczynski charleskawczynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@simonbyrne
Copy link
Member Author

bors r+

@bors
Copy link
Contributor

bors bot commented Oct 3, 2022

Build succeeded:

@bors bors bot merged commit 9d5d78e into main Oct 3, 2022
@bors bors bot deleted the sb/gc branch October 3, 2022 00:53
bors bot added a commit that referenced this pull request Oct 5, 2022
891: Revert "make GC deterministic in distributed" r=szy21 a=szy21

revert #821, which may cause the failure in mpi jobs.

Co-authored-by: jiahe23 <[email protected]>
bors bot added a commit that referenced this pull request Oct 5, 2022
891: Revert "make GC deterministic in distributed" r=szy21 a=szy21

revert #821, which may cause the failure in mpi jobs.

Co-authored-by: jiahe23 <[email protected]>
bors bot added a commit that referenced this pull request Oct 5, 2022
891: Revert "make GC deterministic in distributed" r=szy21 a=szy21

revert #821, which may cause the failure in mpi jobs.

Co-authored-by: jiahe23 <[email protected]>
@simonbyrne simonbyrne mentioned this pull request Oct 6, 2022
3 tasks
simonbyrne added a commit that referenced this pull request Oct 14, 2022
Adds GC callbacks. Behavior can be controlled by an environment variable
CLIMAATMOS_GC_NSTEPS (default = every 1000 steps).

Unlike #821, this doesn't disable the GC, but by manually calling it we should be able to
avoid the runtime needing to call it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants