Use GNU parallel to run consequences processing in parallel #61

Open: anthonyfok wants to merge 1 commit into master from use-GNU-parallel-for-consequences-calc

anthonyfok
Member

Taking advantage of multiple CPU cores, multiple python3 instances are
dispatched simultaneously using "GNU parallel" in run_OQStandard.sh
for consequences calculations.

Using `bash scripts/run_OQStandard.sh SCM5p8_Montreal_conv -h -r -d -o`
as an example, with each realization taking 82 minutes, running 16 realizations
in parallel instead of in series saves 20.5 hours per pass (15 × 82 minutes).
As the consequences calculations are done twice, the total run time is reduced
by 41 hours, from 56 hours down to 15 hours on a c5a.24xlarge EC2 instance.
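
The arithmetic behind these figures can be checked with a short sketch. It assumes that all 16 realizations really do run at the same time (that is, at least 16 free cores) and that startup overhead is negligible; the function name `hours_saved` is made up for illustration and is not part of this PR.

```python
def hours_saved(minutes_each, count, passes=1):
    """Hours saved by running `count` equal-length jobs concurrently instead of serially."""
    serial_minutes = minutes_each * count   # one job after another
    parallel_minutes = minutes_each         # all jobs at once (assumes enough free cores)
    return passes * (serial_minutes - parallel_minutes) / 60


if __name__ == "__main__":
    print(hours_saved(82, 16))            # one pass of the consequences calculations
    print(hours_saved(82, 16, passes=2))  # both passes
```

With 82 minutes per realization this gives 20.5 hours for one pass and 41 hours for two, matching the numbers quoted above.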

Unlike Python’s own multiprocessing module, GNU parallel launches separate,
independent Python processes that share no memory at all, which avoids any
risk of the mysterious calculation discrepancies with NumPy’s OpenBLAS dot
products seen in superseded Pull Request #58.
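
As a rough illustration of the dispatch pattern (not the actual change to run_OQStandard.sh, which is a shell script), the sketch below only builds and prints a GNU parallel command line. The script path is taken from the linked issue title; passing a realization index as the sole argument is a hypothetical stand-in, since the real arguments are not shown here.

```python
import shlex

# Script path from the linked issue title; the per-realization argument is hypothetical.
SCRIPT = "scripts/consequences-v3.10.0.py"
REALIZATIONS = [str(i) for i in range(16)]


def build_parallel_command(script, inputs, jobs):
    """Return an argv list asking GNU parallel to start one python3 process per input."""
    # "{}" is GNU parallel's placeholder for the current input;
    # ":::" separates the command template from the list of inputs.
    return ["parallel", "--jobs", str(jobs), "python3", script, "{}", ":::", *inputs]


if __name__ == "__main__":
    command = build_parallel_command(SCRIPT, REALIZATIONS, jobs=16)
    # Print a shell-safe version of the command rather than running it.
    print(shlex.join(command))
```

Each `python3` started this way is its own operating-system process with its own interpreter and its own copy of NumPy, so nothing is shared between realizations.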

Fixes #57


Hey @tieganh and @jeremyrimando,

All comments welcome!

@jvanulde
Contributor

@anthonyfok any value in sharing these findings with @micheles at GEM?

@anthonyfok
Member Author

anthonyfok commented Sep 22, 2023

> @anthonyfok any value in sharing these findings with @micheles at GEM?

@jvanulde Great idea! Sorry for taking so long to get back to you. I'll try to send an email to Micheles et al. today.


2024-01-09 update: Didn't email Micheles directly, but finally opened the following issue:

@anthonyfok anthonyfok force-pushed the use-GNU-parallel-for-consequences-calc branch 2 times, most recently from c480417 to d049efc Compare October 31, 2023 18:53
@anthonyfok anthonyfok marked this pull request as draft October 31, 2023 18:58
@anthonyfok anthonyfok force-pushed the use-GNU-parallel-for-consequences-calc branch from d049efc to c4e1600 Compare November 2, 2023 12:02
@anthonyfok anthonyfok self-assigned this Nov 2, 2023
@anthonyfok anthonyfok added this to In progress in Data via automation Nov 2, 2023
@anthonyfok anthonyfok marked this pull request as ready for review November 2, 2023 12:09
Successfully merging this pull request may close this issue:

Investigate if scripts/consequences-v3.10.0.py could be optimized