
Switch from Multi Threading to Multi Processing#206

Draft
r-sharp wants to merge 16 commits into MetOffice:main from r-sharp:switch_to_multiprocessing

Conversation

@r-sharp
Contributor

@r-sharp r-sharp commented Mar 6, 2026

PR Summary

Sci/Tech Reviewer:
Code Reviewer:

Performance tests on the original code, used to scan a full UM clone, revealed that varying the maximum number of workers anywhere between 1 and 64 gave no noticeable performance improvement whatsoever. A small improvement at low thread counts would not have been surprising, but seeing nothing at all was; this is likely because the work is CPU-bound, and CPython's Global Interpreter Lock prevents threads from executing Python code in parallel.

However, there had always been an intention to switch to using multiple processes, as the tasks of opening files and scanning their contents were likely to work well as completely independent tasks on different processes.
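The pattern above can be sketched as follows. This is a minimal, hypothetical illustration, not the checker's actual code: `scan_file`, the line-counting stand-in "scan", and the `*.py` glob are all assumptions for the sake of the example.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def scan_file(path: str) -> tuple[str, int]:
    # Stand-in for the real UMDP3 checks: count lines as a dummy "scan".
    # Each call is fully independent, which is what makes processes suit it.
    with open(path, encoding="utf-8", errors="ignore") as f:
        return path, sum(1 for _ in f)


def scan_tree(root: str, max_workers: int = 2) -> dict[str, int]:
    # Gather the files, then farm them out to worker processes.
    # Unlike threads, separate processes sidestep the GIL for CPU-bound work.
    files = [str(p) for p in Path(root).rglob("*.py")]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(scan_file, files))
```

Because each file scan shares no state with the others, no locking or inter-process communication is needed beyond the pool's own result collection.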

Initial tests of this change have demonstrated timing improvements on the VDI when using 2 processors:

umdp3_checker_timings/2nd_draft_mulitiprocessor_01_time.txt : real 2m38.319s
umdp3_checker_timings/2nd_draft_mulitiprocessor_02_time.txt : real 1m29.531s
umdp3_checker_timings/2nd_draft_mulitiprocessor_03_time.txt : real 1m8.422s
umdp3_checker_timings/2nd_draft_mulitiprocessor_04_time.txt : real 1m5.695s

The numbers in the file names indicate the maximum number of workers specified at runtime.

Also, when submitted to 16 processors on SPICE using salloc --time=30 --mem=8G --ntasks=16 --x11 --bell, reasonable scaling occurs up to the 16 processors requested.

umdp3_checker_timings/SPICE_16proc_UM_01_time.txt : real 2m41.835s
umdp3_checker_timings/SPICE_16proc_UM_02_time.txt : real 1m8.262s
umdp3_checker_timings/SPICE_16proc_UM_04_time.txt : real 0m35.004s
umdp3_checker_timings/SPICE_16proc_UM_08_time.txt : real 0m18.225s
umdp3_checker_timings/SPICE_16proc_UM_16_time.txt : real 0m13.493s
umdp3_checker_timings/SPICE_16proc_UM_32_time.txt : real 0m14.478s

Again, the numbers in the file names indicate the maximum number of workers specified at runtime.
All timings were simply gathered with the bash built-in time command.

Based on these results, setting the default maximum number of workers to 2 is presumed best, anticipating use on the VDI. Automated use, such as within rose-stem or as a GitHub Action, can specify other values more suitable for those environments.
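A command-line default like this might be exposed as sketched below. The option name --max-workers and the parser are illustrative assumptions, not the checker's real interface.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical CLI sketch: a default of 2 workers suits the VDI,
    # while rose-stem or GitHub Actions can pass a larger value explicitly.
    parser = argparse.ArgumentParser(description="UMDP3 checker (sketch)")
    parser.add_argument(
        "--max-workers",
        type=int,
        default=2,
        help="maximum number of worker processes (default: 2)",
    )
    return parser.parse_args(argv)
```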

Code Quality Checklist

  • I have performed a self-review of my own code
  • My code follows the project's style guidelines
  • Comments have been included that aid understanding and enhance the readability of the code
  • My changes generate no new warnings
  • All automated checks in the CI pipeline have completed successfully

Testing

  • This change has been tested appropriately (please describe)

Run on the command line on both the VDI and SPICE to test a full clone of the UM and also a UM branch.
Results were as expected: performance on 2 or more processors was improved.

Security Considerations

  • I have reviewed my changes for potential security issues
  • Sensitive data is properly handled (if applicable)
  • Authentication and authorisation are properly implemented (if applicable)

AI Assistance and Attribution

  • Some of the content of this change has been produced with the assistance of Generative AI tool name (e.g., Met Office Github Copilot Enterprise, Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the Simulation Systems AI policy (including attribution labels)

AI has been used for line completion. Curiously, quite a bit of what it suggested (using multiprocessing.Pool instead of ThreadPoolExecutor) was taken back out to make the code easier to follow, and removing it did not affect performance.
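For the independent-task workload described here, the two APIs mentioned are largely interchangeable, as this sketch shows (the square function and wrappers are illustrative only, not code from this PR):

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Pool


def square(n: int) -> int:
    # Trivial stand-in for an independent, CPU-bound task.
    return n * n


def with_executor(items, workers=2):
    # concurrent.futures style: the higher-level, newer interface.
    with ProcessPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(square, items))


def with_pool(items, workers=2):
    # multiprocessing.Pool style: the older interface, same result here.
    with Pool(processes=workers) as pool:
        return pool.map(square, items)
```

Since both produce the same behaviour for a simple map over independent tasks, preferring whichever reads more clearly in context is a reasonable call.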

Sci/Tech Review

  • I understand this area of code and the changes being added
  • The proposed changes correspond to the pull request description
  • Documentation is sufficient (do documentation papers need updating)
  • Sufficient testing has been completed

(Please alert the code reviewer via a tag when you have approved the SR)

Code Review

  • All dependencies have been resolved
  • Related Issues have been properly linked and addressed
  • Code quality standards have been met
  • Tests are adequate and have passed
  • Security considerations have been addressed
  • Performance impact is acceptable
