
timing tests with and without Podman/MPI #204

Open · moustakas opened this issue Dec 31, 2024 · 2 comments

moustakas (Member) commented Dec 31, 2024:

@jdbuhler @sbailey

Following up on #203, here are some timing tests with main, with and without parallelization, and with different underlying software stacks. All of these tests were carried out on a single interactive node:

salloc -N 1 -C cpu -A desi -t 04:00:00 --qos interactive

Once a decision is made on which software stack to use, I'll do some more extensive benchmarking to determine the appropriate number of MPI tasks.

Edit: the desiconda times (tests 5 and 6) were updated to include the numba cache.

| Prefix | Number of Targets | Number of MPI tasks | Time | Code | Notes | Command |
| --- | --- | --- | --- | --- | --- | --- |
| timing-test1 | 1 | 1 | 26.7 s | Podman container desihub/fastspecfit:3.1.1 | No mkl_fft. | `time srun --ntasks=1 podman-hpc run --rm --mpi --group-add keep-groups --volume=/dvs_ro/cfs/cdirs:/dvs_ro/cfs/cdirs --volume=/global/cfs/cdirs:/global/cfs/cdirs --volume=$PSCRATCH:/scratch desihub/fastspecfit:3.1.1 mpi-fastspecfit --outdir-data=/scratch/timing-test1 --mp=1 --survey=sv1 --program=bright --healpix=22746 --targetids=39627671176481414 --profile` |
| timing-test2 | 1 | 1 | 24.4 s | Podman container desihub/fastspecfit:3.1.1b | With mkl_fft; crashes. | `time srun --ntasks=1 podman-hpc run --rm --mpi --group-add keep-groups --volume=/dvs_ro/cfs/cdirs:/dvs_ro/cfs/cdirs --volume=/global/cfs/cdirs:/global/cfs/cdirs --volume=$PSCRATCH:/scratch desihub/fastspecfit:3.1.1b mpi-fastspecfit --outdir-data=/scratch/timing-test2 --mp=1 --survey=sv1 --program=bright --healpix=22746 --targetids=39627671176481414 --profile` |
| timing-test5 | 1 | 1 | 28.9 s | desiconda/main and fastspecfit/main | No mkl_fft (see this ticket). | `time srun --ntasks=1 mpi-fastspecfit --outdir-data=$PSCRATCH/timing-test5 --mp=1 --survey=sv1 --program=bright --healpix=22746 --targetids=39627671176481414 --profile` |
| timing-test3 | 144 | 32 | 156.5 s (104.7 s with 128 MPI ranks) | Podman container desihub/fastspecfit:3.1.1 | No mkl_fft. | `time srun --ntasks=32 podman-hpc run --rm --mpi --group-add keep-groups --volume=/dvs_ro/cfs/cdirs:/dvs_ro/cfs/cdirs --volume=/global/cfs/cdirs:/global/cfs/cdirs --volume=$PSCRATCH:/scratch desihub/fastspecfit:3.1.1 mpi-fastspecfit --outdir-data=/scratch/timing-test3 --survey=sv1 --program=bright --healpix=22746 --mp=32 --profile` |
| timing-test4 | 144 | 32 | 142.6 s (95.5 s with 128 MPI ranks) | Podman container desihub/fastspecfit:3.1.1b | With mkl_fft; crashes. | `time srun --ntasks=32 podman-hpc run --rm --mpi --group-add keep-groups --volume=/dvs_ro/cfs/cdirs:/dvs_ro/cfs/cdirs --volume=/global/cfs/cdirs:/global/cfs/cdirs --volume=$PSCRATCH:/scratch desihub/fastspecfit:3.1.1b mpi-fastspecfit --outdir-data=/scratch/timing-test4 --survey=sv1 --program=bright --healpix=22746 --mp=32 --profile` |
| timing-test6 | 144 | 32 | 165.5 s (118.9 s with 128 MPI ranks) | desiconda/main and fastspecfit/main | No mkl_fft (see this ticket). | `time srun --ntasks=32 mpi-fastspecfit --outdir-data=$PSCRATCH/timing-test6 --mp=32 --survey=sv1 --program=bright --healpix=22746 --profile` |

Note: desihub/fastspecfit:3.1.1 and desihub/fastspecfit:3.1.1b are nearly identical Podman containers that respectively exclude and include mkl_fft. Currently, container desihub/fastspecfit:3.1.1b crashes with an obscure error:

```
Fatal Python error: _Py_GetConfig: the function must be called with the GIL held, after Python initialization
and before Python finalization, but the GIL is released (the current Python thread state is NULL)
Python runtime state: finalizing (tstate=0x00005622c660ae50)
```

This crash has been traced to mkl_fft and is being tracked in NERSC ticket INC0228131.
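Since the presence of mkl_fft is the only intended difference between the two containers, a quick environment check can confirm which FFT backend a given software stack will use. This is a hypothetical diagnostic, not something from the NERSC ticket:

```python
import importlib.util

# Report whether mkl_fft is importable in the current environment; when it is
# absent, numpy falls back to its bundled pocketfft backend.
has_mkl_fft = importlib.util.find_spec("mkl_fft") is not None
if has_mkl_fft:
    print("mkl_fft is installed; numpy FFTs may be routed through MKL")
else:
    print("mkl_fft is not installed; numpy uses its bundled pocketfft backend")
```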

moustakas (Member, Author) commented:

A new variant of timing-test3 (Podman container desihub/fastspecfit:3.1.1, i.e., no mkl_fft) run with 1 MPI rank and 32 multiprocessing cores finished in 119 seconds, about 25% faster than the pure-MPI version of the equivalent code (156.5 seconds with 32 ranks). At a factor of 128 parallelism the gap widens: 64.3 seconds versus 104.7 seconds, about 40% faster.

So these results suggest that relying exclusively on MPI is actually detrimental (for FastSpecFit, at least).
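As a sanity check on the quoted percentages, the exact values work out to roughly 24% and 39%, which round to the figures above:

```python
# Recompute the speedups from the raw wall times quoted in this comment.
def pct_faster(old_s, new_s):
    """Percent reduction in wall time going from old_s to new_s."""
    return 100.0 * (old_s - new_s) / old_s

print(f"32-way:  {pct_faster(156.5, 119.0):.0f}% faster")
print(f"128-way: {pct_faster(104.7, 64.3):.0f}% faster")
```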

I'm going to benchmark a mix of MPI ranks and mp cores on a fixed dataset and see what combination works best.
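The benchmark grid can be sketched by holding the total parallelism fixed and trading MPI ranks against multiprocessing cores. This assumes a fixed budget of 256 (2 nodes × 128 cores); the command line printed is illustrative, not a verbatim job script:

```python
# Enumerate (ntasks, mp) combinations at a fixed total parallelism of 256.
TOTAL = 256
combos = [(ntasks, TOTAL // ntasks) for ntasks in (32, 16, 8, 4, 2)]
for ntasks, mp in combos:
    print(f"srun --ntasks={ntasks} mpi-fastspecfit --mp={mp} ...")
```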

moustakas (Member, Author) commented:

Well, the Podman experiment was, in the end, not viable at scale; every day brought a different error when trying to use a container at NERSC.

So here are timing tests with the DESI software stack, N=2 nodes, and a varying mix of MPI tasks (ntasks) and multiprocessing cores (mp), run on a test sample of ~70,000 targets spread across 111 redrock files and all surveys and programs:

| Job ID | N | ntasks | mp | Time |
| --- | --- | --- | --- | --- |
| 34596770 | 2 | 32 | 8 | 00:27:12 |
| 34596763 | 2 | 16 | 16 | 00:31:21 |
| 34596773 | 2 | 8 | 32 | 00:32:16 |
| 34596771 | 2 | 4 | 64 | 00:39:27 |
| 34596767 | 2 | 2 | 128 | 01:02:26 |
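For comparison across configurations, the wall times above can be converted to throughput on the 70,168-target sample. A small worked-arithmetic sketch:

```python
# Convert each wall time (HH:MM:SS) to seconds and compute targets/second.
n_targets = 70_168
runs = {
    (32, 8):  "00:27:12",
    (16, 16): "00:31:21",
    (8, 32):  "00:32:16",
    (4, 64):  "00:39:27",
    (2, 128): "01:02:26",
}

def to_seconds(hms: str) -> int:
    h, m, s = (int(x) for x in hms.split(":"))
    return 3600 * h + 60 * m + s

for (ntasks, mp), t in runs.items():
    sec = to_seconds(t)
    print(f"ntasks={ntasks:2d} mp={mp:3d}: {sec:4d} s, {n_targets / sec:5.1f} targets/s")
```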
```
mpi-fastspecfit --outdir-data /pscratch/sd/i/ioannis/fastspecfit/loa-nXmpX --mp=32 --plan \
  --healpix=2240,2241,16297,16298,26031,4321,5957,5958,41936,41952,11165,11167,25598,25599,9512,9514,26278,11093,8954,8955,31624,31625,6638,6640,40855,40861

INFO:mpi.py:281:plan: Found 111/111 redrockfiles left.
INFO:mpi.py:329:plan: Skipping 14 redrockfiles with no targets.
INFO:mpi.py:334:plan: Number of targets left: 70,168.
INFO:mpi.py:346:plan:    main:bright: 22 redrockfiles, 18,402 targets
INFO:mpi.py:346:plan:      main:dark: 19 redrockfiles, 28,557 targets
INFO:mpi.py:346:plan:    main:backup: 13 redrockfiles, 50 targets
INFO:mpi.py:346:plan:  special:bright: 7 redrockfiles, 900 targets
INFO:mpi.py:346:plan:   special:dark: 2 redrockfiles, 275 targets
INFO:mpi.py:346:plan:      cmx:other: 2 redrockfiles, 458 targets
INFO:mpi.py:346:plan:     sv1:bright: 5 redrockfiles, 1,090 targets
INFO:mpi.py:346:plan:       sv1:dark: 3 redrockfiles, 730 targets
INFO:mpi.py:346:plan:      sv1:other: 1 redrockfiles, 1 targets
INFO:mpi.py:346:plan:     sv2:bright: 3 redrockfiles, 1,430 targets
INFO:mpi.py:346:plan:       sv2:dark: 4 redrockfiles, 1,421 targets
INFO:mpi.py:346:plan:     sv2:backup: 2 redrockfiles, 23 targets
INFO:mpi.py:346:plan:     sv3:bright: 5 redrockfiles, 5,359 targets
INFO:mpi.py:346:plan:       sv3:dark: 5 redrockfiles, 11,403 targets
INFO:mpi.py:346:plan:     sv3:backup: 4 redrockfiles, 69 targets
```