Conversation

lpsinger
Contributor

The SLURM worker hardcodes --ntasks=1/-n=1, but on the SLURM cluster that I have experience with (SDSC Expanse), you also have to set -N=1/--nodes=1.

@guillaumeeb
Member

These configurations are always tricky and site-specific, and I'm not sure this particular option is really needed on most Slurm clusters. The real question is: is this option available and mandatory in most Slurm cluster configurations?

You can always set it with dask-jobqueue kwargs in your context.
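For example, something like the following should work (a minimal sketch, assuming a recent dask-jobqueue release where the keyword is job_extra_directives; older releases call it job_extra, and the queue/account values here are placeholders):

from dask_jobqueue import SLURMCluster

# Append an extra #SBATCH directive to the generated job script.
cluster = SLURMCluster(
    cores=1,
    memory="4GB",
    queue="debug",                        # placeholder partition
    account="project",                    # placeholder allocation
    job_extra_directives=["--nodes=1"],   # emitted as "#SBATCH --nodes=1"
)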

@lpsinger
Contributor Author

These configurations are always tricky and site-specific, and I'm not sure this particular option is really needed on most Slurm clusters. The real question is: is this option available and mandatory in most Slurm cluster configurations?

I think it is quite likely that it is available in every Slurm cluster configuration, because the option is documented in the sbatch man page. I don't know whether it is mandatory on every cluster, but I also don't see any harm in setting it even where it isn't required. Slurm won't let you set --nodes to a value greater than --ntasks. Here's what happens if you try:

$ srun --pty -A project -p debug --nodes=2 --ntasks=1 --cpus-per-task=1 -t 00:00:05 sleep 1
srun: warning: can't run 1 processes on 2 nodes, setting nnodes to 1
srun: job 38602701 queued and waiting for resources
srun: job 38602701 has been allocated resources
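For completeness, here is a sketch of how to confirm what dask-jobqueue actually submits (the queue/account values are placeholders, and the exact output depends on the dask-jobqueue version and site configuration):

from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(cores=1, memory="4GB", queue="debug", account="project")
# Print the generated sbatch script; with this change its header should
# include a node-count directive alongside the existing --ntasks/-n line.
print(cluster.job_script())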

Member

@guillaumeeb guillaumeeb left a comment


Okay, I agree this should always be available, so I'm OK with adding it if it simplifies things for users.

@guillaumeeb guillaumeeb merged commit 7cbd4b8 into dask:main May 16, 2025
10 of 11 checks passed
@lpsinger
Contributor Author

Thank you!

@lpsinger lpsinger deleted the slurm-nodes-1 branch May 16, 2025 16:17