Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation about memory vs 'job_mem` could be improved #634

Open
Andrew-S-Rosen opened this issue Mar 6, 2024 · 6 comments · May be fixed by #676
Open

Documentation about memory vs 'job_mem` could be improved #634

Andrew-S-Rosen opened this issue Mar 6, 2024 · 6 comments · May be fixed by #676

Comments

@Andrew-S-Rosen
Copy link
Contributor

Andrew-S-Rosen commented Mar 6, 2024

The SLRUMCluster documentation's description for the memory keyword argument is a bit ambiguous when comparing against job_mem.

We have:

  • memory: Total amount of memory per job
  • job_mem: Amount of memory to request in SLURM. If None, defaults to worker processes * memory

The documentation states that a "job" is a Slurm allocation. So, this leads us to some potentially confusing docs.

@Andrew-S-Rosen Andrew-S-Rosen changed the title Unclear documentation about the memory keyword argument Documentation about memory vs 'job_mem` could be improved Mar 7, 2024
@guillaumeeb
Copy link
Member

You are right, this is the memory documentation that is correct, but not precise enough, this should be something like:

  • memory: Total amount of memory to be used by all workers inside a job. Used by job queuing system by default as amount of memory per job.
  • job_mem: Amount of memory to request in SLURM for each job, If None, defaults to memory.

@maneesh29s
Copy link
Contributor

I agree with the improvements mentioned here.
Similarly, can we modify the description of "cores" and "job_cpu" parameters too?

Something like this:

  • cores: Total number of CPU cores on which all the worker threads inside a job will run. The number of threads per worker process are determined using formula cores / processes. Used by job queuing system by default as amount of CPUs per job.

  • job_cpu: Number of CPUs to request in SLURM for each job. This option is useful to request more CPUs than total number of worker threads, in cases when a dask task might run multi-threaded using OpenMP or other means. If None, defaults to cores.

@maneesh29s
Copy link
Contributor

@guillaumeeb Would like to know your thoughts on the above comment regarding description of CPU parameters.

@guillaumeeb
Copy link
Member

Thanks @maneesh29s for the proposal and sorry for the delay.

Just a small suggestion:
job_cpu: Number of CPUs to request in SLURM for each job. This option might be useful to request more CPUs than total number of worker threads for some complex non Python code. If None, defaults to cores.

@maneesh29s
Copy link
Contributor

I agree with your suggestion.

And I am happy to raise a PR to close this issue.

@guillaumeeb
Copy link
Member

Sure, please do so!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants