Add custom config for seadragon cluster of MD Anderson Cancer Center #831
base: master
@nf-core-bot fix linting
The only blocker is the missing resourceLimits options; the rest are suggestions.
}

env {
    SINGULARITY_CACHEDIR="/home/$USER/.singularity/cache"
Have you tested that this works, e.g. using --custom_config_base with your pipeline of interest?
For some config scopes Nextflow will interpret that as a Nextflow variable rather than a Bash one. I think env is OK, but it is likely worth double-checking.
Hi James, thank you for the comment. I have tested the config with the RNA-seq and Sarek pipelines. Neither reported errors here, so it should be fine.
envWhitelist='APPTAINERENV_NXF_TASK_WORKDIR,APPTAINERENV_NXF_DEBUG,APPTAINERENV_LD_LIBRARY_PATH,SINGULARITY_BINDPATH,LD_LIBRARY_PATH,TMPDIR,SINGULARITY_TMPDIR'
autoMounts = true
runOptions = '-B ${TMPDIR:-/tmp}'
cacheDir = "/home/$USER/.singularity/cache"
See comment above
conf/seadragon.config
max_memory = 950.GB // Maximum memory based on E80 nodes
max_cpus = 80 // Maximum CPUs based on E80 nodes
max_time = 240.h // Maximum runtime for long queues
Note that this would mean you will never be able to access the high-mem nodes.
Furthermore, you need to replicate these values in the Nextflow-native resourceLimits directive, e.g.
Lines 12 to 21 in 03af77c
process {
    resourceLimits = [
        memory: 1992.GB,
        cpus: 128,
        time: 168.h
    ]
    executor = 'slurm'
    queue = 'qbic'
    scratch = 'true'
}
The max_* parameters have been deprecated in more recent nf-core pipelines, but you should still keep max_* for backwards compatibility with older pipelines.
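As a rough illustration of the suggestion above, the max_* parameters and the resourceLimits directive could sit side by side as in the sketch below. The numbers are simply the max_* values from the diff under review (950.GB, 80 CPUs, 240.h); the later increase for the highmem and vhighmem nodes is not shown in this conversation, so treat the memory value in particular as a placeholder. resourceLimits is only understood by Nextflow 24.04.0 and later.

```groovy
// Sketch only, not the merged seadragon.config.
// Values are copied from the max_* lines in the reviewed diff and are
// assumed placeholders; adjust them to the largest node you want to reach.

params {
    // Kept for backwards compatibility with older nf-core pipelines
    max_memory = 950.GB
    max_cpus   = 80
    max_time   = 240.h
}

process {
    // Nextflow-native cap on per-task resource requests,
    // used by newer nf-core pipelines instead of max_*
    resourceLimits = [
        memory: 950.GB,
        cpus: 80,
        time: 240.h
    ]
}
```

Keeping both blocks in step means that older pipelines, which read max_*, and newer ones, which rely on resourceLimits, are capped at the same values.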
Updated max_memory to make the highmem and vhighmem nodes available.

## Notes

- **Data Storage**: All intermediate files will be stored in the `work/` directory within the job's launch directory. These files can consume significant space, so it is recommended to delete this directory after the pipeline completes successfully.
You could instead add cleanup = true to the config, so that all files in this directory are deleted when a run completes successfully (if it fails, the intermediate files are not deleted).
e.g.
Line 8 in 03af77c
cleanup = true
Hi James, thanks for pointing that out. The reason I keep the work folder is that I may need to use the BAM files from the Sarek pipeline for other custom analyses. Is there a more elegant way to retain the BAM files? Perhaps by custom-defining them as final outputs?
Co-authored-by: James A. Fellows Yates <[email protected]>
name: New Custom Config For Seadragon
about: A new cluster config For MD Anderson Cancer Center Seadragon Cluster
Please follow these steps before submitting your PR:
- [WIP] in its title
- master branch

Steps for adding a new config profile:
- conf/ directory
- docs/ directory
- nfcore_custom.config file in the top-level directory
- README.md file in the top-level directory
- profile: scope in .github/workflows/main.yml
- .github/CODEOWNERS (**/<custom-profile>** @<github-username>)