build snpeff database at installation not run time. #78

Open
poquirion opened this issue Apr 28, 2021 · 3 comments

Comments

@poquirion
Contributor

poquirion commented Apr 28, 2021

Right now snpeff downloads its database at run time, and on top of that it installs it in the CONDA_PREFIX folder.
There are at least three cases where this will crash the pipeline:

1- When the pipeline is run on a system with no internet access.
2- When the pipeline is run in a (read-only) container.
3- When conda is installed as one user and the pipeline is run as another user.
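
The three cases above can be told apart before the pipeline starts. The following is a minimal sketch, not code from the pipeline: the function name `preflight` and the idea of passing the database directory as an argument are assumptions made for illustration, and the actual location under CONDA_PREFIX would have to be filled in by the caller.

```python
import os
from pathlib import Path


def preflight(db_dir: Path) -> str:
    """Classify whether a run-time download into db_dir could succeed.

    Hypothetical helper: db_dir is wherever the snpeff database is
    expected to live (an assumption, supplied by the caller).
    """
    # Database already installed: no download is needed at run time.
    if db_dir.is_dir() and any(db_dir.iterdir()):
        return "present"

    # Walk up to the closest existing ancestor; a download would have
    # to create files underneath it.
    parent = db_dir
    while not parent.exists():
        parent = parent.parent

    # Read-only container, or an environment owned by another user.
    if not os.access(parent, os.W_OK):
        return "not-writable"

    # Writable, but the download still requires internet access.
    return "needs-download"
```

Only the "present" result is safe on an offline or read-only system; the other two correspond to the failure cases listed above.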

@poquirion
Contributor Author

poquirion commented Apr 28, 2021

For the version we run at the Genome Center, I handle this by adding the following line at the end of the Dockerfile:

RUN bash -ic '/app/scripts/build_db.py'

and removing the rules build_snpeff_db and download_db_files from the workflow/rules/annotation.smk file.

@rdeborja
Collaborator

The snakemake file does have a dependency on CONDA_PREFIX for the database, as mentioned. The goal was to simplify the process, and it was set up with conda in mind.

(1) Yes, if there is no internet access and the snpeff db has not been downloaded beforehand, it will fail.

(2) This would be correct, assuming the container is not a copy that already has the snpeff db downloaded.

(3) Does the other user have access to the conda environment, or are they completely isolated? If they are isolated, they will need to perform the download independently.

It seems like you installed snpeff outside conda. Is this correct?

@poquirion
Contributor Author

poquirion commented Apr 30, 2021

Question:
If the db is already downloaded, will the steps be skipped automatically? If yes, then I will just install the db in the container in the deployment script. This will make our life on the CC system easier.

As for your question, it is installed in the conda environment, since RUN bash -ic '/app/scripts/build_db.py' will be the last line in the Dockerfile and bash -ic '' forces the conda env to be loaded.
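
On the skip question: Snakemake normally does not rerun a rule whose output files already exist and are up to date, so a database installed at build time would be expected to satisfy the download and build rules, provided it sits at exactly the paths those rules declare as outputs. The same "only build when missing" behaviour can be sketched in plain Python. This is an illustration under assumptions, not the content of build_db.py: `ensure_db`, `marker`, and `build` are hypothetical names.

```python
from pathlib import Path
from typing import Callable


def ensure_db(marker: Path, build: Callable[[Path], None]) -> bool:
    """Run build only when marker is missing; return True if it ran.

    marker is a file whose presence means the database is installed
    (hypothetical; the real pipeline defines its own output paths).
    build is whatever actually downloads and builds the database.
    """
    if marker.exists():
        # Already installed, e.g. baked into the container image.
        return False
    marker.parent.mkdir(parents=True, exist_ok=True)
    build(marker)
    return True
```

Called once in the deployment script and again at run time, the second call is a no-op, which is the behaviour asked about above.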
