Merge pull request #2 from IARCbioinfo/dev
Dev
Showing 10 changed files with 463 additions and 54 deletions.
@@ -0,0 +1,18 @@
version: 2

jobs:
  build:
    machine: true
    steps:
      - checkout
      - run: cd ~ ; wget -qO- get.nextflow.io | bash ; chmod 755 nextflow ; sudo ln -s ~/nextflow /usr/local/bin/ ; sudo apt-get install graphviz
      - run: cd ~ && git clone https://github.com/iarcbioinfo/data_test.git
      - run: echo " docker.runOptions = '-u $(id -u):$(id -g)' " > ~/.nextflow/config
      - run: cd ~/project/ ; docker build -t iarcbioinfo/bqsr-nf .
      - run: cd ; nextflow run ~/project/BQSR.nf --help
      - run: cd ; nextflow run ~/project/BQSR.nf -with-docker iarcbioinfo/bqsr-nf --input_folder ~/data_test/BAM/ --output_folder BAM_bqsr --cpu 2 --mem 4 --snp_vcf ~/data_test/REF/dbsnp_138.17_7572000-7591000.vcf.gz --indel_vcf ~/data_test/REF/1000G_phase1.indels.17_7572000-7591000.sites.vcf.gz --ref ~/data_test/REF/17_7572000-7591000.fasta -with-dag dag_bqsr.png
      - run: cd ; nextflow run ~/project/BQSR.nf -with-docker iarcbioinfo/bqsr-nf --input_folder ~/data_test/BAM/ --output_folder BAM_bqsr --cpu 2 --mem 4 --snp_vcf ~/data_test/REF/dbsnp_138.17_7572000-7591000.vcf.gz --indel_vcf ~/data_test/REF/1000G_phase1.indels.17_7572000-7591000.sites.vcf.gz --ref ~/data_test/REF/17_7572000-7591000.fasta -with-dag dag_STAR_bqsr.html
      - run: cd ; cp ~/dag* ~/project/.
      - deploy:
          branch: [master, dev]
          command: chmod +x deploy.sh && ./deploy.sh
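
The test run in the CI steps above can be assembled for a local checkout along these lines. This is only a sketch: the helper name `build_ci_command` and the directory arguments are assumptions, and the function prints the command as a dry run rather than executing it, so Nextflow and Docker are not needed to try it.

```shell
# Sketch: print the nextflow command used in the CI test step.
# build_ci_command is a hypothetical helper, not part of the repository.
# $1 = pipeline checkout, $2 = clone of data_test, $3 = DAG output file
build_ci_command() {
  printf '%s ' \
    nextflow run "$1/BQSR.nf" \
    -with-docker iarcbioinfo/bqsr-nf \
    --input_folder "$2/BAM/" \
    --output_folder BAM_bqsr \
    --cpu 2 --mem 4 \
    --snp_vcf "$2/REF/dbsnp_138.17_7572000-7591000.vcf.gz" \
    --indel_vcf "$2/REF/1000G_phase1.indels.17_7572000-7591000.sites.vcf.gz" \
    --ref "$2/REF/17_7572000-7591000.fasta" \
    -with-dag "$3"
  printf '\n'
}

# Dry run: show the command for the paths the CI job uses.
build_ci_command "$HOME/project" "$HOME/data_test" dag_bqsr.png
```

Printing first makes it easy to inspect the paths before piping the line to a shell.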
@@ -0,0 +1,23 @@
################## BASE IMAGE #####################
FROM nfcore/base


################## METADATA #######################

LABEL base_image="nfcore/base"
LABEL version="1.0"
LABEL software="bqsr-nf"
LABEL software.version="2.0"
LABEL about.summary="Container image containing all requirements for bqsr-nf"
LABEL about.home="http://github.com/IARCbioinfo/BQSR-nf"
LABEL about.documentation="http://github.com/IARCbioinfo/BQSR-nf/README.md"
LABEL about.license_file="http://github.com/IARCbioinfo/BQSR-nf/LICENSE.txt"
LABEL about.license="GNU-3.0"

################## MAINTAINER ######################
MAINTAINER **nalcala** <**[email protected]**>

################## INSTALLATION ######################
COPY environment.yml /
RUN conda env create -n bqsr-nf -f /environment.yml && conda clean -a
ENV PATH /opt/conda/envs/bqsr-nf/bin:$PATH
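
The final `ENV PATH` line puts the conda environment's `bin` directory ahead of the existing search path, so tools installed there are found first. A minimal sketch of that lookup behaviour follows; the temporary directory and the `gatk` stub are placeholders for illustration, not the real environment or tool.

```shell
# Sketch: a directory prepended to PATH wins command lookup.
# demo_dir and the stub script stand in for /opt/conda/envs/bqsr-nf/bin.
demo_dir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo "stub gatk"' > "$demo_dir/gatk"
chmod +x "$demo_dir/gatk"

resolve_with_prefix() {
  # Print the path "$2" resolves to when directory "$1" is searched first.
  # The subshell keeps the caller's PATH unchanged.
  ( PATH="$1:$PATH"; command -v "$2" )
}

resolve_with_prefix "$demo_dir" gatk
```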
@@ -1,2 +1,79 @@
# BQSR-nf
Nextflow script for base quality score recalibration of BAM files using GATK

## Nextflow pipeline for base quality score recalibration with GATK processing
[CircleCI build status](https://circleci.com/gh/IARCbioinfo/BQSR-nf/tree/master)
[Docker Hub image](https://hub.docker.com/r/iarcbioinfo/bqsr-nf/)

## Description

Nextflow pipeline for base quality score recalibration and quality control

## Dependencies

1. Nextflow: for common installation procedures see the [IARC-nf](https://github.com/IARCbioinfo/IARC-nf) repository.

2. [*multiQC*](http://multiqc.info/docs/)
3. [*GATK4*](https://software.broadinstitute.org/gatk/guide/quickstart) must be in the PATH variable
4. [GATK bundle](https://software.broadinstitute.org/gatk/download/bundle) VCF files with lists of indels and SNVs (recommended: 1000 genomes indels, dbsnp VCF)

You can provide a config file to customize the multiqc report (see https://multiqc.info/docs/#configuring-multiqc).
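
Whether the tools listed under Dependencies are reachable can be checked before a run with a sketch like the following. The executable names `nextflow`, `gatk`, and `multiqc` are assumed from the list above, and `missing_tools` is a hypothetical helper.

```shell
# Sketch: report which tools are not found on PATH.
# missing_tools is a hypothetical helper; tool names are assumptions.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools nextflow gatk multiqc)
if [ -n "$missing" ]; then
  echo "Missing from PATH:" $missing
else
  echo "All dependencies found"
fi
```

When the pipeline is run with `-with-docker`, only Nextflow and Docker are needed on the host, so this check matters mainly for runs without the container.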

## Input
| Type | Description |
|-----------|---------------|
|--input_folder | a folder with bam files |


## Parameters

* #### Mandatory
| Name | Example value | Description |
|-----------|--------------:|-------------|
|--ref | ref.fa | reference genome fasta file for GATK |

* #### Optional

| Name | Default value | Description |
|-----------|--------------|-------------|
|--cpu | 2 | number of CPUs |
|--mem | 32 | memory for mapping|
|--output_folder | . | output folder for aligned BAMs|
|--snp_vcf | dbsnp.vcf | VCF file with known variants for GATK BQSR |
|--indel_vcf | Mills_100G_indels.vcf | VCF file with known indels for GATK BQSR |
|--multiqc_config | null | config yaml file for multiqc |

* #### Flags

| Name | Description |
|-----------|-------------|
|--help | print usage and optional parameters |

## Usage
To run the pipeline on a series of BAM files in folder *bam*, with an indexed reference genome at *ref.fa* and known SNPs and indels from the GATK bundle, type:
```bash
nextflow run iarcbioinfo/BQSR-nf --input_folder bam --ref ref.fa --snp_vcf GATK_bundle/dbsnp_146.hg38.vcf.gz --indel_vcf GATK_bundle/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
```
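
Before launching the command above, it can help to confirm that the input folder actually contains BAM files and that the reference exists. The following is a sketch under stated assumptions: `count_bams` and `preflight` are hypothetical helpers, not part of BQSR-nf, and they only look for `*.bam` files directly inside the folder.

```shell
# Sketch: basic input checks before calling nextflow.
# count_bams and preflight are hypothetical helpers.
count_bams() {
  # Print the number of *.bam files directly inside directory "$1".
  n=0
  for f in "$1"/*.bam; do
    if [ -e "$f" ]; then n=$((n + 1)); fi
  done
  echo "$n"
}

preflight() {
  # "$1" = input folder, "$2" = reference fasta
  [ -d "$1" ] || { echo "input folder not found: $1" >&2; return 1; }
  [ -f "$2" ] || { echo "reference not found: $2" >&2; return 1; }
  [ "$(count_bams "$1")" -gt 0 ] || { echo "no BAM files in $1" >&2; return 1; }
  echo "ok: $(count_bams "$1") BAM file(s), reference $2"
}

preflight bam ref.fa || echo "fix the inputs before running nextflow"
```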

## Output
| Type | Description |
|-----------|---------------|
| BAM/file.bam | BAM files of alignments or realignments |
| BAM/file.bam.bai | BAI files of alignments or realignments |
| QC/multiqc_BQSR_report.html | multiqc report |
| QC/multiqc_BQSR_report_data | folder with data used to compute multiqc report |
| QC/BAM/BQSR/file_recal.table | table of scores before recalibration |
| QC/BAM/BQSR/file_post_recal.table | table of scores after recalibration |
| QC/BAM/BQSR/file_recalibration_plots.pdf | before/after recalibration plots |

The output_folder directory contains two subfolders: BAM and QC.
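
The layout in the Output table can be checked after a run with a sketch like this one. It assumes that `file` in the table stands for the input BAM's basename; `expected_outputs` and `report_missing` are hypothetical helpers, and `BAM_bqsr` and `sample1` are placeholder arguments.

```shell
# Sketch: list the paths from the Output table for one sample
# and report any that do not exist. Helper names are hypothetical.
expected_outputs() {
  # "$1" = output folder, "$2" = sample name ("file" in the table)
  printf '%s\n' \
    "$1/BAM/${2}.bam" \
    "$1/BAM/${2}.bam.bai" \
    "$1/QC/multiqc_BQSR_report.html" \
    "$1/QC/multiqc_BQSR_report_data" \
    "$1/QC/BAM/BQSR/${2}_recal.table" \
    "$1/QC/BAM/BQSR/${2}_post_recal.table" \
    "$1/QC/BAM/BQSR/${2}_recalibration_plots.pdf"
}

report_missing() {
  expected_outputs "$1" "$2" | while IFS= read -r path; do
    [ -e "$path" ] || echo "missing: $path"
  done
}

report_missing BAM_bqsr sample1
```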

## Directed Acyclic Graph

[Pipeline DAG](http://htmlpreview.github.io/?https://github.com/IARCbioinfo/BQSR-nf/blob/dev/dag_BQSR.html)

## Contributions

| Name | Email | Description |
|-----------|---------------|-----------------|
| Nicolas Alcala* | [email protected] | Developer to contact for support |