1.1.0
tdayris committed Feb 21, 2024
1 parent a1a744a commit 5d96a76
Showing 19 changed files with 283 additions and 266 deletions.
36 changes: 28 additions & 8 deletions .test/config/config.yaml
@@ -5,7 +5,7 @@ params:
# Optional parameters for pyfaidx (filter/correct fasta format)
pyfaidx:
# Filter-out non canonical chromosomes
dna: ''
dna: '--regex "^[[0-9]+|X|Y|MT]"'
# Keep all cdna sequences
cdna: ""
# Optional parameters for agat (filter/correct GTF format)
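The new `dna` pattern is worth checking before reuse. The sketch below assumes the pattern is applied to each sequence name with Python's `re.search` (an assumption about how pyfaidx evaluates `--regex`, not something this commit confirms). It compares the committed pattern with a hypothetical stricter one, `STRICT`, which is not part of the commit. The sequence names are illustrative only.

```python
import re
import warnings

# Pattern exactly as written in the committed config value.
COMMITTED = r"^[[0-9]+|X|Y|MT]"
# Hypothetical stricter alternative (NOT part of this commit).
STRICT = r"^([0-9]+|X|Y|MT)$"

with warnings.catch_warnings():
    # Python warns about the leading "[[" ("possible nested set").
    warnings.simplefilter("ignore", FutureWarning)
    committed_re = re.compile(COMMITTED)
strict_re = re.compile(STRICT)

# Illustrative sequence names, not taken from the commit.
names = ["1", "22", "X", "Y", "MT", "KI270728.1", "HSCHR6_MHC_COX_CTG1"]


def kept(pattern, sequence_names):
    """Return the names for which the pattern matches anywhere."""
    return [name for name in sequence_names if pattern.search(name)]


committed_kept = kept(committed_re, names)
strict_kept = kept(strict_re, names)
print("committed:", committed_kept)
print("strict:   ", strict_kept)
```

Under these assumptions the committed pattern parses as four alternatives (`^[[0-9]+`, `X`, `Y`, `MT]`), so only the first is anchored and the last requires a literal `]`.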
@@ -24,14 +24,24 @@ params:
bedtools:
# Optional parameters for filtering non-canonical chromosomes over dbSNP
filter_non_canonical_chrom: ""
# Optional parameters for bedtools merge, when merging blacklisted regions
merge: ""
# Optional parameters for tabix index VCF
tabix: "-p vcf"
# Optional parameters for FastQ Screen
fastq_screen:
# Number of reads processed
subset: 100000
# Aligner used to index dna sequences
aligner: bowtie2
# Path to configuration file
fastq_screen_config: ""
# Optional parameters for fastp
fastp:
# Optional adapters to remove
adapters: ""
# Optional command line arguments for fastp
extra: ""
extra: "--verbose --overrepresentation_analysis"
# Optional parameters for fastqc
fastqc: ""
# Optional parameters for bowtie2
@@ -42,20 +52,30 @@ params:
align: ""
sambamba:
# Optional parameters for sambamba view
view: "--format 'bam' "
view: "--format 'bam' --filter 'mapping_quality >= 30 and not (unmapped or mate_is_unmapped)' "
# Optional parameters for sambamba markdup
markdup: " --overflow-list-size=500000"
markdup: "--remove-duplicates --overflow-list-size=500000"
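As a rough illustration of what the new `view` filter keeps, the sketch below re-expresses the predicate in plain Python. It assumes that `unmapped` and `mate_is_unmapped` correspond to SAM flag bits 0x4 and 0x8. The `Read` tuple and `passes_filter` are hypothetical names, not sambamba's API.

```python
from typing import NamedTuple

FLAG_UNMAPPED = 0x4       # SAM flag bit: the read itself is unmapped
FLAG_MATE_UNMAPPED = 0x8  # SAM flag bit: the mate is unmapped


class Read(NamedTuple):
    """Hypothetical minimal read record (illustration only)."""
    name: str
    flag: int
    mapping_quality: int


def passes_filter(read: Read, min_mapq: int = 30) -> bool:
    """Mirror: mapping_quality >= 30 and not (unmapped or mate_is_unmapped)."""
    unmapped = bool(read.flag & FLAG_UNMAPPED)
    mate_is_unmapped = bool(read.flag & FLAG_MATE_UNMAPPED)
    return read.mapping_quality >= min_mapq and not (unmapped or mate_is_unmapped)


reads = [
    Read("proper_pair", 0x63, 42),    # paired, properly aligned
    Read("low_mapq", 0x63, 12),       # aligned, but below the MAPQ threshold
    Read("mate_unmapped", 0x49, 60),  # mate-unmapped bit set
    Read("unmapped", 0x4D, 0),        # read and mate both unmapped
]
kept_names = [read.name for read in reads if passes_filter(read)]
print(kept_names)
```

Combined with `--remove-duplicates` in `markdup`, downstream steps would then see only confidently mapped, non-duplicate reads.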
picard:
# Mapping QC optional parameters
metrics: ""
# Optional parameters for picard create sequence dictionary
createsequencedictionary: ""
# Optional parameters for collect multiple metrics
collectmultiplemetrics: ""
# Optional parameters for samtools stats
samtools:
# Optional parameters for samtools fasta index
faidx: ""
# Optional parameters for samtools stats
stats: ""
# Optional parameters for wget
wget: "--verbose"
# Optional parameters for rsync
rsync: "--verbose --checksum --force --human-readable --progress"
# Optional parameters for pyroe
pyroe:
# Optional parameters for ID 2 name
idtoname: ""
# Optional parameters for multiqc
multiqc: "--module gatk --module bcftools --module picard --module fastqc --module fastp --module samtools --module bowtie2 --module sambamba --zip-data-dir --verbose --no-megaqc-upload --no-ansi --force"
# Optional parameters for GATK
@@ -69,7 +89,7 @@ params:
# Optional parameters for GATK learnreadorientationmodel
learnreadorientationmodel: ""
# Optional parameters for GATK filtermutectcalls
filtermutectcalls: "--create-output-variant-index "
filtermutectcalls: "--create-output-variant-index --min-median-mapping-quality 35 --max-alt-allele-count 3"
# Optional parameters for GATK varianteval
varianteval: ""
# Optional parameters for BCFTools
@@ -83,10 +103,10 @@ genomes: config/genomes.csv
# Internal use only, not described in documentation.
# deactivate import of fair_genome_indexer pipeline.
# Requires the file `genome.csv` to be filled.
# load_fair_genome_indexer: true
load_fair_genome_indexer_pipeline: true
# Deactivate the import of fair_fastqc_multiqc pipeline.
# Requires to redefine fastqc and multiqc rules.
# load_fair_fastqc_multiqc: true
load_fair_fastqc_multiqc_pipeline: true
# Deactivate the import of fair_bowtie2_mapping.
# Requires to redefine mapping steps.
# load_fair_bowtie2_mapping: true
load_fair_bowtie2_mapping_pipeline: true
3 changes: 2 additions & 1 deletion .test/config/samples.csv
@@ -1,2 +1,3 @@
sample_id,upstream_file,downstream_file,species,build,release
sac_a,data/reads/a.scerevisiae.1.fq,data/reads/a.scerevisiae.2.fq,saccharomyces_cerevisiae,R64-1-1,110
sac_a,data/reads/a.scerevisiae.1.fq,data/reads/a.scerevisiae.2.fq,saccharomyces_cerevisiae,R64-1-1,110
sac_b,data/reads/b.scerevisiae.1.fq,,saccharomyces_cerevisiae,R64-1-1,110
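The new `sac_b` row leaves `downstream_file` empty. A minimal sketch of how such a sheet could be split into paired-end and single-end samples follows. It assumes an empty `downstream_file` means single-end input, and `library_layout` is a hypothetical helper rather than a function of this workflow.

```python
import csv
import io

# Sample sheet content as it stands after this commit.
SAMPLES_CSV = """\
sample_id,upstream_file,downstream_file,species,build,release
sac_a,data/reads/a.scerevisiae.1.fq,data/reads/a.scerevisiae.2.fq,saccharomyces_cerevisiae,R64-1-1,110
sac_b,data/reads/b.scerevisiae.1.fq,,saccharomyces_cerevisiae,R64-1-1,110
"""


def library_layout(row):
    """Assumption: a blank downstream_file marks a single-end sample."""
    return "paired" if row["downstream_file"].strip() else "single"


rows = list(csv.DictReader(io.StringIO(SAMPLES_CSV)))
layouts = {row["sample_id"]: library_layout(row) for row in rows}
print(layouts)
```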
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,18 @@
# 1.1.0

## Features:

* temp, log, and benchmark paths rebuild
* rule names changed to follow between-workflows trace
* use of `lookup` instead of hand-made function
* Relies on fair_genome_indexer version 3.1.4
* Relies on fair_bowtie2_mapping version 3.1.0
* Relies on fair_fastqc_multiqc version 2.0.3

## Fixes:

* Documentation update
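Judging from the `bcftools.smk` changes in this commit, the rebuilt temp, log, and benchmark paths appear to follow a `<kind>/<workflow>/<rule>/...` layout. The sketch below shows that convention. `rule_path` is a hypothetical helper, not part of the workflow.

```python
WORKFLOW = "fair_gatk_mutect_germline"
# Wildcard block shared by the paths seen in this commit.
GENOME = "{species}.{build}.{release}.{datatype}"


def rule_path(kind, rule, suffix):
    """Build '<kind>/<workflow>/<rule>/<genome>/{sample}.<suffix>'.

    Wildcards are left unexpanded, as they would be in a rule definition.
    """
    return f"{kind}/{WORKFLOW}/{rule}/{GENOME}/{{sample}}.{suffix}"


log_path = rule_path("logs", "bcftools_mutect2_stats", "log")
benchmark_path = rule_path("benchmark", "bcftools_mutect2_stats", "tsv")
tmp_path = rule_path("tmp", "bcftools_mutect2_stats", "stats.txt")
print(log_path)
print(benchmark_path)
print(tmp_path)
```

Prefixing every path with the workflow and rule name keeps files traceable when several workflows are imported into one run.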

# 1.0.0

## Features:
4 changes: 2 additions & 2 deletions README.md
@@ -1,9 +1,9 @@
[![Snakemake](https://img.shields.io/badge/snakemake-≥7.29.0-brightgreen.svg)](https://snakemake.github.io)
[![GitHub actions status](https://github.com/tdayris/fair_gatk_mutect_germline/workflows/Tests/badge.svg?branch=main)](https://github.com/tdayris/fair_gatk_mutect_germline/actions?query=branch%3Amain+workflow%3ATests)
[![GitHub actions status](https://github.com/tdayris/fair_gatk_mutect_germline/workflows/Tests/badge.svg)](https://github.com/tdayris/fair_gatk_mutect_germline/actions?query=branch%3Amain+workflow%3ATests)

Do not use. Active dev.

Snakemake workflow used to call variants with GATK-Mutect2
Snakemake workflow used to call germline variants with GATK-Mutect2

## Usage

30 changes: 26 additions & 4 deletions config/config.yaml
@@ -24,14 +24,24 @@ params:
bedtools:
# Optional parameters for filtering non-canonical chromosomes over dbSNP
filter_non_canonical_chrom: ""
# Optional parameters for bedtools merge, when merging blacklisted regions
merge: ""
# Optional parameters for tabix index VCF
tabix: "-p vcf"
# Optional parameters for FastQ Screen
fastq_screen:
# Number of reads processed
subset: 100000
# Aligner used to index dna sequences
aligner: bowtie2
# Path to configuration file
fastq_screen_config: "/mnt/beegfs/database/bioinfo/Index_DB/Fastq_Screen/0.14.0/fastq_screen.conf"
# Optional parameters for fastp
fastp:
# Optional adapters to remove
adapters: ""
# Optional command line arguments for fastp
extra: ""
extra: "--verbose --overrepresentation_analysis"
# Optional parameters for fastqc
fastqc: ""
# Optional parameters for bowtie2
@@ -50,13 +60,25 @@ params:
metrics: ""
# Optional parameters for picard create sequence dictionary
createsequencedictionary: ""
# Optional parameters for collect multiple metrics
collectmultiplemetrics: ""
# Optional parameters for samtools stats
samtools:
# Optional parameters for samtools fasta index
faidx: ""
# Optional parameters for samtools stats
stats: ""
# Optional parameters for multiqc
multiqc: "--module picard --module fastqc --module fastp --module samtools --module bowtie2 --module sambamba --zip-data-dir --verbose --no-megaqc-upload --no-ansi --force"
# Optional parameters for wget
wget: "--verbose"
# Optional parameters for rsync
rsync: "--verbose --checksum --force --human-readable --progress"
# Optional parameters for pyroe
pyroe:
# Optional parameters for ID 2 name
idtoname: ""
# Optional parameters for multiqc
multiqc: "--module gatk --module bcftools --module picard --module fastqc --module fastp --module samtools --module bowtie2 --module sambamba --zip-data-dir --verbose --no-megaqc-upload --no-ansi --force"
# Optional parameters for GATK
gatk:
@@ -83,10 +105,10 @@ genomes: config/genomes.csv
# Internal use only, not described in documentation.
# deactivate import of fair_genome_indexer pipeline.
# Requires the file `genome.csv` to be filled.
# load_fair_genome_indexer: true
load_fair_genome_indexer_pipeline: true
# Deactivate the import of fair_fastqc_multiqc pipeline.
# Requires to redefine fastqc and multiqc rules.
# load_fair_fastqc_multiqc: true
load_fair_fastqc_multiqc_pipeline: true
# Deactivate the import of fair_bowtie2_mapping.
# Requires to redefine mapping steps.
# load_fair_bowtie2_mapping: true
load_fair_bowtie2_mapping_pipeline: true
6 changes: 3 additions & 3 deletions workflow/Snakefile
@@ -1,17 +1,17 @@
include: "rules/common.smk"


if config.get("load_fair_genome_indexer", True):
if lookup(dpath="load_fair_genome_indexer_pipeline", within=config):

include: "rules/fair_genome_indexer_pipeline.smk"


if config.get("load_fair_bowtie2_mapping", True):
if lookup(dpath="load_fair_bowtie2_mapping_pipeline", within=config):

include: "rules/fair_bowtie2_mapping_pipeline.smk"


if config.get("load_fair_fastqc_multiqc", True):
if lookup(dpath="load_fair_fastqc_multiqc_pipeline", within=config):

include: "rules/fair_fastqc_multiqc_pipeline.smk"
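The switch from `config.get(key, True)` to `lookup(dpath=..., within=config)` changes what happens when a key is absent. That is presumably why the `load_*_pipeline` keys are now set explicitly in both config files. The sketch below is a plain-Python stand-in for a slash-separated path lookup. `lookup_sketch` is hypothetical and only approximates Snakemake's `lookup`, assuming a missing key raises an error; other features of the real helper are not reproduced.

```python
def lookup_sketch(dpath, within):
    """Walk a slash-separated key path through nested dicts.

    Raises KeyError when any key along the path is missing
    (an assumption about strict lookup behaviour).
    """
    node = within
    for key in dpath.split("/"):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(dpath)
        node = node[key]
    return node


config = {
    "load_fair_genome_indexer_pipeline": True,
    "params": {"bcftools": {"stats": ""}},
}

# Old style: a missing key silently falls back to the default.
old_style = config.get("load_fair_bowtie2_mapping_pipeline", True)

# New style: present keys resolve, nested paths use "/".
flag = lookup_sketch("load_fair_genome_indexer_pipeline", config)
stats_extra = lookup_sketch("params/bcftools/stats", config)

# New style: a missing key is reported instead of defaulted.
try:
    lookup_sketch("load_fair_bowtie2_mapping_pipeline", config)
    missing_raised = False
except KeyError:
    missing_raised = True

print(old_style, flag, repr(stats_extra), missing_raised)
```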

14 changes: 14 additions & 0 deletions workflow/envs/workflow.yaml
@@ -0,0 +1,14 @@
channels:
- conda-forge
- bioconda
- nodefaults
dependencies:
- apptainer
- snakemake
- black
- snakefmt
- ipython
- pytest
- snakemake-executor-plugin-cluster-generic
- snakemake-executor-plugin-slurm
- mamba
22 changes: 11 additions & 11 deletions workflow/report/material_methods.rst
@@ -46,23 +46,23 @@ installation usage, and resutls can be found on the
.. [#multiqcpaper] Ewels, Philip, et al. "MultiQC: summarize analysis results for multiple tools and samples in a single report." Bioinformatics 32.19 (2016): 3047-3048.
.. [#snakemakepaper] Köster, Johannes, and Sven Rahmann. "Snakemake—a scalable bioinformatics workflow engine." Bioinformatics 28.19 (2012): 2520-2522.
.. _Sambamba: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/sambamba.html
.. _Bowtie2: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/bowtie2.html
.. _Fastp: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/fastp.html
.. _Picard: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/picard/collectmultiplemetrics.html
.. _MultiQC: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/multiqc.html
.. _Sambamba: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/sambamba.html
.. _Bowtie2: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/bowtie2.html
.. _Fastp: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/fastp.html
.. _Picard: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/picard/collectmultiplemetrics.html
.. _MultiQC: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/multiqc.html
.. _Snakemake: https://snakemake.readthedocs.io
.. _Github: https://github.com/tdayris/fair_bowtie2_mapping
.. _`Snakemake workflow`: https://snakemake.github.io/snakemake-workflow-catalog?usage=tdayris/fair_bowtie2_mapping
.. _Agat: https://agat.readthedocs.io/en/latest/index.html
.. _Samtools: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/samtools/faidx.html
.. _FastQC: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/fastqc.html
.. _Samtools: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/samtools/faidx.html
.. _FastQC: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/fastqc.html
.. _Pyfaidx: https://github.com/mdshw5/pyfaidx
.. _GATK: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/gatk.html
.. _GATK: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/gatk.html
.. _`GATK Best practices`: https://gatk.broadinstitute.org/hc/en-us/articles/360035894711-About-the-GATK-Best-Practices
.. _Mutect2: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/gatk/mutect.html
.. _VariantEval: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/gatk/varianteval.html
.. _BCFTools: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/bcftools/stats.html
.. _Mutect2: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/gatk/mutect.html
.. _VariantEval: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/gatk/varianteval.html
.. _BCFTools: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/bcftools/stats.html

:Authors:
Thibault Dayris
14 changes: 7 additions & 7 deletions workflow/report/multiqc.rst
@@ -3,10 +3,10 @@ Samtools_, FastQC_, Bowtie2_, and `GATK VariantEval`_. It is a stand-alone file,
and can be opened in your favorite web-browser.

.. _HTML: https://en.wikipedia.org/wiki/HTML
.. _Fastp: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/fastp.html
.. _Bowtie2: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/bowtie2/align.html
.. _FastQC: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/fastqc.html
.. _Stats: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/samtools/stats.html
.. _Picard: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/picard/collectmultiplemetrics.html
.. _Samtools: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/samtools/stats.html
.. _`GATK VariantEval`: https://snakemake-wrappers.readthedocs.io/en/v3.3.3/wrappers/gatk/varianteval.html
.. _Fastp: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/fastp.html
.. _Bowtie2: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/bowtie2/align.html
.. _FastQC: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/fastqc.html
.. _Stats: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/samtools/stats.html
.. _Picard: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/picard/collectmultiplemetrics.html
.. _Samtools: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/samtools/stats.html
.. _`GATK VariantEval`: https://snakemake-wrappers.readthedocs.io/en/v3.3.6/wrappers/gatk/varianteval.html
3 changes: 2 additions & 1 deletion workflow/report/usage.rst
@@ -7,7 +7,8 @@ and ignore the rest of this documentation.
::

# Activate conda environment
conda activate /mnt/beegfs/pipelines/unofficial-snakemake-wrappers/shared_install/bigr_epicure_pipeline/
# Use latest version available, eg:
conda activate /mnt/beegfs/pipelines/unofficial-snakemake-wrappers/shared_install/snakemake_v8.4.11

# Deploy workflow with the version of your choice
snakedeploy deploy-workflow \
16 changes: 8 additions & 8 deletions workflow/rules/bcftools.smk
@@ -1,21 +1,21 @@
rule bcftools_mutect2_stats:
rule fair_gatk_mutect_germline_bcftools_mutect2_stats:
input:
"results/{species}.{build}.{release}.{datatype}/VariantCalling/Raw/{sample}.vcf.gz",
"results/{species}.{build}.{release}.{datatype}/VariantCalling/Raw/{sample}.vcf.gz.tbi",
"results/{species}.{build}.{release}.{datatype}/VariantCalling/Germline/{sample}.vcf.gz",
"results/{species}.{build}.{release}.{datatype}/VariantCalling/Germline/{sample}.vcf.gz.tbi",
output:
temp(
"tmp/bcftools/stats/mutect2/germline/{species}.{build}.{release}.{datatype}/{sample}.stats.txt"
"tmp/fair_gatk_mutect_germline/bcftools_mutect2_stats/{species}.{build}.{release}.{datatype}/{sample}.stats.txt"
),
threads: 2
resources:
mem_mb=lambda wildcards, attempt: (1024 * 7) * attempt,
runtime=lambda wildcards, attempt: 30 * attempt,
tmpdir="tmp",
log:
"logs/bcftools/stats/{species}.{build}.{release}.{datatype}/{sample}.gatk.mutect2.germline.log",
"logs/fair_gatk_mutect_germline/bcftools_mutect2_stats/{species}.{build}.{release}.{datatype}/{sample}.log",
benchmark:
"benchmark/bcftools/stats/{species}.{build}.{release}.{datatype}/{sample}.gatk.mutect2.germline.tsv"
"benchmark/fair_gatk_mutect_germline/bcftools_mutect2_stats/{species}.{build}.{release}.{datatype}/{sample}.tsv"
params:
config.get("params", {}).get("bcftools", {}).get("stats", ""),
lookup(dpath="params/bcftools/stats", within=config),
wrapper:
"v3.3.3/bio/bcftools/stats"
"v3.3.6/bio/bcftools/stats"
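The `resources` callables in this rule scale memory and runtime with the retry count. A small sketch of the arithmetic, assuming Snakemake passes `attempt` starting at 1 and increments it on each retry (the `wildcards` argument is unused here, so `None` stands in for it):

```python
def mem_mb(wildcards, attempt):
    """Same expression as in the rule: 7 GiB (in MiB) per attempt."""
    return (1024 * 7) * attempt


def runtime(wildcards, attempt):
    """Same expression as in the rule: 30 minutes per attempt."""
    return 30 * attempt


# Requested resources for the first three attempts.
schedule = [(a, mem_mb(None, a), runtime(None, a)) for a in (1, 2, 3)]
for attempt, memory, minutes in schedule:
    print(f"attempt {attempt}: mem_mb={memory}, runtime={minutes}")
```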