Releases: genomic-medicine-sweden/Twist_Solid

Twist_Solid v0.14.0

12 Jun 09:03
6503369

Release notes v0.14.0

Panel of normals for CNV:

A new panel of normals (PoN) is now the default for cnvkit. It combines normal samples from nextseq550, nextseq2000 and novaseq6000 and can therefore be used for all sequencing machines. There are still some issues with small false positive (FP) deletions in some genes, such as BRCA1.

Coverage and mutations:

The coverage for all positions marked as region_all is now reported. Previously, only positions with coverage below 300 were reported.
Added a second coverage and mutations file for ENC.

UMI / Plasma fusion calling

Use GeneFuse for fusion calling in plasma
Use relaxed filtering for FuseqWES (min 15 supporting reads). We also save the unfiltered outfile in results under additional files.

Changes in config.yaml

filter_fuseq_wes_umi:
  min_support: 15
  filter_on_fusiondb: True
  
gene_fuse:
  container: "docker://hydragenetics/genefuse:0.6.1"
  
report_gene_fuse:
  min_unique_reads: 6

Changes in config.data.hg19.yaml

cnvkit_batch:
  normal_reference: "{{PROJECT_PON_DATA}}/GMS560/PoN/cnvkit_PoN_combined_hg19.cnn"

filter_fuseq_wes_umi:
  gene_fusion_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/false_positive_fusion_pairs.txt"
  gene_white_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_gene_white_list.txt"
  transcript_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_transcript_black_list.txt"
  gtf: "{{PROJECT_REF_DATA}}/ref_data/refGene/hg19.refGene.gtf"

hotspot_report:
  chr_translation_file: "config/reports/hotspot_report.chr.translation.hg19"
  hotspot_mutations: 
    all: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/Hotspots_combined_regions_nodups.231011.csv"
    ENC: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/ENC_hotspots_240604.csv"

Changes in config.data.hg38.yaml

cnvkit_batch:
  normal_reference: "{{PROJECT_PON_DATA}}/GMS560/PoN/cnvkit_PoN_combined_hg38.cnn"

filter_fuseq_wes_umi:
  gene_fusion_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/false_positive_fusion_pairs.txt"
  gene_white_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_gene_white_list.txt"
  transcript_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_transcript_black_list.txt"
  gtf: "{{PROJECT_REF_DATA}}/ref_data/refGene/hg38.refGene.gtf"
  
gene_fuse:
  genes: "/projects/wp1/nobackup/ngs/utveckling/Twist_DNA_DATA/gene_fuse/GMS560_fusion_w_pool2.hg38.csv"
  fasta: "/data/ref_genomes/GRCh38_p14/homo_sapiens.fasta"
  
hotspot_report:
  chr_translation_file: "config/reports/hotspot_report.chr.translation.hg38"
  hotspot_mutations: 
    all: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/Hotspots_combined_regions_nodups.231011_hg38.csv"
    ENC: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/ENC_hotspots_240604_hg38.csv"

Changes in resources

None

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.5.0 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v1.0.1 (No change)
  • filtering: v0.3.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.4.0 (No change)
  • fusions: v0.1.0 (No change)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.4.1 (No change)
  • misc: v0.2.0 (No change)

Snakemake min version

7.18.0 (no change)

Hydra genetics version

3.0.0 (Updated)

Hydra genetics common singularity version

3.0.0 (Updated)

Features

  • add hotspot file for ENC (2de88ed)
  • added blacklist filtering of small cnv deletions (f6e77bc)
  • added ENC hotspot files for hg38 (0e06818)
  • added separate filtering of fuseq wes for umi (8b83894)
  • new PoN on figshare (41f3f0b)
  • new PoN on figshare (f702d7c)
  • new PoN on figshare hg38 (641e37d)
  • reinstated genefuse for umi only (a5af752)
  • update common container to new hydra (6f64d68)
  • update hydra version (651c38f)
  • use latestest version (365d3d2)
  • use new PoN (e5bbd55)
  • use new PoN hg38 (1151f3b)

Bug Fixes

Documentation

Twist_Solid v0.13.0

29 Apr 10:57
7634269

Release notes v0.13.0

CNV html report

Large CNVs (> 30% of a chromosomal arm) are now reported in a new table in the html report.
There is a new warning in the new table when there are very few segments on the baseline, suggesting that the baseline is shifted.
There is a new warning of polyploidy when there are large regions with BAF signals that do not match the copy numbers.
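
As a rough illustration of the chromosome arm rule, the sketch below flags an arm when deleted or duplicated segments cover more than 30% of it. It reuses chr_arm_fraction, del_chr_arm_cn_limit and amp_chr_arm_cn_limit from the cnv_tsv_report settings in this release. The segment layout, the function name classify_arm and the strict comparisons are assumptions made for illustration, and this is not the pipeline's actual implementation.

```python
from typing import List, Tuple

# Values taken from cnv_tsv_report in config.yaml for this release.
CHR_ARM_FRACTION = 0.3
DEL_CHR_ARM_CN_LIMIT = 1.4
AMP_CHR_ARM_CN_LIMIT = 2.5

# Hypothetical segment layout: (start, end, copy_number).
Segment = Tuple[int, int, float]


def classify_arm(segments: List[Segment], arm_size: int) -> List[str]:
    """Return the large aberration types found on one chromosome arm."""
    # Sum the bases covered by segments below the deletion limit.
    del_bases = sum(end - start for start, end, cn in segments if cn < DEL_CHR_ARM_CN_LIMIT)
    # Sum the bases covered by segments above the duplication limit.
    amp_bases = sum(end - start for start, end, cn in segments if cn > AMP_CHR_ARM_CN_LIMIT)
    calls = []
    if del_bases / arm_size > CHR_ARM_FRACTION:
        calls.append("deletion")
    if amp_bases / arm_size > CHR_ARM_FRACTION:
        calls.append("duplication")
    return calls


# 40% of a 100 Mb arm sits at copy number 1.2, so the arm is flagged as deleted.
print(classify_arm([(0, 40_000_000, 1.2), (40_000_000, 100_000_000, 2.0)], 100_000_000))
```

Copy neutral loss of heterozygosity and the baseline and polyploidy warnings would need BAF data and are left out of this sketch.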

Two new genes reported for CNV amplifications.

The MDM2 gene and the C19MC miRNA cluster are now reported when amplifications are found in them.

Changes in config.yaml

cnv_html_report:
  show_table: true
  show_cytobands: true
  extra_tables:
    - name: Small CNVs and 1p19q
      description: >
        Additional small amplifications and deletions as well as 1p19q co-deletions called by Twist Solid
        in-house scripts. Can have overlaps with called regions from other callers.
      path: "cnv_sv/svdb_query/{sample}_{type}.{tc_method}.cnv_loh_genes_all.cnv_additional_variants_only.tsv"
    - name: Large chromosomal aberrations
      description: >
        Large chromosomal aberrations in the form of deletions, duplications and copy neutral loss of heterozygosity.
        Also warnings of baseline skewness and detection of polyploidy in the sample.
      path: "cnv_sv/svdb_query/{sample}_{type}.{tc_method}.cnv_loh_genes_all.cnv_chromosome_arms.tsv"

cnv_tsv_report:
  amp_cn_limit: 6.0
  baseline_fraction_limit: 0.2
  del_1p19q_cn_limit: 1.4
  del_1p19q_chr_arm_fraction: 0.3
  chr_arm_fraction: 0.3
  del_chr_arm_cn_limit: 1.4
  amp_chr_arm_cn_limit: 2.5
  normal_baf_lower_limit: 0.4
  normal_baf_upper_limit: 0.6
  normal_cn_lower_limit: 1.7
  normal_cn_upper_limit: 2.25
  polyploidy_fraction_limit: 0.2

svdb_merge:
  container: "docker://hydragenetics/svdb:2.6.0"
  tc_method:
    - name: pathology_purecn
      cnv_caller:
        - cnvkit
        - gatk
    - name: purecn
      cnv_caller:
        - cnvkit
        - gatk
    - name: pathology
      cnv_caller:
        - cnvkit
        - gatk
  overlap: 1 #Just merge the two vcf-files without merging regions
  extra: "--pass_only" #Just merge the two vcf-files without merging regions

Changes in config.data.hg19.yaml

annotate_cnv:
  cnv_amp_genes: "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_amp_genes_240307.bed"
  
call_small_cnv_amplifications:
  regions_file: "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_amplification_genes_240307.tsv"
  
cnv_tsv_report:
  chrom_arm_size: "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/chromosome_arm_size.tsv"

merge_cnv_json:
  annotations:
    - "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_amp_genes_240307.bed"
    - "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_loh_genes.bed"

Changes in config.data.hg38.yaml

Several file paths updated

Changes in resources

None

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.5.0 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v1.0.1 (No change)
  • filtering: v0.3.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.4.0 (No change)
  • fusions: v0.1.0 (No change)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.4.1 (No change)
  • misc: v0.2.0 (No change)

Snakemake min version

7.18.0 (no change)

Hydra genetics version

1.14.0 (Updated)

Features

  • added calculation of cnv in chrom arms (667d53f)
  • added chromosome arm support to rule (ccbe735)
  • added configuration of polyploidy and baseline limits (8f91b42)
  • added more config variables (9a92bf2)
  • added table to cnv_html_report (e577cdd)
  • added warnings for baseline and polyploidy (9a25cdc)
  • decreased chr arm fraction (b1ce417)
  • do not merge regions that are not identical (1d9a027)
  • new amp genes for hg38 (c6f7b5e)
  • two new amplification genes reported (c8264b2)
  • update ref files for hg38 (0ad022c)
  • update test file for develop (36bab94)
  • updated with new reference files (5e057e7)

Bug Fixes

  • add "_" to regexp so that indels are reported correctly (7549036)
  • add implied msi configuration from module to config (a9dea4e)
  • bugfix (e681fee)
  • correct missing part of vep ref file path (36b3b0a)
  • handle CNVs bridging the chromosome arms (8a508ad)
  • remove unused library that break requirements (86afc7d)
  • rm duplicate entries of cnv in report (3a2cae4)
  • spelling error (46aef96)
  • update config, merge_cnv_json, to use latest amp gene bed file (be98e22)
  • update hg38 ref files, point to correct release (e8efb8c)
  • variable baf bugfix (d460172)

Documentation

  • added config schemas for new params (62c832d)
  • added schema (b1c9bfe)

Twist_Solid v0.12.0

07 Mar 07:34
7c3a85d

Release notes v0.12.0

Version logging

Versions of the pipeline, config and software used in the analysis are now logged and put into a versions folder directly under results.
The software versions are obtained by using the inspect function of the singularity containers, which reports the versions defined by LABEL in the Dockerfile.
The pipeline version is obtained from the GitHub tag/branch.
The combined configfile used in the pipeline is written to one file with added information about software versions for each rule.
The versions info is also gathered and added to MultiQC.

DNA fusions

FuseqWES is now validated with high precision using the new stricter filtering parameters (50 supporting reads). Care should be taken with false positive fusions in amplified genes.
GeneFuse was removed as it could not find novel fusions and took a long time to run.

Sample mixup test

ID-SNPs are now also called for DNA (previously only for RNA) using bcftools. For RNA and DNA samples analysed in the same analysis, the ID-SNPs are compared and the best hit between RNA and DNA is reported in a mixup report. If two samples are more than 80% similar, they are reported as a match. Unrelated samples are usually around 50% similar, while matched samples are typically between 90 and 95% similar.
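
The comparison can be pictured with the sketch below, which scores the fraction of shared ID-SNP positions with identical genotypes and applies the 80% match limit. The genotype dictionaries, the sample and SNP names, and the functions genotype_similarity and best_hit are hypothetical. The real pipeline works on bcftools calls and may define similarity differently.

```python
from typing import Dict, Tuple

MATCH_LIMIT = 0.8  # samples more than 80% similar are reported as a match

# Hypothetical layout: SNP id -> genotype string, for example "0/1".
Genotypes = Dict[str, str]


def genotype_similarity(dna: Genotypes, rna: Genotypes) -> float:
    """Fraction of shared ID-SNP positions where both samples have the same genotype."""
    shared = [snp for snp in dna if snp in rna]
    if not shared:
        return 0.0
    identical = sum(1 for snp in shared if dna[snp] == rna[snp])
    return identical / len(shared)


def best_hit(rna: Genotypes, dna_samples: Dict[str, Genotypes]) -> Tuple[str, float, bool]:
    """Return the most similar DNA sample, its similarity and whether it counts as a match."""
    scores = {name: genotype_similarity(gt, rna) for name, gt in dna_samples.items()}
    name = max(scores, key=scores.get)
    return name, scores[name], scores[name] > MATCH_LIMIT


rna = {f"snp{i}": "0/1" for i in range(10)}
dna_a = dict(rna, snp0="1/1")  # 9 of 10 genotypes identical
dna_b = dict(rna, **{f"snp{i}": "0/0" for i in range(5)})  # 5 of 10 identical
print(best_hit(rna, {"dna_a": dna_a, "dna_b": dna_b}))
```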

CNV html report

An extra table has been added to the report, where 1p19q deletions as well as small deletion and small amplification candidates are reported.

UMI filter

Slightly updated umi filters. Also updated config for tmb_umi calculations. Note! TMB is not calibrated for umi yet and will report too high values if not filtered on 5% VAF.

HG38

Reference downloads, design files and configs updated for hg38.

Changes in config

Changed:

filter_fuseq_wes:
  min_support: 50
  filter_on_fusiondb: True

tmb_umi: # completely updated

Removed:

gene_fuse:
  container: "docker://hydragenetics/genefuse:0.6.1"

report_gene_fuse:
  min_unique_reads: 6

scarhrd:
  reference_name: "grch37" # moved to hg19 config

vep:
  extra: # moved to hg19 config

vep_wo_pick:
  extra: # moved to hg19 config

Changes in resources

None

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.5.0 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v1.0.1 (No change)
  • filtering: v0.3.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.4.0 (No change)
  • fusions: v0.1.0 (No change)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.4.1 (Now possible to add extra tables in the report)
  • misc: v0.2.0 (No change)

Snakemake min version

7.18.0 (no change)

Hydra genetics version

1.14.0 (Updated)

Features

  • adapted configs to hg38 (7c601ab)
  • added new rule for rna dna mixup check (164baa6)
  • cnv html report additional table only includes additional calls (e3971eb)
  • data config for HG38 (missing HRD, SVDB, purecn) (7376033)
  • increased read support needed for fuseq_wes (f669bc2)
  • print pipeline and software version and config to results (d643a39)
  • put mixup report directly under result as it is both rna and dna (c9cdbc5)
  • removed gene fuse from config (c1f25ba)
  • removed gene fuse from pipeline (83260f6)
  • update hydra-genetics version (c57d3f3)
  • update reports module to v0.4.1 (#398) (32e40e9)

Bug Fixes

  • avoid clash with earlier filter annotation (9d2ed1b)
  • bug in Snakefile (d8307f4)
  • bugfix (a154b11)
  • bugfix (75d21be)
  • bugfix and improvements (dc7f4b4)
  • changed sample mixup from txt to tsv (5b91e60)
  • correct paths for CNV report extra tables (#401) (37a461e)
  • extra table without tag to cnv html report (765ce51)
  • handle missing files better (a0be997)
  • make umi tmb similar to tmb (c525c0e)
  • min 1% AF (ef2bcaf)
  • missing vep config in integration (e8e8a57)
  • new smart_open release crash (03e0a0e)
  • pycodestyle and unittest update (79cf8f8)
  • required variable with corrected indentation (115e018)
  • resolve conflicts (6343bd2)
  • rm dangerous qual filter form umi (c9d4584)
  • rm dangerous qual filter form umi (d20f0cc)
  • some hg19 file leftovers fixed (5433b16)
  • VEP cache move to config.data (a36b570)

Documentation

  • added missing rule (a6712f6)
  • fix documentation and codestyle (7d793b6)
  • rm gene_fuse from schemas and updated documentation (a991a0d)
  • rule plugin version updated (2d63ac6)
  • updated rule documentation (43cfb5c)

Twist_Solid v0.11.0

26 Jan 13:33
aa2220e

Release notes v0.11.0

New result files folder structure

The results are now structured based on sample so that all results for one sample are found together like this:
result/dna/sample/analysis_type/result_files
result/rna/sample/analysis_type/result_files
Also, TMB, MSI, and HRD results files are now found in the biomarker folder.

Improved RNA fusion report

The RNA fusion report now puts calls with the same breakpoint from the same caller on the same line. It also sorts the report based on total read support, showing the highest supported fusions at the top.
The report now also shows the deduplicated coverage for the breakpoint (the highest coverage in the fusion pair). This can help to determine if a fusion with low support is a FP found due to high coverage (high expression).

Speed optimization of fusioncatcher

Removed unused option in fusioncatcher that was slowing down calculations (removed --visualization-sam)

CNV html report

The CNV report now optionally shows the copy number data points as error bars instead of data points making it easier to analyse the data, especially in high density regions such as exons.
The VAF-plots now show data down to 20 supporting reads (down from 50). This will increase the number of VAF-points slightly and will avoid having no data in deleted regions for samples with high tumor content.

UMI filter

Completely new UMI-filter based on tests run on the reference samples:
config/filters/config_hard_filter_umi_vep105.yaml
config/filters/config_soft_filter_umi_vep105.yaml

Changes in output files

See above

Changes in config

Changed:

default_container: "docker://hydragenetics/common:1.10.2"

cnv_html_report:  
  show_table: true
  show_cytobands: true
  
fusioncatcher:
  container: "docker://hydragenetics/fusioncatcher:1.33"
  extra: ""

Added:

filter_vcf:
  snv_soft_filter_umi: "config/filters/config_soft_filter_umi_vep105.yaml"
  
vep_wo_pick:
  mode: --offline --cache --refseq
  extra: " --assembly GRCh37 --check_existing --sift b --polyphen b --ccds --uniprot --hgvs --symbol --numbers --domains --regulatory --canonical --protein --biotype --uniprot --tsl --appris --gene_phenotype --af --af_1kg --af_gnomad --max_af --pubmed --variant_class "

Changes in resources

None

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.5.0 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v1.0.1 (No change)
  • filtering: v0.3.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.4.0 (No change)
  • fusions: v0.1.0 (First official release of module)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.3.1 (Updated CNV html report visualization)
  • misc: v0.2.0 (No change)

Snakemake min version

7.18.0 (no change)

Hydra genetics docker version

1.10.2 (new)

Features

  • add amino acid change to coverage_and_mutations (a66f501)
  • added dedup coverage (01046e6)
  • added umi vcf filters (dd1cbe2)
  • decrease germline support needed to improve cnv visualization of deletions (c6cd922)
  • decrease support needed for germline SNVs (efb2827)
  • restructured rna fusion report for improved interpretation (373021c)
  • update Jenkins main test (5b30a35)
  • update reports module to 0.3.1 (#389) (9346b03)
  • update to latest common container (eef00aa)
  • updated hydra-genetics version (b7eadaa)

Bug Fixes

  • added missing double mutations to report (ed1a526)
  • bugfix (e9f2511)
  • bugfix (2dbc810)
  • bugfixes (7fcd240)
  • changed sorting and bugfixes (355d99e)
  • corrected filepath for rna in develop (85abf47)
  • corrected input file for hotspot_report.smk (8ce6512)
  • corrected input variables for hotspot_report.py (076f5fd)
  • corrected input variables for hotspot_report.smk (6758fa4)
  • corrected results path for soft-filtered umi vcf (a8ed44c)
  • delete extra file (2803f67)
  • housekeeping dict (7105a88)
  • input file bugfix (bfa94a6)
  • pulp version (f37c08a)
  • remove extra header columns (6bb5d42)
  • remove extra sample folder in output (39fefd8)
  • remove newline after purity in cnvkit_call command (b820ab5)
  • remove unused time consuming step in fusioncatcher (635a40b)
  • removed spaces in jenkins test develop (05eb8ef)
  • reverse sorting and add chr to fusioncatcher (794f8f7)
  • update hydra-genetics to v1.10.1 (be2682d)
  • update jenkins start scripts (8cdaf1f)
  • update profile with mounted home (7665c8b)
  • update profile with mounted home (0b1be40)
  • update profile with mounted home (1947e3a)
  • update profile with mounted home (b3d56e2)
  • updated jenkins output for develop (887584b)
  • updated to bugfixed common container (1449abd)

Twist_Solid v0.10.0

01 Dec 13:12
f22977b

Main new features in v0.10.0

  • Scripts, configs and documentation for downloading all reference files used by the pipeline
  • Restructuring of the pipeline config files

Reference download

All files used by the pipeline can now be downloaded using the hydra-genetics command line tool.

Config restructuring

The config folder now has subfolders to avoid file cluttering.
The main config files have been split up into two configs, one for data files and one for software settings. The file paths in the data config are generalized so that only three variables need to be adjusted, especially if the reference files are downloaded with the reference download tool.
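
A minimal sketch of what adjusting these variables can look like is shown below. It assumes the placeholders are written as {{NAME}}, as in the data configs shown for later releases on this page (for example {{PROJECT_PON_DATA}}). The function fill_placeholders and the path /data/pon are hypothetical, and the documented setup procedure may well use a different mechanism.

```python
from typing import Dict


def fill_placeholders(text: str, values: Dict[str, str]) -> str:
    """Replace every {{NAME}} placeholder in a config text with its local path."""
    for name, value in values.items():
        text = text.replace("{{" + name + "}}", value)
    return text


# Line copied from config.data.hg19.yaml as shown for v0.14.0.
line = 'normal_reference: "{{PROJECT_PON_DATA}}/GMS560/PoN/cnvkit_PoN_combined_hg19.cnn"'

# "/data/pon" is a made-up local path used only for this example.
print(fill_placeholders(line, {"PROJECT_PON_DATA": "/data/pon"}))
```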

TERT

2 new TERT promoter positions have been granted hotspot status

Fuseq-WES

Artifact fusion pairs to filter out can now be listed in a text file specified in the config.

RNA fusion report

Filtering of false positive fusions and cut-off values are now configurable (for Star-fusion and FusionCatcher) using a text file specified in config. Increased the list of fusions that are filtered from previous release based on more analysed samples.

Features

  • 2 new TERT positions added to report (3617acb)
  • add blacklist for fuseq_wes filtering (6bc03e3)
  • make fusion filtering configurable and add more fp fusions (5ed5c97)
  • more conda updates (df2b77c)
  • report internal callers for fusioncatcher (d5fccad)
  • setup files for reference file validation and fetching using yaml files (7078c04)
  • singularity build script update (8a8b665)
  • update to latest files used by config (07e351c)
  • update to newer hotspot file and clean config (72ef4ff)
  • use v0.1.0 of fusion module to filter FP Fuseq_WES calls (3c30fb1)

Bug Fixes

  • add lambda to functions so that they are only valuated when used (0b85dda)
  • add missing extra option in star-fusion (7476a0d)
  • add missing type (35e5f6e)
  • change from svdb_query file to svdb_merge file (1ed29ad)
  • configs for reference profile (6262775)
  • correct path in config (dd63afa)
  • handle reference creation with unit file containing both T and N (61b7764)
  • increased star_fusions wall time (5ff15de)
  • lamda in correct position (128aa8f)
  • minor config updates (8296226)
  • new rna fusion filter file without duplicate entry and header (280fd4a)
  • pycodestyle (36eed32)
  • support header with fusions filtering (c8a8bb7)
  • updarw config files to match current setup (0e689db)
  • update configs and restructure (7f397de)

Documentation

  • update documentation (0fd09ea)
  • update reference page with instructions to download reference files (3abac1f)

Twist_Solid v0.9.0

02 Nov 08:56
c2ba7b1

Release notes v0.9.0

UMI

It is now possible to analyse the data using UMIs. UMI analysis for all or selected samples can be achieved by adding an extra column named deduplication in samples.tsv. All samples marked with umi in this column will be run using UMIs. Some results will additionally be created without using UMIs (bam, vcf, MSI, TMB, MultiQC).
UMI read consensus and filtering are performed by Fgbio.
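
To make the samples.tsv convention concrete, the sketch below lists the samples that have umi in the deduplication column. The column name deduplication and the value umi come from the text above. The sample and tumor_content columns, the example rows and the function umi_samples are assumptions for illustration only.

```python
import csv
import io
from typing import List


def umi_samples(samples_tsv: str) -> List[str]:
    """Return the names of samples marked with umi in the deduplication column."""
    reader = csv.DictReader(io.StringIO(samples_tsv), delimiter="\t")
    # A missing column or an empty cell is treated as "no UMI analysis".
    return [row["sample"] for row in reader if (row.get("deduplication") or "").strip() == "umi"]


# Hypothetical samples.tsv content: S1 and S3 are run with UMIs, S2 is not.
example = (
    "sample\ttumor_content\tdeduplication\n"
    "S1\t0.6\tumi\n"
    "S2\t0.4\t\n"
    "S3\t0.8\tumi\n"
)
print(umi_samples(example))
```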

Reference creation pipeline

Use the coming release v0.10.0 for this.

Changes in output files

Added UMI output files
Modified files as an adaptation to the reworked annotation modules
New reference output files need update, please use next version (v0.10.0)

Changes in config

Added config file for GRCh38
Reworked config_references.yaml
Reworked config_references_hg38.yaml

Added in hg19 config (VEP and UMI-changes):

vep:
 mode: --offline --cache --refseq

bwa_mem_realign_consensus_reads:
 container: "docker://hydragenetics/fgbio:2.1.0"

fgbio_call_and_filter_consensus_reads:
 container: "docker://hydragenetics/fgbio:2.1.0"
 max_base_error_rate: "0.2"
 min_reads_call: "1 0 0"
 min_reads_filter: "1 0 0"
 min_input_base_quality_call: 20
 min_input_base_quality_filter: 30

fgbio_copy_umi_from_read_name:
 container: "docker://hydragenetics/fgbio:2.1.0"

fgbio_group_reads_by_umi:
 container: "docker://hydragenetics/fgbio:2.1.0"
 umi_strategy: paired

filter_vcf:
 snv_hard_filter_umi: "config/config_hard_filter_umi.yaml"
 snv_soft_filter_umi: "config/config_soft_filter_umi.yaml"

multiqc:
 container: "docker://hydragenetics/multiqc:1.11"
 reports:
  DNA:
   deduplication: ["mark_duplicates"]
  DNA_umi:
   config: "config/multiqc_config_dna.yaml"
   included_unit_types: ["N", "T"]
   deduplication: ["umi"]
   qc_files:
     - "qc/fastqc/{sample}_{type}_{flowcell}_{lane}_{barcode}_fastq1_fastqc.zip"
     - "qc/fastqc/{sample}_{type}_{flowcell}_{lane}_{barcode}_fastq2_fastqc.zip"
     - "qc/picard_collect_alignment_summary_metrics/{sample}_{type}.alignment_summary_metrics.txt"
     - "qc/picard_collect_duplication_metrics/{sample}_{type}.duplication_metrics.txt"
     - "qc/picard_collect_hs_metrics/{sample}_{type}.HsMetrics.txt"
     - "qc/picard_collect_insert_size_metrics/{sample}_{type}.insert_size_metrics.txt"
     - "qc/samtools_stats/{sample}_{type}.samtools-stats.txt"
     - "qc/gatk_calculate_contamination/{sample}_{type}.contamination.table"
     - "alignment/fgbio_group_reads_by_umi/{sample}_{type}.umi.histo.tsv"

samtools_merge_bam_umi:
 extra: "-c -p"

tmb_umi:
 af_lower_limit: 0.000
 af_upper_limit: 0.994
 af_germline_lower_limit: 0.3
 af_germline_upper_limit: 0.7
 artifacts: ""
 background_panel: ""
 db1000g_limit: 0.0001
 dp_limit: 300
 filter_genes: "/projects/wp1/nobackup/ngs/utveckling/Twist_DNA_DATA/tmb_filter_genes.txt"
 gnomad_limit: 0.0001
 vd_limit: 0
 nr_avg_germline_snvs: 0.0
 nssnv_tmb_correction: 0.84

vardict:
 allele_frequency_threshold_umi: "0.001"

Changes in resources (UMI + reference creation)

New reference resource file

Added UMI to ordinary resources file:
bwa_mem_realign_consensus_reads:
 mem_mb: 61440
 mem_per_cpu: 6144
 threads: 10
 time: "8:00:00"

fgbio_call_and_filter_consensus_reads:
 threads: 3
 mem_mb: 18432
 mem_per_cpu: 6144
 time: "8:00:00"

fgbio_group_reads_by_umi:
 time: "8:00:00"

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.5.0 (Adding support for UMIs using Fgbio)
  • snv_indels: v0.3.0 (No change)
  • annotation: v1.0.1 (Annotation order not fixed in module. Update pipeline to reflect this)
  • filtering: v0.3.0 (Additional filtering features not affecting the pipeline)
  • qc: v0.3.0 (No change)
  • biomarker: v0.4.0 (No change)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.2.0 (No change)

Snakemake min version

7.18.0

Features

  • 2 new TERT promoter variants (ee72af1)
  • add umi support for cnvkit (979c816)
  • add umi support for fuseq wes (a4d7139)
  • add umi support for fuseq wes (522ac37)
  • add umi support for gatkcnv (a2ce975)
  • add umi support for hrd and msi (20560a2)
  • add umi support for manta (8a703d9)
  • add umi support for qc (89f673b)
  • add umi support for tmb (9683d6b)
  • added tmb gene filter. TMB uses hard filtered file for input (9e242a3)
  • added umi choice to the pipeline (8e13b38)
  • changes hard filtering to let more variants through (70606f3)
  • hard filter for qci (9ae0a2f)
  • min vaf configurable in vardict (303b5f7)
  • rm need for ruleorder for copy rules using global wildcard constraints (ad15f6c)
  • run msi w and wo umi (6766b18)
  • umi in rna (6ccba4c)
  • umi vcf filtering based on sample.tsv (4fdd999)
  • update alignment module tag (ed33d45)
  • update to v1.0.0 relaese of annotation (3fd0c7b)
  • Update workflow/Snakefile with new alignment tag (e74398e)
  • updated alignment module tag (62500dd)
  • updated alignment module tag (341a3ba)
  • updated module versions and adapted to these new versions (c4aa705)

Bug Fixes

  • adapt to new umi alignment module (60a90b3)
  • adopt to breaking change in vep rule (96b4b26)
  • bugfix in get_vardict_min_af (cd503b7)
  • bugfix in get_vardict_min_af (44d4b59)
  • corrected gatk_mutect2 input files (442d62c)
  • fix manta output files and rm unneaded umi rules in Snakefile (5ec7a33)
  • fixed units.tsv and adaption of reference pipeline to new annotation module (9d0c926)
  • gvcfs now have mosdepth umi coverage (cf3e9c6)
  • manta to use original name, filtering wo <=, msi-sensor w correct output name (0a9960b)
  • match output result from reference module (e2415d5)
  • mosdepth should be run on entire bam file ([8643bb2](https://www.github.com/genomic-medicine-sweden/Twist...

Twist_Solid v0.8.0

22 Sep 12:35
0cff014

Release notes

Bugfixes

Missing bam index files are now retained in results when running the analysis without --notemp
Missing input file for cnv-html-report added so that the report works when running the analysis without --notemp

Features

TMB

When running on novaseq and nextseq2000, artifacts in CDC27 and MUC6 are commonly not filtered out, inflating the TMB numbers. A new filtering option has therefore been added where these two genes are completely filtered out. This does not affect the correlation or slope, which were calculated using samples run on nextseq550.
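
The effect of the gene filter can be sketched as below: variants in the two noisy genes are dropped before counting. The variant dictionaries with a gene key and the function filter_tmb_variants are hypothetical. In the pipeline the gene list is read from the file given by filter_genes in the config.

```python
from typing import Dict, Iterable, List

# The two genes named above; the pipeline reads its list from a file instead.
FILTER_GENES = {"CDC27", "MUC6"}


def filter_tmb_variants(variants: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop all variants located in genes that are excluded from TMB counting."""
    return [variant for variant in variants if variant["gene"] not in FILTER_GENES]


# Made-up variants for illustration.
variants = [
    {"gene": "TP53", "change": "c.1A>G"},
    {"gene": "CDC27", "change": "c.2T>C"},
    {"gene": "MUC6", "change": "c.3G>A"},
]
print([variant["gene"] for variant in filter_tmb_variants(variants)])
```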

Arriba

Arriba's performance is, in our data, better on 100bp than on 150bp reads. Therefore the reads are trimmed with fastp to 100bp before calling fusions with Arriba. All other fusion callers are unaffected.

Changes in config

Added:

TMB:
  filter_genes: "{path}/tmb_filter_genes.txt"

Changes in resources

fastp_pe_arriba:

  • threads: 5
  • mem_mb: 30720
  • mem_per_cpu: 6144

Documentation (Read the Docs)

Updated links to hydra modules with documentation.
Updated files to use for reference creation.

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.3.1 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v0.3.0 (No change)
  • filtering: v0.1.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.4.0 (Gene filtering)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.2.0 (Bugfix: missing input file, and some improvements)

Features

  • tmb two noisy genes filtered (45397d5)
  • trim reads to 100bp for improved results in Arriba (aa06413)
  • update biomarker module version (d9f0059)
  • update reports module to v0.2.0 (3f9be85)

Bug Fixes

  • added modified input function to cnv_html_report to fix notemp bug (b535c42)
  • keep bai files (753b789)
  • merge two rule definitions into one (e294971)
  • update snakemake version to avoid checkpoint restart job bug (67f1bf6)

Documentation

hydra-genetics v0.7.0

18 Aug 06:37
103ea53

Release notes

For more details on features and bug fixes see further down.

Features

CNV Filtering

Filtering using frequency in sample database removed true whole chromosome deletions. Therefore the frequency filtering is now only done on segments that are smaller than 10Mb.
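
The size rule can be sketched as follows: only segments shorter than 10 Mb are subject to the database frequency filter, so whole chromosome events are always kept. The function keep_cnv and the frequency cutoff passed to it are hypothetical, since the notes do not state the cutoff the pipeline uses.

```python
MAX_FILTERED_SIZE = 10_000_000  # frequency filtering applies only below 10 Mb


def keep_cnv(length: int, db_frequency: float, max_frequency: float) -> bool:
    """Decide whether a CNV segment survives the database frequency filter."""
    if length >= MAX_FILTERED_SIZE:
        # Large segments are never removed based on database frequency.
        return True
    return db_frequency <= max_frequency


# max_frequency=0.15 is a made-up cutoff used only for this example.
print(keep_cnv(150_000_000, db_frequency=0.5, max_frequency=0.15))
print(keep_cnv(2_000_000, db_frequency=0.5, max_frequency=0.15))
```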

CNV Reporting

The CNV html report has been moved out of the pipeline and into a new reporting module in Hydra-genetics. The report itself is not changed except for minor improvements.

CNV PureCN

PureCN now produces reliable results. However, it does not handle samples with low tumor content (TC) or samples with few copy number aberrations. Both will in general get an underestimated TC in the range of 15-32%. Therefore a new combined CNV html and tsv report is now generated that uses the PureCN TC if it is above 35% and the pathology estimated TC otherwise. These files have the tag pathology_purecn in the results. The results for pathology only and PureCN only are placed in additional files under results.
PureCN uses a new filter (snv_hard_filter_purecn) to get its vcf input file.
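
The selection rule for the combined report can be written as the small sketch below. The 35% limit comes from the text above. The function name choose_tumor_content and the handling of a missing PureCN estimate are assumptions.

```python
from typing import Optional

PURECN_TC_LIMIT = 0.35  # use the PureCN estimate only above 35%


def choose_tumor_content(purecn_tc: Optional[float], pathology_tc: float) -> float:
    """Pick the tumor content used for the pathology_purecn results."""
    # Assumption: a missing PureCN estimate falls back to the pathology value.
    if purecn_tc is not None and purecn_tc > PURECN_TC_LIMIT:
        return purecn_tc
    return pathology_tc


print(choose_tumor_content(0.62, pathology_tc=0.5))
print(choose_tumor_content(0.22, pathology_tc=0.5))
```

Low PureCN estimates are discarded because, as noted above, low TC samples and samples with few aberrations tend to end up in the 15-32% range regardless of their true tumor content.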

RNA MultiQC

The bam-files produced by the Star aligner are now duplicate marked by picard. This is only used for QC and the duplication rate is reported in the RNA MultiQC report.

Changes in config.yaml

Changed:

  • output: "config/output_list.json" => output: "config/output_files.yaml" #New output file format
  • cnv_html_report:
    show_table: true
    template_dir: config/cnv_report_template
  • design_intervals_rna: "/projects/wp1/nobackup/ngs/utveckling/Twist_RNA_DATA/bed/Twist_RNA_Design5.annotated.20230630.interval_list" #New file for duplication QC
  • report_fusions: #Corrected spelling of fusioncatcher, only changes shown
    fusioncatcher_flag_low_support: 15
    fusioncatcher_low_support: 3
    fusioncatcher_low_support_fp_genes: 20
    fusioncatcher_low_support_inframe: 6

Added:

  • merge_cnv_json:
    annotations:
    - /references/cnv_amp_genes.bed
    - /references/cnv_loh_genes.bed
    filtered_cnv_vcfs:
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_amp_genes.filter.cnv_hard_filter_amp.vcf.gz
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_loh_genes_all.filter.cnv_hard_filter_loh.vcf.gz
    unfiltered_cnv_vcfs:
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_amp_genes.vcf.gz
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_loh_genes_all.vcf.gz
    germline_vcf: snv_indels/bcbio_variation_recall_ensemble/{sample}_{type}.ensembled.vep_annotated.filter.germline.exclude.blacklist.vcf.gz
  • filter_vcf:
    snv_hard_filter_purecn: "config/config_hard_filter_purecn.yaml" #New filter for purecn
  • svdb_merge:
    tc_method:
      - name: pathology_purecn #new combined tag
        cnv_caller:
          - cnvkit
          - gatk

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.3.1 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v0.3.0 (No change)
  • filtering: v0.1.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.3.1 (No change)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.1.0 (New module, CNV html report moved here)

Features

  • add result file for combined purecn and pathology (fb17cc2)
  • added duplication % to multiQC (2923ebf)
  • added picard mark duplicates of bam-files for QC (0f33656)
  • added read group function for STAR (c224c17)
  • added RG to STAR and changed bam file for QC (1f68585)
  • added rule for modifying MBQ in vcf (72dc309)
  • added rule for modifying MBQ in vcf (6c26626)
  • added rule for modifying MBQ in vcf (188f418)
  • change pureCN cutoff to 0.35 (4e06643)
  • choose purecn if tc > 30% and pathology otherwise (20292ab)
  • harder filtering (7cae319)
  • make two tsv reports using different gene lists (0737c60)
  • test_input_all.tsv for v0.7.0 (27df350)
  • test_input_VAL2022.tsv for v0.7.0 (26f5157)
  • use filtered vcf with both germline and somatic variants (f6c5cc3)
  • use gatk2 for purecn (ee2e2bc)
  • use germline vcf for purecn (eb0eaf5)
  • use purity file directly from purecn to also get ploidity (b397a3c)
  • use vaf and snv filtered vcf with both germline and somatic variants (48f3563)
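
The combined tumor content (TC) choice described in the list above ("choose purecn if tc > 30% and pathology otherwise") can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline's actual get_tc function: the name `choose_tc`, its signature, and the treatment of a missing PureCN estimate as `None` are all hypothetical. The cutoff is a parameter because the list mentions both 30% and a later change to 0.35.

```python
from typing import Optional, Tuple

# Assumed default, taken from "tc > 30%" in the feature list; a later commit
# mentions 0.35, so the value is left configurable.
DEFAULT_PURECN_CUTOFF = 0.30


def choose_tc(
    purecn_tc: Optional[float],
    pathology_tc: float,
    cutoff: float = DEFAULT_PURECN_CUTOFF,
) -> Tuple[float, str]:
    """Return (tumor content, source) for the combined pathology_purecn tag.

    purecn_tc is None when no PureCN estimate is available (for example an
    empty PureCN result file); in that case the pathology estimate is used.
    """
    if purecn_tc is not None and purecn_tc > cutoff:
        return purecn_tc, "purecn"
    return pathology_tc, "pathology"


print(choose_tc(0.55, 0.40))               # PureCN estimate above the cutoff
print(choose_tc(0.20, 0.40))               # PureCN estimate too low
print(choose_tc(None, 0.40))               # no PureCN estimate
print(choose_tc(0.32, 0.40, cutoff=0.35))  # stricter cutoff
```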

Bug Fixes

  • add germline flag to vcf (1e8de1b)
  • add missing filter tag (7c01e2d)
  • annotate using missing sites instead (de172b0)
  • bug fixes (224a6a4)
  • change checkpoint to rule (5e04c9a)
  • change path to new normals (7a3b420)
  • correct header in cnv report file (c947152)
  • correct output name for purecn reference (92af141)
  • correct rule import from wrong module (275a60d)
  • delegate schema validation to reports module (b5dada0)
  • do not filter large cnvs based on frequency in database (4b86637)
  • get correct tc to html report (c394cba)
  • handle empty purecn file (df89fe5)
  • import spelling mistake (f3e7248)
  • moved result file to additional files (c6c7574)
  • properly overrule the get_tc function (701ceb8)
  • purecn_modify_vcf bugfix (e910e18)
  • redefine rule to use new params in config (5ef5dac)
  • return correct tc (45485c7)
  • solve different wildcards in rule error (62082a1)
  • spelling error of Exception (8d6fbf9)
  • tabix of annotation database (1f70635)
  • use correct genome (66e3c8b)
  • use correct get_tc (...

hydra-genetics v0.6.1

28 Apr 15:10
4cef046

Release notes

For more details on features and bug fixes see further down.

Features

Documentation

  • First release of the documentation on Read the Docs (https://twist-solid.readthedocs.io/en/v0.6.1/)

CNV

  • New small CNV amplification caller (in-house script) for the 20 relevant amplification genes. It works very similarly to the small deletion caller.
  • The CNV.tsv report now also includes amplifications called by the small amplification caller

Bugfixes

  • New lines added to variants in the TMB report

Changes in config.yaml

  • Small amplification caller: new in config
  • CNV tsv report: added option for small amplification filtering

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.3.1 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v0.3.0 (No change)
  • filtering: v0.1.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.3.1 (fixed TMB report bug)
  • cnv_sv: v0.3.1 (No change)

Features

  • added caller for small amplifications (61181d5)
  • update fusions and biomarker tags (9dfadef)

hydra-genetics v0.5.0

21 Apr 08:12
b2bbf59

Release notes

For more details on features and bug fixes see further down.

Features

CNV

  • The CNV.html report now reports TC content
  • The CNV.html report now reports VAF values in the variant table
  • The CNV.tsv report now also includes deletions called by the small deletions caller

TMB

  • Improved TMB calculations. They find more true variants and correlate better with TSO500. They no longer use any panel of normals, which makes the calculations less dependent on the sequencing platform.

DNA fusions

  • Added DNA fusion calling using Fuseq-WES with superior results compared to GeneFuse
  • GeneFuse: Added filtering of the ERG gene

RNA exon skipping

  • Only report MET exon 14 skipping and EGFRvIII and not other potential skipping events in these genes
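
The restriction above can be pictured as a whitelist of exon junctions. The sketch below is illustrative only and does not reflect the pipeline's actual data format: the `SkipEvent` type and `filter_events` helper are hypothetical. It relies on MET exon 14 skipping joining exon 13 to exon 15, and EGFRvIII (deletion of exons 2 to 7) joining exon 1 to exon 8.

```python
from typing import List, NamedTuple


class SkipEvent(NamedTuple):
    """Hypothetical record for one exon skipping call."""
    gene: str
    exon_from: int  # last exon before the skipped region
    exon_to: int    # first exon after the skipped region


# Junctions that are reported: MET exon 14 skipping and EGFRvIII.
REPORTED_JUNCTIONS = {("MET", 13, 15), ("EGFR", 1, 8)}


def filter_events(events: List[SkipEvent]) -> List[SkipEvent]:
    """Keep only the whitelisted events, dropping other skips in the same genes."""
    return [e for e in events if (e.gene, e.exon_from, e.exon_to) in REPORTED_JUNCTIONS]


calls = [
    SkipEvent("MET", 13, 15),   # MET exon 14 skipping: kept
    SkipEvent("MET", 10, 12),   # other MET skipping event: dropped
    SkipEvent("EGFR", 1, 8),    # EGFRvIII: kept
    SkipEvent("EGFR", 24, 26),  # other EGFR skipping event: dropped
]
kept = filter_events(calls)
print(kept)
```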

Bugfixes

  • Copy .bai file with timestamp instead of creating it so that it is not removed by snakemake

Changes in config.yaml

  • TMB: new and updated config options for tmb rule
  • FuseqWES: Added config for fuseq_wes rule
  • FuseqWES filtering: Added config for filter_fuseq_wes rule

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.3.1 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v0.3.0 (No change)
  • filtering: v0.1.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.3.0 (TMB updated with more config options)
  • cnv_sv: v0.3.1 (No change)