Releases: genomic-medicine-sweden/Twist_Solid
Twist_Solid v0.14.0
Release notes v0.14.0
Panel of Normal for CNV:
A new Panel of Normals (PoN) is now the default for cnvkit. It combines normal samples from nextseq550, nextseq2000 and novaseq6000 and can therefore be used for all sequencing machines. There are still some issues with small false positive (FP) deletions in some genes, like BRCA1.
Coverage and mutations:
The coverage for all positions marked as region_all is now reported. Previously, only positions with coverage below 300 were reported.
Added a second coverage and mutations file for ENC
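The change in which positions are reported can be illustrated with a minimal sketch. The function name, the record layout and the `only_below` parameter are hypothetical and not taken from the pipeline; only the threshold of 300 comes from the notes above.

```python
from typing import Iterable, List, Optional, Tuple

# (chromosome, position, coverage); hypothetical record layout for illustration
Position = Tuple[str, int, int]


def positions_to_report(positions: Iterable[Position],
                        only_below: Optional[int] = None) -> List[Position]:
    """Return the positions that end up in the report.

    only_below=300 mimics the earlier behaviour (low-coverage positions only);
    only_below=None mimics the new behaviour (every region_all position).
    """
    if only_below is None:
        return list(positions)
    return [p for p in positions if p[2] < only_below]


# Made-up example data
region_all = [("chr7", 55241707, 1250), ("chr7", 55241708, 280)]
old_style = positions_to_report(region_all, only_below=300)
new_style = positions_to_report(region_all)
```

With the example data, the earlier behaviour reports one position and the new behaviour reports both.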
UMI / Plasma fusion calling
Use GeneFuse for fusion calling in plasma
Use relaxed filtering for FuseqWES (min 15 supporting reads). We also save the unfiltered outfile in results under additional files.
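As a rough illustration of a read-support filter of this kind, the sketch below keeps fusion calls with at least `min_support` supporting reads. The dictionary keys, the gene names and the inclusive comparison are assumptions made for the example; the actual filter in the pipeline also uses other criteria (for example `filter_on_fusiondb`) that are not modelled here.

```python
from typing import Dict, List


def filter_fusions(fusions: List[Dict], min_support: int = 15) -> List[Dict]:
    """Keep fusion calls with at least min_support supporting reads.

    The threshold is treated as inclusive, which is an assumption.
    """
    return [f for f in fusions if f["supporting_reads"] >= min_support]


# Made-up calls with placeholder gene names
calls = [
    {"pair": "GENE_A--GENE_B", "supporting_reads": 40},
    {"pair": "GENE_C--GENE_D", "supporting_reads": 9},
]
relaxed = filter_fusions(calls, min_support=15)
```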
Changes in config.yaml
filter_fuseq_wes_umi:
min_support: 15
filter_on_fusiondb: True
gene_fuse:
container: "docker://hydragenetics/genefuse:0.6.1"
report_gene_fuse:
min_unique_reads: 6
Changes in config.data.hg19.yaml
cnvkit_batch:
normal_reference: "{{PROJECT_PON_DATA}}/GMS560/PoN/cnvkit_PoN_combined_hg19.cnn"
filter_fuseq_wes_umi:
gene_fusion_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/false_positive_fusion_pairs.txt"
gene_white_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_gene_white_list.txt"
transcript_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_transcript_black_list.txt"
gtf: "{{PROJECT_REF_DATA}}/ref_data/refGene/hg19.refGene.gtf"
hotspot_report:
chr_translation_file: "config/reports/hotspot_report.chr.translation.hg19"
hotspot_mutations:
all: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/Hotspots_combined_regions_nodups.231011.csv"
ENC: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/ENC_hotspots_240604.csv"
Changes in config.data.hg38.yaml
cnvkit_batch:
normal_reference: "{{PROJECT_PON_DATA}}/GMS560/PoN/cnvkit_PoN_combined_hg38.cnn"
filter_fuseq_wes_umi:
gene_fusion_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/false_positive_fusion_pairs.txt"
gene_white_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_gene_white_list.txt"
transcript_black_list: "{{PROJECT_DESIGN_DATA}}/GMS560/fuseq_wes/fuseq_wes_transcript_black_list.txt"
gtf: "{{PROJECT_REF_DATA}}/ref_data/refGene/hg38.refGene.gtf"
gene_fuse:
genes: "/projects/wp1/nobackup/ngs/utveckling/Twist_DNA_DATA/gene_fuse/GMS560_fusion_w_pool2.hg38.csv"
fasta: "/data/ref_genomes/GRCh38_p14/homo_sapiens.fasta"
hotspot_report:
chr_translation_file: "config/reports/hotspot_report.chr.translation.hg38"
hotspot_mutations:
all: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/Hotspots_combined_regions_nodups.231011_hg38.csv"
ENC: "{{PROJECT_DESIGN_DATA}}/GMS560/reports/ENC_hotspots_240604_hg38.csv"
Changes in resources
None
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.5.0 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v1.0.1 (No change)
- filtering: v0.3.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.4.0 (No change)
- fusions: v0.1.0 (No change)
- cnv_sv: v0.3.1 (No change)
- reports: v0.4.1 (No change)
- misc: v0.2.0 (No change)
Snakemake min version
7.18.0 (no change)
Hydra genetics version
3.0.0 (Updated)
Hydra genetics common singularity version
3.0.0 (Updated)
Features
- add hotspot file for ENC (2de88ed)
- added blacklist filtering of small cnv deletions (f6e77bc)
- added ENC hotspot files for hg38 (0e06818)
- added separate filtering of fuseq wes for umi (8b83894)
- new PoN on figshare (41f3f0b)
- new PoN on figshare (f702d7c)
- new PoN on figshare hg38 (641e37d)
- reinstated genefuse for umi only (a5af752)
- update common container to new hydra (6f64d68)
- update hydra version (651c38f)
- use latestest version (365d3d2)
- use new PoN (e5bbd55)
- use new PoN hg38 (1151f3b)
Bug Fixes
- add missing singularity (24eeac6)
- added blacklist files to reference config (3450166)
- config schema (e727bd5)
- config schema (9470d56)
- correct reference files (4ff7a76)
- correct URL (ee7c8f6)
- corrected config (20fb88e)
- corrected novaseq hrd PoN download link (6ea893a)
- git version (4efd11f)
- git version (ef0891d)
- rm conda (210dfdc)
- update ENC hotspots (fec13d5)
- update ENC hotspots (8334f7a)
- update ENC hotspots (66dd017)
- update ENC hotspots (1338508)
- wrong release of design files (cc246f7)
Documentation
Twist_Solid v0.13.0
Release notes v0.13.0
CNV html report
Large CNVs (> 30% of a chromosomal arm) are now reported in a new table in the html report.
There is a new warning in the new table when there are very few segments on the baseline, suggesting that the baseline is shifted.
There is a new warning of polyploidy when there are large regions with BAF signals that do not match the copy numbers.
Two new genes reported for CNV amplifications.
The MDM2 gene and the C19MC miRNA cluster are now reported when amplifications are found in them.
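The idea behind the large CNV table can be sketched as follows. This is not the pipeline's script: the function name, the segment layout and the exact comparison rules are assumptions. The default values mirror chr_arm_fraction, del_chr_arm_cn_limit and amp_chr_arm_cn_limit listed under cnv_tsv_report for this release.

```python
from typing import List, Optional, Tuple

# (start, end, copy_number) for one segment on a chromosome arm
Segment = Tuple[int, int, float]


def classify_arm(segments: List[Segment], arm_size: int,
                 chr_arm_fraction: float = 0.3,
                 del_cn_limit: float = 1.4,
                 amp_cn_limit: float = 2.5) -> Optional[str]:
    """Label an arm when deleted or amplified segments cover enough of it."""
    deleted = sum(end - start for start, end, cn in segments if cn < del_cn_limit)
    amplified = sum(end - start for start, end, cn in segments if cn > amp_cn_limit)
    if deleted / arm_size > chr_arm_fraction:
        return "deletion"
    if amplified / arm_size > chr_arm_fraction:
        return "duplication"
    return None


# Made-up example: 40 Mb of a 100 Mb arm has a copy number of 1.2
call = classify_arm([(0, 40_000_000, 1.2)], arm_size=100_000_000)
```

In this example 40% of the arm lies below the deletion limit, so the arm is labelled as a deletion.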
Changes in config.yaml
cnv_html_report:
show_table: true
show_cytobands: true
extra_tables:
- name: Small CNVs and 1p19q
description: >
Additional small amplifications and deletions as well as 1p19q co-deletions called by Twist Solid
in-house scripts. Can have overlaps with called regions from other callers.
path: "cnv_sv/svdb_query/{sample}_{type}.{tc_method}.cnv_loh_genes_all.cnv_additional_variants_only.tsv"
- name: Large chromosomal aberrations
description: >
Large chromosomal aberrations in the form of deletions, duplications and copy neutral loss of heterozygosity.
Also warnings of baseline skewness and detection of polyploidy in the sample.
path: "cnv_sv/svdb_query/{sample}_{type}.{tc_method}.cnv_loh_genes_all.cnv_chromosome_arms.tsv"
cnv_tsv_report:
amp_cn_limit: 6.0
baseline_fraction_limit: 0.2
del_1p19q_cn_limit: 1.4
del_1p19q_chr_arm_fraction: 0.3
chr_arm_fraction: 0.3
del_chr_arm_cn_limit: 1.4
amp_chr_arm_cn_limit: 2.5
normal_baf_lower_limit: 0.4
normal_baf_upper_limit: 0.6
normal_cn_lower_limit: 1.7
normal_cn_upper_limit: 2.25
polyploidy_fraction_limit: 0.2
svdb_merge:
container: "docker://hydragenetics/svdb:2.6.0"
tc_method:
- name: pathology_purecn
cnv_caller:
- cnvkit
- gatk
- name: purecn
cnv_caller:
- cnvkit
- gatk
- name: pathology
cnv_caller:
- cnvkit
- gatk
overlap: 1 #Just merge the two vcf-files without merging regions
extra: "--pass_only" #Just merge the two vcf-files without merging regions
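How the baseline limits under cnv_tsv_report could interact is sketched below. The interpretation is an assumption: a warning is raised when the fraction of segment length with a copy number inside the normal range falls below baseline_fraction_limit. The function name and segment layout are hypothetical.

```python
from typing import List, Tuple

# (start, end, copy_number)
Segment = Tuple[int, int, float]


def baseline_warning(segments: List[Segment],
                     normal_cn_lower_limit: float = 1.7,
                     normal_cn_upper_limit: float = 2.25,
                     baseline_fraction_limit: float = 0.2) -> bool:
    """Return True when too little of the genome sits on the baseline."""
    total = sum(end - start for start, end, _ in segments)
    if total == 0:
        return False
    on_baseline = sum(
        end - start
        for start, end, cn in segments
        if normal_cn_lower_limit <= cn <= normal_cn_upper_limit
    )
    return on_baseline / total < baseline_fraction_limit


# Made-up example: only 10% of the segment length has a normal copy number
shifted = baseline_warning([(0, 90, 3.0), (90, 100, 2.0)])
```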
Changes in config.data.hg19.yaml
annotate_cnv:
cnv_amp_genes: "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_amp_genes_240307.bed"
call_small_cnv_amplifications:
regions_file: "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_amplification_genes_240307.tsv"
cnv_tsv_report:
chrom_arm_size: "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/chromosome_arm_size.tsv"
merge_cnv_json:
annotations:
- "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_amp_genes_240307.bed"
- "{{PROJECT_DESIGN_DATA}}/GMS560/cnv/cnv_loh_genes.bed"
Changes in config.data.hg38.yaml
Several file paths updated
Changes in resources
None
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.5.0 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v1.0.1 (No change)
- filtering: v0.3.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.4.0 (No change)
- fusions: v0.1.0 (No change)
- cnv_sv: v0.3.1 (No change)
- reports: v0.4.1 (No change)
- misc: v0.2.0 (No change)
Snakemake min version
7.18.0 (no change)
Hydra genetics version
1.14.0 (Updated)
Features
- added calculation of cnv in chrom arms (667d53f)
- added chromosome arm support to rule (ccbe735)
- added configuration of polyploidy and baseline limits (8f91b42)
- added more config variables (9a92bf2)
- added table to cnv_html_report (e577cdd)
- added warnings for baseline and polyploidy (9a25cdc)
- decreased chr arm fraction (b1ce417)
- do not merge regions that are not identical (1d9a027)
- new amp genes for hg38 (c6f7b5e)
- two new amplification genes reported (c8264b2)
- update ref files for hg38 (0ad022c)
- update test file for develop (36bab94)
- updated with new reference files (5e057e7)
Bug Fixes
- add "_" to regexp so that indels are reported correctly (7549036)
- add implied msi configuration from module to config (a9dea4e)
- bugfix (e681fee)
- correct missing part of vep ref file path (36b3b0a)
- handle CNVs bridging the chromosome arms (8a508ad)
- remove unused library that break requirements (86afc7d)
- rm duplicate entries of cnv in report (3a2cae4)
- spelling error (46aef96)
- update config, merge_cnv_json, to use latest amp gene bed file (be98e22)
- update hg38 ref files, point to correct release (e8efb8c)
- variable baf bugfix (d460172)
Documentation
Twist_Solid v0.12.0
Release notes v0.12.0
Version logging
Versions of the pipeline, config and software used in the analysis are now logged and put into a versions folder directly under results.
The software versions are obtained by using the inspect function of Singularity, which reports the versions defined by LABEL in the Dockerfile.
The pipeline version is obtained from the github tag/branch.
The combined configfile used in the pipeline is written to one file with added information about software versions for each rule.
The versions info is also gathered and added to MultiQC.
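A minimal sketch of turning container labels into a version table is shown below. It assumes that the label text has already been captured and consists of lines of the form `key: value`; the function name and the example labels are hypothetical and the pipeline's own implementation may differ.

```python
from typing import Dict


def parse_labels(text: str) -> Dict[str, str]:
    """Parse 'key: value' label lines into a dictionary."""
    labels = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons
        key, sep, value = line.partition(":")
        if sep:
            labels[key.strip()] = value.strip()
    return labels


# Made-up label text for illustration
example = "maintainer: someone\nversion: 1.2.3\nbuild-date: 2024-01-01T10:00:00"
labels = parse_labels(example)
```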
DNA fusions
FuseqWES is now validated with high precision using the new stricter filtering parameters (50 supporting reads). Care should be taken with false positive fusions in amplified genes.
GeneFuse was removed as it could not find novel fusions and took a long time to run.
Sample mixup test
ID-SNPs are now also called for DNA (previously only for RNA) using bcftools. For RNA and DNA samples analysed in the same analysis, the ID-SNPs are compared and the best hit between RNA and DNA is reported in a mixup report. If samples are more than 80% similar, they are reported as a match. Unrelated samples are usually around 50% similar, while matched samples are typically between 90 and 95% similar.
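The comparison can be sketched as below. This is an illustration only: similarity is taken to be the fraction of shared ID-SNPs with identical genotypes, which is an assumption, and the function names and genotype encoding are hypothetical. Only the 80% match limit comes from the notes above.

```python
from typing import Dict, Optional, Tuple

Genotypes = Dict[str, str]  # SNP id -> genotype, e.g. "0/1"


def similarity(a: Genotypes, b: Genotypes) -> float:
    """Fraction of shared SNPs with identical genotypes."""
    shared = [snp for snp in a if snp in b]
    if not shared:
        return 0.0
    same = sum(1 for snp in shared if a[snp] == b[snp])
    return same / len(shared)


def best_match(rna: Genotypes, dna_samples: Dict[str, Genotypes],
               match_limit: float = 0.8) -> Tuple[Optional[str], float, bool]:
    """Return the most similar DNA sample, its score and whether it is a match."""
    best_name, best_score = None, 0.0
    for name, genotypes in dna_samples.items():
        score = similarity(rna, genotypes)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score, best_score > match_limit


# Made-up genotypes
rna = {"rs1": "0/1", "rs2": "1/1", "rs3": "0/0", "rs4": "0/1", "rs5": "0/0"}
dna = {
    "D1": dict(rna),
    "D2": {"rs1": "0/0", "rs2": "1/1", "rs3": "0/1", "rs4": "0/1", "rs5": "1/1"},
}
name, score, is_match = best_match(rna, dna)
```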
CNV html report
An extra table is added to the report where 1p19q deletions as well as small deletion and small amplification candidates are reported.
UMI filter
Slightly updated umi filters. Also updated config for tmb_umi calculations. Note! TMB is not calibrated for umi yet and will report too high values if not filtered on 5% VAF.
HG38
Reference downloads, designs files and configs updated for hg38.
Changes in config
Changed:
filter_fuseq_wes:
min_support: 50
filter_on_fusiondb: True
tmb_umi: // completely updated
Removed:
gene_fuse:
container: "docker://hydragenetics/genefuse:0.6.1"
report_gene_fuse:
min_unique_reads: 6
scarhrd:
reference_name: "grch37" // moved to hg19 config
vep:
extra: // moved to hg19 config
vep_wo_pick:
extra: // moved to hg19 config
Changes in resources
None
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.5.0 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v1.0.1 (No change)
- filtering: v0.3.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.4.0 (No change)
- fusions: v0.1.0 (No change)
- cnv_sv: v0.3.1 (No change)
- reports: v0.4.1 (Now possible to add extra tables in the report)
- misc: v0.2.0 (No change)
Snakemake min version
7.18.0 (no change)
Hydra genetics version
1.14.0 (Updated)
Features
- adapted configs to hg38 (7c601ab)
- added new rule for rna dna mixup check (164baa6)
- cnv html report additional table only includes additional calls (e3971eb)
- data config for HG38 (missing HRD, SVDB, purecn) (7376033)
- increased read support needed for fuseq_wes (f669bc2)
- print pipeline and software version and config to results (d643a39)
- put mixup report directly under result as it is both rna and dna (c9cdbc5)
- removed gene fuse from config (c1f25ba)
- removed gene fuse from pipeline (83260f6)
- update hydra-genetics version (c57d3f3)
- update reports module to v0.4.1 (#398) (32e40e9)
Bug Fixes
- avoid clash with earlier filter annotation (9d2ed1b)
- bug in Snakefile (d8307f4)
- bugfix (a154b11)
- bugfix (75d21be)
- bugfix and improvements (dc7f4b4)
- changed sample mixup from txt to tsv (5b91e60)
- correct paths for CNV report extra tables (#401) (37a461e)
- extra table without tag to cnv html report (765ce51)
- handle missing files better (a0be997)
- make umi tmb similar to tmb (c525c0e)
- min 1% AF (ef2bcaf)
- missing vep config in integration (e8e8a57)
- new smart_open release crash (03e0a0e)
- pycodestyle and unittest update (79cf8f8)
- required variable with corrected indentation (115e018)
- resolve conflicts (6343bd2)
- rm dangerous qual filter form umi (c9d4584)
- rm dangerous qual filter form umi (d20f0cc)
- some hg19 file leftovers fixed (5433b16)
- VEP cache move to config.data (a36b570)
Documentation
Twist_Solid v0.11.0
Release notes v0.11.0
New result files folder structure
The results are now structured based on sample so that all results for one sample are found together like this:
result/dna/sample/analysis_type/result_files
result/rna/sample/analysis_type/result_files
Also, TMB, MSI, and HRD results files are now found in the biomarker folder.
Improved RNA fusion report
The RNA fusion report now puts calls with the same breakpoint from the same caller on the same line. It also sorts the report based on total read support, showing the highest supported fusions at the top.
The report now also shows the deduplicated coverage for the breakpoint (the highest coverage in the fusion pair). This can help to determine if a fusion with low support is a FP found due to high coverage (high expression).
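The restructuring of the report can be sketched roughly as follows. The field names are hypothetical, and summing the support of merged calls is an assumption made for the example rather than a description of the actual report script.

```python
from collections import defaultdict
from typing import Dict, List


def merge_and_sort(calls: List[Dict]) -> List[Dict]:
    """Merge calls sharing caller and breakpoints, then sort by read support."""
    merged = defaultdict(int)
    for call in calls:
        key = (call["caller"], call["breakpoint1"], call["breakpoint2"])
        merged[key] += call["support"]
    rows = [
        {"caller": k[0], "breakpoint1": k[1], "breakpoint2": k[2], "support": s}
        for k, s in merged.items()
    ]
    # Highest supported fusions first
    return sorted(rows, key=lambda row: row["support"], reverse=True)


# Made-up calls
calls = [
    {"caller": "arriba", "breakpoint1": "chr2:100", "breakpoint2": "chr5:200", "support": 5},
    {"caller": "arriba", "breakpoint1": "chr2:100", "breakpoint2": "chr5:200", "support": 7},
    {"caller": "star-fusion", "breakpoint1": "chr1:10", "breakpoint2": "chr3:30", "support": 20},
]
report = merge_and_sort(calls)
```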
Speed optimization of fusioncatcher
Removed unused option in fusioncatcher that was slowing down calculations (removed --visualization-sam)
CNV html report
The CNV report now optionally shows the copy number data points as error bars instead of data points making it easier to analyse the data, especially in high density regions such as exons.
The VAF-plots now show data down to 20 supporting reads (down from 50). This will increase the number of VAF-points slightly and will avoid having no data in deleted regions for samples with high tumor content.
UMI filter
Completely new UMI-filter based on tests run on the reference samples
config/filters/config_hard_filter_umi_vep105.yaml
config/filters/config_soft_filter_umi_vep105.yaml
Changes in output files
See above
Changes in config
Changed:
default_container: "docker://hydragenetics/common:1.10.2"
cnv_html_report:
show_table: true
show_cytobands: true
fusioncatcher:
container: "docker://hydragenetics/fusioncatcher:1.33"
extra: ""
Added:
filter_vcf:
snv_soft_filter_umi: "config/filters/config_soft_filter_umi_vep105.yaml"
vep_wo_pick:
mode: --offline --cache --refseq
extra: " --assembly GRCh37 --check_existing --sift b --polyphen b --ccds --uniprot --hgvs --symbol --numbers --domains --regulatory --canonical --protein --biotype --uniprot --tsl --appris --gene_phenotype --af --af_1kg --af_gnomad --max_af --pubmed --variant_class "
Changes in resources
None
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.5.0 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v1.0.1 (No change)
- filtering: v0.3.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.4.0 (No change)
- fusions: v0.1.0 (First official release of module)
- cnv_sv: v0.3.1 (No change)
- reports: v0.3.1 (Updated CNV html report visualization)
- misc: v0.2.0 (No change)
Snakemake min version
7.18.0 (no change)
Hydra genetics docker version
1.10.2 (new)
Features
- add amino acid change to coverage_and_mutations (a66f501)
- added dedup coverage (01046e6)
- added umi vcf filters (dd1cbe2)
- decrease germline support needed to improve cnv visualization of deletions (c6cd922)
- decrease support needed for germline SNVs (efb2827)
- restructured rna fusion report for improved interpretation (373021c)
- update Jenkins main test (5b30a35)
- update reports module to 0.3.1 (#389) (9346b03)
- update to latest common container (eef00aa)
- updated hydra-genetics version (b7eadaa)
Bug Fixes
- added missing double mutations to report (ed1a526)
- bugfix (e9f2511)
- bugfix (2dbc810)
- bugfixes (7fcd240)
- changed sorting and bugfixes (355d99e)
- corrected filepath for rna in develop (85abf47)
- corrected input file for hotspot_report.smk (8ce6512)
- corrected input variables for hotspot_report.py (076f5fd)
- corrected input variables for hotspot_report.smk (6758fa4)
- corrected results path for soft-filtered umi vcf (a8ed44c)
- delete extra file (2803f67)
- housekeeping dict (7105a88)
- input file bugfix (bfa94a6)
- pulp version (f37c08a)
- remove extra header columns (6bb5d42)
- remove extra sample folder in output (39fefd8)
- remove newline after purity in cnvkit_call command (b820ab5)
- remove unused time consuming step in fusioncatcher (635a40b)
- removed spaces in jenkins test develop (05eb8ef)
- reverse sorting and add chr to fusioncatcher (794f8f7)
- update hydra-genetics to v1.10.1 (be2682d)
- update jenkins start scripts (8cdaf1f)
- update profile with mounted home (7665c8b)
- update profile with mounted home (0b1be40)
- update profile with mounted home (1947e3a)
- update profile with mounted home (b3d56e2)
- updated jenkins output for develop (887584b)
- updated to bugfixed common container (1449abd)
Twist_Solid v0.10.0
Main new features in v0.10.0
- Scripts, configs and documentation for downloading all reference files used by the pipeline
- Restructuring of the pipeline config files
Reference download
All files used by the pipeline can now be downloaded using the hydra-genetics command line tool.
Config restructuring
The config folder now has subfolders to avoid file cluttering.
The main config files have been split up into two configs, one for data files and one for software settings. The file paths in the data config are generalized so that only three different variables need to be adjusted, especially if the reference files are downloaded with the reference download tool.
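How generalized paths of this kind can be resolved is sketched below. The `{{NAME}}` placeholder syntax and the name PROJECT_PON_DATA appear in data configs of later releases in these notes; that they are the variables referred to here is an assumption, and the function name and base directory are hypothetical.

```python
from typing import Any, Dict


def resolve_paths(config: Any, variables: Dict[str, str]) -> Any:
    """Recursively replace {{NAME}} placeholders in all string values."""
    if isinstance(config, dict):
        return {key: resolve_paths(value, variables) for key, value in config.items()}
    if isinstance(config, list):
        return [resolve_paths(value, variables) for value in config]
    if isinstance(config, str):
        for name, value in variables.items():
            config = config.replace("{{" + name + "}}", value)
    return config


data_config = {
    "cnvkit_batch": {
        "normal_reference": "{{PROJECT_PON_DATA}}/GMS560/PoN/cnvkit_PoN_combined_hg19.cnn"
    }
}
# "/data/pon" is a made-up base directory
resolved = resolve_paths(data_config, {"PROJECT_PON_DATA": "/data/pon"})
```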
TERT
2 new TERT promoter positions have been granted hotspot status
Fuseq-WES
Filtering of artifact fusion pairs can now be specified in a text file specified in the config.
RNA fusion report
Filtering of false positive fusions and cut-off values are now configurable (for Star-fusion and FusionCatcher) using a text file specified in config. Increased the list of fusions that are filtered from previous release based on more analysed samples.
Features
- 2 new TERT positions added to report (3617acb)
- add blacklist for fuseq_wes filtering (6bc03e3)
- make fusion filtering configurable and add more fp fusions (5ed5c97)
- more conda updates (df2b77c)
- report internal callers for fusioncatcher (d5fccad)
- setup files for reference file validation and fetching using yaml files (7078c04)
- singularity build script update (8a8b665)
- update to latest files used by config (07e351c)
- update to newer hotspot file and clean config (72ef4ff)
- use v0.1.0 of fusion module to filter FP Fuseq_WES calls (3c30fb1)
Bug Fixes
- add lambda to functions so that they are only valuated when used (0b85dda)
- add missing extra option in star-fusion (7476a0d)
- add missing type (35e5f6e)
- change from svdb_query file to svdb_merge file (1ed29ad)
- configs for reference profile (6262775)
- correct path in config (dd63afa)
- handle reference creation with unit file containing both T and N (61b7764)
- increased star_fusions wall time (5ff15de)
- lamda in correct position (128aa8f)
- minor config updates (8296226)
- new rna fusion filter file without duplicate entry and header (280fd4a)
- pycodestyle (36eed32)
- support header with fusions filtering (c8a8bb7)
- updarw config files to match current setup (0e689db)
- update configs and restructure (7f397de)
Documentation
Twist_Solid v0.9.0
Release notes v0.9.0
UMI
It is now possible to analyse the data using UMIs. UMI analysis for all or selected samples can be achieved by adding an extra column named deduplication in samples.tsv. All samples marked with umi in this column will be run using UMIs. Some results will additionally be created without using UMIs (bam, vcf, MSI, TMB, MultiQC).
UMI read consensus and filtering are performed by Fgbio.
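Selecting the samples marked for UMI analysis can be sketched as below. The column name deduplication and the value umi come from the notes above; the `sample` column name, the function name and the example table are assumptions.

```python
import csv
import io
from typing import List


def umi_samples(samples_tsv: str) -> List[str]:
    """Return the samples whose deduplication column is set to umi."""
    reader = csv.DictReader(io.StringIO(samples_tsv), delimiter="\t")
    return [row["sample"] for row in reader if row.get("deduplication") == "umi"]


# Made-up samples.tsv content
table = "sample\tdeduplication\nS1\tumi\nS2\t\nS3\tumi\n"
selected = umi_samples(table)
```

A table without the deduplication column simply yields an empty list, so existing sample sheets keep working in this sketch.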
Reference creation pipeline
Use the coming release v0.10.0 for this.
Changes in output files
Added UMI output files
Modified files as an adaptation to the reworked annotation modules
New reference output files need update, please use next version (v0.10.0)
Changes in config
Added config file for GRCh38
Reworked config_references.yaml
Reworked config_references_hg38.yaml
Added in hg19 config (VEP and UMI-changes):
vep:
mode: --offline --cache --refseq
bwa_mem_realign_consensus_reads:
container: "docker://hydragenetics/fgbio:2.1.0"
fgbio_call_and_filter_consensus_reads:
container: "docker://hydragenetics/fgbio:2.1.0"
max_base_error_rate: "0.2"
min_reads_call: "1 0 0"
min_reads_filter: "1 0 0"
min_input_base_quality_call: 20
min_input_base_quality_filter: 30
fgbio_copy_umi_from_read_name:
container: "docker://hydragenetics/fgbio:2.1.0"
fgbio_group_reads_by_umi:
container: "docker://hydragenetics/fgbio:2.1.0"
umi_strategy: paired
filter_vcf:
snv_hard_filter_umi: "config/config_hard_filter_umi.yaml"
snv_soft_filter_umi: "config/config_soft_filter_umi.yaml"
multiqc:
container: "docker://hydragenetics/multiqc:1.11"
reports:
DNA:
deduplication: ["mark_duplicates"]
DNA_umi:
config: "config/multiqc_config_dna.yaml"
included_unit_types: ["N", "T"]
deduplication: ["umi"]
qc_files:
- "qc/fastqc/{sample}{type}{flowcell}{lane}{barcode}fastq1_fastqc.zip"
- "qc/fastqc/{sample}{type}{flowcell}{lane}{barcode}fastq2_fastqc.zip"
- "qc/picard_collect_alignment_summary_metrics/{sample}{type}.alignment_summary_metrics.txt"
- "qc/picard_collect_duplication_metrics/{sample}{type}.duplication_metrics.txt"
- "qc/picard_collect_hs_metrics/{sample}{type}.HsMetrics.txt"
- "qc/picard_collect_insert_size_metrics/{sample}{type}.insert_size_metrics.txt"
- "qc/samtools_stats/{sample}{type}.samtools-stats.txt"
- "qc/gatk_calculate_contamination/{sample}{type}.contamination.table"
- "alignment/fgbio_group_reads_by_umi/{sample}_{type}.umi.histo.tsv"
samtools_merge_bam_umi:
extra: "-c -p"
tmb_umi:
af_lower_limit: 0.000
af_upper_limit: 0.994
af_germline_lower_limit: 0.3
af_germline_upper_limit: 0.7
artifacts: ""
background_panel: ""
db1000g_limit: 0.0001
dp_limit: 300
filter_genes: "/projects/wp1/nobackup/ngs/utveckling/Twist_DNA_DATA/tmb_filter_genes.txt"
gnomad_limit: 0.0001
vd_limit: 0
nr_avg_germline_snvs: 0.0
nssnv_tmb_correction: 0.84
vardict:
allele_frequency_threshold_umi: "0.001"
Changes in resources (UMI + reference creation)
New reference resource file
Added UMI to ordinary resources file:
bwa_mem_realign_consensus_reads:
mem_mb: 61440
mem_per_cpu: 6144
threads: 10
time: "8:00:00"
fgbio_call_and_filter_consensus_reads:
threads: 3
mem_mb: 18432
mem_per_cpu: 6144
time: "8:00:00"
fgbio_group_reads_by_umi:
time: "8:00:00"
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.5.0 (Adding support for UMIs using Fgbio)
- snv_indels: v0.3.0 (No change)
- annotation: v1.0.1 (Annotation order not fixed in module. Update pipeline to reflect this)
- filtering: v0.3.0 (Additional filtering features not affecting the pipeline)
- qc: v0.3.0 (No change)
- biomarker: v0.4.0 (No change)
- cnv_sv: v0.3.1 (No change)
- reports: v0.2.0 (No change)
Snakemake min version
7.18.0
Features
- 2 new TERT promoter variants (ee72af1)
- add umi support for cnvkit (979c816)
- add umi support for fuseq wes (a4d7139)
- add umi support for fuseq wes (522ac37)
- add umi support for gatkcnv (a2ce975)
- add umi support for hrd and msi (20560a2)
- add umi support for manta (8a703d9)
- add umi support for qc (89f673b)
- add umi support for tmb (9683d6b)
- added tmb gene filter. TMB uses hard filtered file for input (9e242a3)
- added umi choice to the pipeline (8e13b38)
- changes hard filtering to let more variants through (70606f3)
- hard filter for qci (9ae0a2f)
- min vaf configurable in vardict (303b5f7)
- rm need for ruleorder for copy rules using global wildcard constraints (ad15f6c)
- run msi w and wo umi (6766b18)
- umi in rna (6ccba4c)
- umi vcf filtering based on sample.tsv (4fdd999)
- update alignment module tag (ed33d45)
- update to v1.0.0 relaese of annotation (3fd0c7b)
- Update workflow/Snakefile with new alignment tag (e74398e)
- updated alignment module tag (62500dd)
- updated alignment module tag (341a3ba)
- updated module versions and adapted to these new versions (c4aa705)
Bug Fixes
- adapt to new umi alignment module (60a90b3)
- adopt to breaking change in vep rule (96b4b26)
- bugfix in get_vardict_min_af (cd503b7)
- bugfix in get_vardict_min_af (44d4b59)
- corrected gatk_mutect2 input files (442d62c)
- fix manta output files and rm unneaded umi rules in Snakefile (5ec7a33)
- fixed units.tsv and adaption of reference pipeline to new annotation module (9d0c926)
- gvcfs now have mosdepth umi coverage (cf3e9c6)
- manta to use original name, filtering wo <=, msi-sensor w correct output name (0a9960b)
- match output result from reference module (e2415d5)
- mosdepth should be run on entire bam file ([8643bb2](https://www.github.com/genomic-medicine-sweden/Twist...
Twist_Solid v0.8.0
Release notes
Bugfixes
Missing bam index files are now retained in results when running the analysis without --notemp
Missing input file for cnv-html-report added so that the report works when running the analysis without --notemp
Features
TMB
When running on novaseq and nextseq2000, artifacts in CDC27 and MUC6 are commonly not filtered out, inflating the TMB numbers. A new filtering option is therefore added where these two genes are completely filtered out. This does not affect the correlation or slope, which were calculated using samples run on nextseq550.
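The gene filter can be sketched as a simple exclusion step before variants are counted. The variant layout, the third gene name and the function name are hypothetical; only the two filtered gene names come from the notes above.

```python
from typing import Dict, Iterable, List


def filter_genes(variants: List[Dict], genes_to_filter: Iterable[str]) -> List[Dict]:
    """Drop every variant located in one of the filtered genes."""
    blocked = set(genes_to_filter)
    return [v for v in variants if v["gene"] not in blocked]


# Made-up variants; GENE_X is a placeholder name
variants = [
    {"gene": "CDC27", "vaf": 0.08},
    {"gene": "GENE_X", "vaf": 0.31},
    {"gene": "MUC6", "vaf": 0.06},
]
counted = filter_genes(variants, ["CDC27", "MUC6"])
```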
Arriba
Arriba's performance is, in our data, better on 100bp reads than on 150bp reads. Therefore the reads are trimmed to 100bp with fastp before calling fusions with Arriba. All other fusion callers are unaffected.
Changes in config
Added:
TMB:
filter_genes: "{path}/tmb_filter_genes.txt"
Changes in resources
fastp_pe_arriba:
- threads: 5
- mem_mb: 30720
- mem_per_cpu: 6144
Documentation (read the docs)
Updated links to hydra modules with documentation.
Updated files to use for reference creation
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.3.1 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v0.3.0 (No change)
- filtering: v0.1.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.4.0 (Gene filtering)
- cnv_sv: v0.3.1 (No change)
- reports: v0.2.0 (Bugfix: missing input file, and some improvements)
Features
- tmb two noisy genes filtered (45397d5)
- trim reads to 100bp for improved results in Arriba (aa06413)
- update biomarker module version (d9f0059)
- update reports module to v0.2.0 (3f9be85)
Bug Fixes
- added modified input function to cnv_html_report to fix notemp bug (b535c42)
- keep bai files (753b789)
- merge two rule definitions into one (e294971)
- update snakemake version to avoid checkpoint restart job bug (67f1bf6)
Documentation
- correct rtd links (46de326)
- correct rtd links (f793098)
- correct rtd links (51c0a55)
- correct rtd links (8e6a3be)
- correct rtd links (216e0b2)
- correct rtd links (20b021c)
- correct rtd links (497db09)
- update PoN description (92db1ee)
- update readthedocs links (aac222f)
- update rtd links (0347a62)
- update rtd links (576112f)
- update rtd links (4f07de7)
- update rtd links (6f99f6b)
- update rtd links (2c353c2)
- update rtd links (6a1c7bf)
- update rtd links (238d74b)
- update rtd links (80b0637)
- update rtd links (b2e3967)
hydra-genetics v0.7.0
Release notes
For more details on features and bug fixes see further down.
Features
CNV Filtering
Filtering using frequency in sample database removed true whole chromosome deletions. Therefore the frequency filtering is now only done on segments that are smaller than 10Mb.
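The size-dependent frequency filter can be sketched as follows. The 10Mb limit comes from the text above; the function name, the `max_frequency` parameter and the direction of the frequency comparison are assumptions.

```python
def keep_segment(length_bp: int, db_frequency: float, max_frequency: float,
                 size_limit_bp: int = 10_000_000) -> bool:
    """Apply the database frequency filter only to segments below the size limit."""
    if length_bp >= size_limit_bp:
        # Large segments, such as whole chromosome deletions, are never frequency-filtered
        return True
    return db_frequency <= max_frequency


# Made-up values: a 50 Mb segment is kept even though it is common in the database
large_kept = keep_segment(50_000_000, db_frequency=0.9, max_frequency=0.2)
small_kept = keep_segment(1_000_000, db_frequency=0.9, max_frequency=0.2)
```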
CNV Reporting
The CNV html report is moved out from the pipeline and into a new reporting module in Hydra-genetics. The report itself is not changed except for minor improvements.
CNV PureCN
PureCN is now producing reliable results. However, it does not handle samples with low tumor content (TC) or samples with few copy number aberrations. Both of these will in general get an underestimated TC in the range of 15-32%. Therefore a new combined CNV html and tsv report is now generated that uses the PureCN TC if it is above 35% and the pathology estimated TC otherwise. These files have the tag pathology_purecn in the results. The results for pathology only and PureCN only are placed in additional files under results.
PureCN uses a new filter (snv_hard_filter_purecn) to get its vcf input file.
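The choice of tumor content for the combined pathology_purecn report can be sketched as below. The 35% cutoff comes from the text above; the function name, the return format and the handling of a missing PureCN estimate are assumptions.

```python
from typing import Optional, Tuple


def choose_tc(purecn_tc: Optional[float], pathology_tc: float,
              cutoff: float = 0.35) -> Tuple[float, str]:
    """Use the PureCN estimate when it is above the cutoff, else pathology."""
    if purecn_tc is not None and purecn_tc > cutoff:
        return purecn_tc, "purecn"
    return pathology_tc, "pathology"


# Made-up estimates
high = choose_tc(purecn_tc=0.5, pathology_tc=0.3)
low = choose_tc(purecn_tc=0.2, pathology_tc=0.4)
```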
RNA MultiQC
The bam-files produced by the Star aligner are now duplicate marked by picard. This is only used for QC and the duplication rate is reported in the RNA MultiQC report.
Changes in config.yaml
Changed:
- output: "config/output_list.json" => output: "config/output_files.yaml" #New output file format
- cnv_html_report:
show_table: true
template_dir: config/cnv_report_template
- design_intervals_rna: "/projects/wp1/nobackup/ngs/utveckling/Twist_RNA_DATA/bed/Twist_RNA_Design5.annotated.20230630.interval_list" #New file for duplication QC
- report_fusions: #Corrected spelling of fusioncatcher, only changes shown
fusioncatcher_flag_low_support: 15
fusioncatcher_low_support: 3
fusioncatcher_low_support_fp_genes: 20
fusioncatcher_low_support_inframe: 6
Added:
- merge_cnv_json:
annotations:
- /references/cnv_amp_genes.bed
- /references/cnv_loh_genes.bed
filtered_cnv_vcfs:
- cnv_sv/svdb_query/{sample}{type}.{tc_method}.svdb_query.annotate_cnv.cnv_amp_genes.filter.cnv_hard_filter_amp.vcf.gz
- cnv_sv/svdb_query/{sample}{type}.{tc_method}.svdb_query.annotate_cnv.cnv_loh_genes_all.filter.cnv_hard_filter_loh.vcf.gz
unfiltered_cnv_vcfs:
- cnv_sv/svdb_query/{sample}{type}.{tc_method}.svdb_query.annotate_cnv.cnv_amp_genes.vcf.gz
- cnv_sv/svdb_query/{sample}{type}.{tc_method}.svdb_query.annotate_cnv.cnv_loh_genes_all.vcf.gz
germline_vcf: snv_indels/bcbio_variation_recall_ensemble/{sample}_{type}.ensembled.vep_annotated.filter.germline.exclude.blacklist.vcf.gz
- filter_vcf:
snv_hard_filter_purecn: "config/config_hard_filter_purecn.yaml" #New filter for purecn
- svdb_merge:
tc_method:
- name: pathology_purecn #new combined tag
cnv_caller:
- cnvkit
- gatk
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.3.1 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v0.3.0 (No change)
- filtering: v0.1.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.3.1 (No change)
- cnv_sv: v0.3.1 (No change)
- reports: v0.1.0 (New module, CNV html report moved here)
Features
- add result file for combined purecn and pathology (fb17cc2)
- added duplication % to multiQC (2923ebf)
- added picard mark duplicates of bam-files for QC (0f33656)
- added read group function for STAR (c224c17)
- added RG to STAR and changed bam file for QC (1f68585)
- added rule for modifying MBQ in vcf (72dc309)
- added rule for modifying MBQ in vcf (6c26626)
- added rule for modifying MBQ in vcf (188f418)
- change pureCN cutoff to 0.35 (4e06643)
- choose purecn if tc > 30% and pathology otherwise (20292ab)
- harder filtering (7cae319)
- make two tsv reports using different gene lists (0737c60)
- test_input_all.tsv for v0.7.0 (27df350)
- test_input_VAL2022.tsv for v0.7.0 (26f5157)
- use filtered vcf with both germline and somatic variants (f6c5cc3)
- use gatk2 for purecn (ee2e2bc)
- use germline vcf for purecn (eb0eaf5)
- use purity file directly from purecn to also get ploidity (b397a3c)
- use vaf and snv filtered vcf with both germline and somatic variants (48f3563)
Bug Fixes
- add germline flag to vcf (1e8de1b)
- add missing filter tag (7c01e2d)
- annotate using missing sites instead (de172b0)
- bug fixes (224a6a4)
- change checkpoint to rule (5e04c9a)
- change path to new normals (7a3b420)
- correct header in cnv report file (c947152)
- correct output name for purecn reference (92af141)
- correct rule import from wrong module (275a60d)
- delegate schema validation to reports module (b5dada0)
- do not filter large cnvs based on frequency in database (4b86637)
- get correct tc to html report (c394cba)
- handle empty purecn file (df89fe5)
- import spelling mistake (f3e7248)
- moved result file to additional files (c6c7574)
- properly overrule the get_tc function (701ceb8)
- purecn_modify_vcf bugfix (e910e18)
- redefine rule to use new params in config (5ef5dac)
- return correct tc (45485c7)
- solve different wildcards in rule error (62082a1)
- spelling error of Exception (8d6fbf9)
- tabix of annotation database (1f70635)
- use correct genome (66e3c8b)
- use correct get_tc (...
hydra-genetics v0.6.1
Release notes
For more details on features and bug fixes see further down.
Features
Documentation
First release of read the docs (https://twist-solid.readthedocs.io/en/v0.6.1/)
CNV
- New small cnv amplification caller (in house script) for the 20 relevant amplification genes which is very similar to the small deletion caller.
- The CNV.tsv report now also includes amplifications called by the small amplification caller
Bugfixes
- New lines added to variants in the TMB report
Changes in config.yaml
- Small amplification caller: new in config
- CNV tsv report: added option for small amplification filtering
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.3.1 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v0.3.0 (No change)
- filtering: v0.1.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.3.1 (fixed TMB report bug)
- cnv_sv: v0.3.1 (No change)
Features
hydra-genetics v0.5.0
Release notes
For more details on features and bug fixes see further down.
Features
CNV
- The CNV.html report now reports TC content
- The CNV.html report now reports VAF values in the variant table
- The CNV.tsv report now also includes deletions called by the small deletions caller
TMB
- Improved TMB calculations that find more true variants and correlate better with TSO500. A panel of normals is no longer used, making the calculations less dependent on the sequencing platform.
DNA fusions
- Added DNA fusion calling using Fuseq-WES with superior results compared to GeneFuse
- GeneFuse: Added filtering of the ERG gene
RNA exon skipping
- Only report MET exon 14 skipping and EGFRvIII and not other potential skipping events in these genes
Bugfixes
- Copy .bai file with timestamp instead of creating it so that it is not removed by snakemake
Changes in config.yaml
- TMB: new and updated config options for tmb rule
- FuseqWES: Added config for fuseq_wes rule
- FuseqWES filtering: Added config for filter_fuseq_wes rule
Hydra modules with releases
- prealignment: v1.0.0 (No change)
- alignment: v0.3.1 (No change)
- snv_indels: v0.3.0 (No change)
- annotation: v0.3.0 (No change)
- filtering: v0.1.0 (No change)
- qc: v0.3.0 (No change)
- biomarker: v0.3.0 (TMB updated with more config options)
- cnv_sv: v0.3.1 (No change)