hydra-genetics v0.7.0

@github-actions github-actions released this 18 Aug 06:37
103ea53

Release notes

For more details on features and bug fixes see further down.

Features

CNV Filtering

Filtering by frequency in the sample database removed true whole-chromosome deletions. Frequency filtering is therefore now applied only to segments smaller than 10 Mb.
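
The rule above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual filter: the function name `keep_segment` and the `max_frequency` default are hypothetical, and only the 10 Mb limit comes from these notes.

```python
# Segments at or above this length skip the frequency filter (10 Mb, from the notes).
MAX_FREQ_FILTER_LENGTH = 10_000_000


def keep_segment(start: int, end: int, db_frequency: float,
                 max_frequency: float = 0.15) -> bool:
    """Return True if the CNV segment should be kept.

    max_frequency is a hypothetical cutoff chosen for illustration only.
    """
    length = end - start
    if length >= MAX_FREQ_FILTER_LENGTH:
        # Large events (e.g. whole-chromosome deletions) bypass frequency filtering.
        return True
    # Smaller segments are kept only if they are rare enough in the sample database.
    return db_frequency <= max_frequency
```

With this sketch, a 50 Mb segment is kept regardless of how common it is in the database, while a 1 Mb segment is still subject to the frequency cutoff.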

CNV Reporting

The CNV html report has been moved out of the pipeline and into a new reporting module in Hydra-genetics. The report itself is unchanged apart from minor improvements.

CNV PureCN

PureCN now produces reliable results. However, it does not handle samples with low tumor content (TC) or samples with few copy number aberrations. In both cases the TC is generally underestimated, with estimates in the range of 15-32%. Therefore a new combined CNV html and tsv report is now generated that uses the PureCN TC if it is above 35% and the pathology-estimated TC otherwise. These files have the tag pathology_purecn in the results. The pathology-only and PureCN-only results are placed in additional files under results.
PureCN uses a new filter (snv_hard_filter_purecn) to get its vcf input file.
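
The TC selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's own code: the function name `select_tc` is hypothetical, TC is assumed to be expressed as a fraction, and falling back to the pathology estimate when no PureCN value is available is an assumption.

```python
from typing import Optional, Tuple

PURECN_TC_CUTOFF = 0.35  # 35 %, as stated in the notes


def select_tc(purecn_tc: Optional[float], pathology_tc: float) -> Tuple[float, str]:
    """Return (tumor content, source) for the combined pathology_purecn report."""
    # Treating a missing PureCN estimate as "use pathology" is an assumption.
    if purecn_tc is not None and purecn_tc > PURECN_TC_CUTOFF:
        return purecn_tc, "purecn"
    return pathology_tc, "pathology"
```

For example, `select_tc(0.60, 0.40)` would pick the PureCN estimate, while `select_tc(0.25, 0.40)` would fall back to the pathology estimate.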

RNA MultiQC

The bam-files produced by the STAR aligner are now duplicate marked by Picard. The duplicate-marked files are used only for QC, and the duplication rate is reported in the RNA MultiQC report.

Changes in config.yaml

Changed:

  • output: "config/output_list.json" => output: "config/output_files.yaml" #New output file format
  • cnv_html_report:
    show_table: true
    template_dir: config/cnv_report_template
  • design_intervals_rna: "/projects/wp1/nobackup/ngs/utveckling/Twist_RNA_DATA/bed/Twist_RNA_Design5.annotated.20230630.interval_list" #New file for duplication QC
  • report_fusions: #Corrected spelling of fusioncatcher, only changes shown
    fusioncatcher_flag_low_support: 15
    fusioncatcher_low_support: 3
    fusioncatcher_low_support_fp_genes: 20
    fusioncatcher_low_support_inframe: 6

Added:

  • merge_cnv_json:
    annotations:
    - /references/cnv_amp_genes.bed
    - /references/cnv_loh_genes.bed
    filtered_cnv_vcfs:
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_amp_genes.filter.cnv_hard_filter_amp.vcf.gz
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_loh_genes_all.filter.cnv_hard_filter_loh.vcf.gz
    unfiltered_cnv_vcfs:
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_amp_genes.vcf.gz
    - cnv_sv/svdb_query/{sample}_{type}.{tc_method}.svdb_query.annotate_cnv.cnv_loh_genes_all.vcf.gz
    germline_vcf: snv_indels/bcbio_variation_recall_ensemble/{sample}_{type}.ensembled.vep_annotated.filter.germline.exclude.blacklist.vcf.gz
  • filter_vcf:
    snv_hard_filter_purecn: "config/config_hard_filter_purecn.yaml" #New filter for purecn
  • svdb_merge:
      tc_method:
        - name: pathology_purecn #new combined tag
          cnv_caller:
            - cnvkit
            - gatk
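
When updating an existing config.yaml, a small check like the one below may help spot entries that were missed. It is a sketch only: the function name `check_config_v070` is hypothetical, the config is assumed to be already loaded into a plain dict, and `svdb_merge.tc_method` is assumed to be a list of mappings with a `name` key.

```python
def check_config_v070(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    # The output specification moved from JSON to YAML in this release.
    if str(config.get("output", "")).endswith(".json"):
        problems.append("output still points to a JSON file")

    # New top-level section listed under "Added".
    if "merge_cnv_json" not in config:
        problems.append("missing merge_cnv_json")

    # New hard filter used to build the PureCN VCF input.
    if "snv_hard_filter_purecn" not in config.get("filter_vcf", {}):
        problems.append("missing filter_vcf.snv_hard_filter_purecn")

    # New combined TC method tag (list-of-mappings layout is an assumption).
    tc_methods = config.get("svdb_merge", {}).get("tc_method", [])
    names = {m.get("name") for m in tc_methods if isinstance(m, dict)}
    if "pathology_purecn" not in names:
        problems.append("svdb_merge.tc_method lacks pathology_purecn")

    return problems
```

Running it on a config that still has `output: "config/output_list.json"` and none of the added sections would report all four problems. It only checks that the keys are present, not that the referenced files exist.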

Hydra modules with releases

  • prealignment: v1.0.0 (No change)
  • alignment: v0.3.1 (No change)
  • snv_indels: v0.3.0 (No change)
  • annotation: v0.3.0 (No change)
  • filtering: v0.1.0 (No change)
  • qc: v0.3.0 (No change)
  • biomarker: v0.3.1 (No change)
  • cnv_sv: v0.3.1 (No change)
  • reports: v0.1.0 (New module, CNV html report moved here)

Features

  • add result file for combined purecn and pathology (fb17cc2)
  • added duplication % to multiQC (2923ebf)
  • added picard mark duplicates of bam-files for QC (0f33656)
  • added read group function for STAR (c224c17)
  • added RG to STAR and changed bam file for QC (1f68585)
  • added rule for modifying MBQ in vcf (72dc309)
  • added rule for modifying MBQ in vcf (6c26626)
  • added rule for modifying MBQ in vcf (188f418)
  • change pureCN cutoff to 0.35 (4e06643)
  • choose purecn if tc > 30% and pathology otherwise (20292ab)
  • harder filtering (7cae319)
  • make two tsv reports using different gene lists (0737c60)
  • test_input_all.tsv for v0.7.0 (27df350)
  • test_input_VAL2022.tsv for v0.7.0 (26f5157)
  • use filtered vcf with both germline and somatic variants (f6c5cc3)
  • use gatk2 for purecn (ee2e2bc)
  • use germline vcf for purecn (eb0eaf5)
  • use purity file directly from purecn to also get ploidity (b397a3c)
  • use vaf and snv filtered vcf with both germline and somatic variants (48f3563)

Bug Fixes

  • add germline flag to vcf (1e8de1b)
  • add missing filter tag (7c01e2d)
  • annotate using missing sites instead (de172b0)
  • bug fixes (224a6a4)
  • change checkpoint to rule (5e04c9a)
  • change path to new normals (7a3b420)
  • correct header in cnv report file (c947152)
  • correct output name for purecn reference (92af141)
  • correct rule import from wrong module (275a60d)
  • delegate schema validation to reports module (b5dada0)
  • do not filter large cnvs based on frequency in database (4b86637)
  • get correct tc to html report (c394cba)
  • handle empty purecn file (df89fe5)
  • import spelling mistake (f3e7248)
  • moved result file to additional files (c6c7574)
  • properly overrule the get_tc function (701ceb8)
  • purecn_modify_vcf bugfix (e910e18)
  • redefine rule to use new params in config (5ef5dac)
  • return correct tc (45485c7)
  • solve different wildcards in rule error (62082a1)
  • spelling error of Exception (8d6fbf9)
  • tabix of annotation database (1f70635)
  • use correct genome (66e3c8b)
  • use correct get_tc (ba5272e)
  • use correct interval file (92c0e41)
  • use Illumina for platform (130f3a2)

Documentation

  • add readthedocs link to readme (7c61ef3)
  • update CNV HTML report documentation (3478612)
  • update mention of output spec in docs (60b9800)
  • update readme (8673794)
  • updated cnv documentation (a503ddd)
  • updated cnv documentation (ef6786a)