How to download open data & update annotation files

Damianos P. Melidis edited this page Nov 22, 2021 · 1 revision

How to build annotation files

GenOtoScope (genotoscope_classify.py) needs several open data sets and annotation files.

The annotation files are built by GenOtoScope utility functions.

Below, we list the open data sets and annotation files and indicate how to update them when newer releases become available.

Open data

  • ClinVar --> ClinVar

  • gnomAD --> gnomAD

  • UniProt --> UniProt

  • Inheritance modes for hearing loss (HL) genes:

    Currently, the genes curated in DOI:10.1038/s41436-019-0487-0 are used.

    File path: /path2genotoscope_data/genotoscope_data/gene_inheritance_patterns_clingen_mhh.xlsx

Annotation files

  • HGNC --> HGNC

  • HL variants with high allele frequency (AF):

    If more such variants are approved by ClinGen, update the files:

    /path2genotoscope_data/genotoscope_data/hl_variants.vcf.gz

    /path2genotoscope_data/hl_variants.vcf.gz.tbi

  • PM1 variants: If more hotspot variants are approved for congenital hearing loss by ClinGen, please update the file:

    /path2genotoscope_data/annotation_beds/hearing_loss/pm1_regions_hl_hg19.bed

  • Critical regions of proteins --> Critical regions

  • Clinical significant exons --> Clinical significant exons

  • HL-relevant transcripts and exons --> HL-relevant transcripts
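
After a manual update of the HL variants VCF or the PM1 regions BED file, a quick sanity check can catch two common slips: a tabix index (.tbi) that was not regenerated after the VCF changed, and a malformed BED line. The following is a minimal sketch using only the Python standard library. The function names are hypothetical and are not part of GenOtoScope. It assumes plain tab-separated BED with 0-based start and an end greater than start.

```python
from pathlib import Path


def index_is_stale(vcf_path, index_path):
    """Return True if the tabix index is missing or older than the VCF."""
    vcf, index = Path(vcf_path), Path(index_path)
    if not index.exists():
        return True
    # An index written before the last VCF modification no longer matches it.
    return index.stat().st_mtime < vcf.stat().st_mtime


def check_bed_line(line):
    """Parse one BED line into (chrom, start, end); raise ValueError if malformed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 tab-separated fields: {line!r}")
    chrom = fields[0]
    # int() itself raises ValueError for non-numeric coordinates.
    start, end = int(fields[1]), int(fields[2])
    if start < 0 or end <= start:
        raise ValueError(f"invalid interval {start}-{end}: {line!r}")
    return chrom, start, end


def check_bed_file(bed_path):
    """Validate every data line of a BED file and return the number of regions."""
    count = 0
    with open(bed_path) as handle:
        for line in handle:
            # Skip blank lines, comments, and UCSC header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            check_bed_line(line)
            count += 1
    return count
```

For example, calling index_is_stale with the hl_variants.vcf.gz and hl_variants.vcf.gz.tbi paths listed above reports whether the index needs to be rebuilt (commonly done with `tabix -p vcf` from HTSlib), and check_bed_file on pm1_regions_hl_hg19.bed raises ValueError at the first malformed region.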