How to download open data & update annotation files


Damianos P. Melidis edited this page Nov 22, 2021 · 1 revision

How to build annotation files

GenOtoScope (genotoscope_classify.py) needs several open data sets and annotation files.

The annotation files are built by GenOtoScope utility functions.

In the following, we list the open data and annotation files, indicating how to update them as improved open data sets are provided.

Open data

ClinVar --> ClinVar
gnomAD --> gnomAD
UniProt --> UniProt
Inheritance modes for HL genes:

Currently, GenOtoScope uses the curated gene list from DOI:10.1038/s41436-019-0487-0

File path: /path2genotoscope_data/genotoscope_data/gene_inheritance_patterns_clingen_mhh.xlsx

Annotation files

HGNC --> HGNC
HL variants with high AF:

If more such variants are approved by ClinGen, update the files:

/path2genotoscope_data/genotoscope_data/hl_variants.vcf.gz

/path2genotoscope_data/hl_variants.vcf.gz.tbi
PM1 variants: If more hotspot variants are approved for congenital hearing loss by ClinGen, please update the file:

/path2genotoscope_data/annotation_beds/hearing_loss/pm1_regions_hl_hg19.bed
Critical regions of proteins --> Critical regions
Clinical significant exons --> Clinical significant exons
HL-relevant transcripts and exons --> HL-relevant transcripts
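
After hand-editing the HL variants VCF or the PM1 BED file listed above, a quick sanity check can catch two common slips: a .tbi index that was not regenerated after the compressed VCF changed, and a malformed BED line. The sketch below is not part of GenOtoScope. The function names index_is_stale and check_bed_line are hypothetical, and it assumes the .tbi file is a standard tabix index (typically rebuilt with htslib's "tabix -p vcf" on the bgzip-compressed file) and that the BED file follows the usual tab-separated chrom, start, end layout with 0-based starts.

```python
import os


def index_is_stale(vcf_gz_path, index_path):
    """Return True if the index is missing or older than the compressed VCF.

    Both paths are passed explicitly, so no assumption is made about where
    the .tbi file lives relative to the .vcf.gz file.
    """
    if not os.path.exists(index_path):
        return True
    return os.path.getmtime(index_path) < os.path.getmtime(vcf_gz_path)


def check_bed_line(line):
    """Parse one BED line into (chrom, start, end).

    Raises ValueError if the line has fewer than three tab-separated
    fields, non-integer coordinates, a negative start, or end < start.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 tab-separated fields: {line!r}")
    chrom = fields[0]
    # int() itself raises ValueError on non-numeric coordinates.
    start, end = int(fields[1]), int(fields[2])
    if start < 0 or end < start:
        raise ValueError(f"invalid interval {start}-{end} in line: {line!r}")
    return chrom, start, end


def check_bed_file(bed_path):
    """Validate every non-empty, non-comment line of a BED file.

    Returns the number of regions that passed the check.
    """
    count = 0
    with open(bed_path, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            check_bed_line(line)
            count += 1
    return count
```

A typical use would be to call index_is_stale with the two hl_variants paths and rebuild the index if it returns True, then call check_bed_file on pm1_regions_hl_hg19.bed before running genotoscope_classify.py. The check compares file modification times only, so it cannot detect an index built from different content with a newer timestamp.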
