This repository has been archived by the owner on May 5, 2021. It is now read-only.

Gene Disease Associations

Gautier Koscielny edited this page Jul 2, 2018 · 7 revisions

Open Targets gene-disease associations per datatype and per datasource

gene_disease_associations.csv.gz

Size: 2,405,594 rows x 33 columns

This file is a dump of the Open Targets database. It summarises all gene-disease associations per type of evidence (genetics, somatic mutations, transcriptomic studies, clinical trials, affected pathways, disease-relevant animal models, text mining) and per source of evidence (Expression Atlas, UniProt, GWAS Catalog, PheWAS Catalog, EVA, UniProt literature, Genomics England, Gene2Phenotype, Reactome, SlapEnrich, PROGENy, Phenodigm, Cancer Gene Census, EVA somatic, UniProt somatic, IntOGen, ChEMBL, Europe PMC).
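As a minimal sketch of how the file could be read, the following streams it row by row with Python's standard library. It assumes the download is a gzip-compressed, comma-separated file with a header row, as the file name suggests. The helper name `iter_associations` is hypothetical and not part of any Open Targets tooling.

```python
import csv
import gzip


def iter_associations(path):
    """Yield one dict per data row of the gzip-compressed CSV at `path`.

    Hypothetical helper: assumes a comma-separated file with a header row.
    All values are returned as strings, exactly as stored in the file.
    """
    # "rt" opens the gzip stream in text mode; newline="" lets the csv
    # module handle line endings itself.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)
```

Because the function is a generator, a loop such as `for row in iter_associations("gene_disease_associations.csv.gz"): ...` never holds all 2,405,594 rows in memory at once.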

| Column name | Description |
| --- | --- |
| target_indication | target-disease pair (appears as key in the file header) |
| ensembl_gene_id | Ensembl gene identifier |
| symbol | gene symbol |
| disease_id | disease identifier |
| disease_label | disease/GWAS trait/phenotype name |
| therapeutic_area | therapeutic area(s) for this disease/trait/phenotype, e.g., metabolic disease; genetic disorder |
| is_direct | whether the association is drawn from direct evidence (True) or propagated through the disease classification (False) |
| overall_score | overall association score (aggregates the other scores) |
| genetic_association | genetic association score |
| somatic_mutation | somatic mutation score (from all somatic datasources) |
| known_drug | clinical trial score based on ChEMBL evidence |
| rna_expression | mRNA differential expression score |
| affected_pathway | affected pathways score (combines Reactome and SlapEnrich) |
| animal_model | animal model score based on Phenodigm |
| literature | Europe PMC score |
| expression_atlas | Expression Atlas association score |
| uniprot | UniProt genetic score |
| gwas_catalog | GWAS Catalog genetic score |
| phewas_catalog | PheWAS Catalog genetic score |
| eva | EVA (ClinVar) genetic score |
| uniprot_literature | UniProt literature-curated genetic score |
| genomics_england | Genomics England PanelApp genetic score |
| gene2phenotype | Gene2Phenotype genetic score |
| reactome | Reactome affected pathways score |
| slapenrich | SlapEnrich cancer affected pathways score |
| progeny | PROGENy signaling pathways score |
| phenodigm | Phenodigm (animal model) score |
| cancer_gene_census | Cancer Gene Census score |
| eva_somatic | EVA (ClinVar) somatic mutations score |
| uniprot_somatic | UniProt somatic mutations score |
| intogen | IntOGen cancer driver gene score |
| chembl | ChEMBL clinical trial score |
| europepmc | Europe PMC literature score |
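
The column descriptions imply a grouping of the 18 datasource columns under the 7 datatype columns. The sketch below encodes that grouping as read from the descriptions and uses it to list which sources carry a non-zero score in a given row. The names `SOURCES_BY_TYPE` and `contributing_sources` are hypothetical. Placing `progeny` under `affected_pathway` is an assumption, since the description of that datatype names only Reactome and SlapEnrich.

```python
# Grouping of datasource columns by datatype column, as read from the
# column descriptions. Placing "progeny" under "affected_pathway" is an
# assumption; the description names only Reactome and SlapEnrich there.
SOURCES_BY_TYPE = {
    "genetic_association": [
        "uniprot", "gwas_catalog", "phewas_catalog", "eva",
        "uniprot_literature", "genomics_england", "gene2phenotype",
    ],
    "somatic_mutation": [
        "cancer_gene_census", "eva_somatic", "uniprot_somatic", "intogen",
    ],
    "known_drug": ["chembl"],
    "rna_expression": ["expression_atlas"],
    "affected_pathway": ["reactome", "slapenrich", "progeny"],
    "animal_model": ["phenodigm"],
    "literature": ["europepmc"],
}


def contributing_sources(row):
    """Return {datatype: [datasources with a non-zero score]} for one row.

    `row` maps column names to score strings, e.g. a csv.DictReader row.
    Missing or empty values are treated as a score of zero.
    """
    result = {}
    for datatype, sources in SOURCES_BY_TYPE.items():
        active = [s for s in sources if float(row.get(s) or 0) > 0]
        if active:
            result[datatype] = active
    return result
```

For a row whose only non-zero datasource score is `chembl`, the function returns `{"known_drug": ["chembl"]}`.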

First 10 lines of the decompressed file (header plus 9 rows):

head -10 gene_disease_associations.csv
key,ensembl_gene_id,symbol,disease_id,disease_label,therapeutic_area,is_direct,overall_score,genetic_association,somatic_mutation,known_drug,rna_expression,affected_pathway,animal_model,literature,expression_atlas,uniprot,gwas_catalog,phewas_catalog,eva,uniprot_literature,genomics_england,gene2phenotype,reactome,slapenrich,progeny,phenodigm,cancer_gene_census,eva_somatic,uniprot_somatic,intogen,chembl,europepmc
ENSG00000065485-Orphanet_3389,ENSG00000065485,PDIA5,Orphanet_3389,Tuberculosis,infectious disease,True,5.7590800000000004e-05,0.0,0.0,0.0,5.7590800000000004e-05,0.0,0.0,0.0,5.7590800000000004e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ENSG00000145431-HP_0001392,ENSG00000145431,PDGFC,HP_0001392,Abnormality of the liver,phenotype,False,0.03802784199546485,0.0,0.0,0.0,0.0,0.0,0.0,0.03802784199546485,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03802784199546485
ENSG00000113231-EFO_0000546,ENSG00000113231,PDE8B,EFO_0000546,injury,other disease,True,0.2,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0
ENSG00000163114-Orphanet_98497,ENSG00000163114,PDHA2,Orphanet_98497,Genetic peripheral neuropathy,genetic disorder,False,0.03431711111111112,0.0,0.0,0.0,0.0,0.0,0.0,0.03431711111111112,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03431711111111112
ENSG00000160613-HP_0000707,ENSG00000160613,PCSK7,HP_0000707,abnormality of the nervous system,phenotype,False,0.028066930555555553,0.0,0.0,0.0,0.0,0.0,0.0,0.028066930555555553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028066930555555553
ENSG00000154678-EFO_0002970,ENSG00000154678,PDE1C,EFO_0002970,muscular disease,skeletal system disease,False,0.125,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0
ENSG00000160191-EFO_0000712,ENSG00000160191,PDE9A,EFO_0000712,stroke,cardiovascular disease; other disease,True,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
ENSG00000154678-EFO_0004247,ENSG00000154678,PDE1C,EFO_0004247,mood disorder,nervous system disease; other disease,False,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0.0
ENSG00000163114-Orphanet_68367,ENSG00000163114,PDHA2,Orphanet_68367,Inborn errors of metabolism,metabolic disease; genetic disorder,False,0.03431711111111112,0.0,0.0,0.0,0.0,0.0,0.0,0.03431711111111112,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03431711111111112
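
Every value in the file is stored as text, so some conversion is usually needed before analysis. The sketch below turns a few fields of one row into typed values. It rests on assumptions drawn from the sample rows only: that `key` is the Ensembl gene ID and the disease ID joined by a hyphen, that multiple therapeutic areas are separated by a semicolon, and that `is_direct` is the literal string `True` or `False`. The function name `parse_row` is hypothetical.

```python
def parse_row(row):
    """Convert selected string fields of one row into typed Python values.

    Assumes the conventions seen in the sample rows (see the text above).
    """
    # Ensembl gene IDs contain no hyphen, so split on the first one only.
    gene_id, _, disease_id = row["key"].partition("-")
    return {
        "gene_id": gene_id,
        "disease_id": disease_id,
        "therapeutic_areas": [
            area.strip() for area in row["therapeutic_area"].split(";")
        ],
        "is_direct": row["is_direct"] == "True",
        "overall_score": float(row["overall_score"]),
    }


# Fields copied from the PDE9A / stroke row in the sample output.
sample = {
    "key": "ENSG00000160191-EFO_0000712",
    "therapeutic_area": "cardiovascular disease; other disease",
    "is_direct": "True",
    "overall_score": "1.0",
}
parsed = parse_row(sample)
```

Here `parsed["therapeutic_areas"]` holds two entries, `cardiovascular disease` and `other disease`, and `parsed["is_direct"]` is a Python boolean rather than a string.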