gnomAD data
-
If it does not already exist, create the gnomAD subfolder in genotoscope_data:
cd path2analysis_root/genotoscope_data
mkdir -p gnomAD
-
Create a subfolder for the version in use (GenOtoScope currently uses v2.1 for GRCh37/hg19):
cd gnomAD
mkdir gnomAD_v2.1
-
Download the exome sites VCF and its tabix index from gnomAD downloads:
cd gnomAD_v2.1
wget https://storage.googleapis.com/gcp-public-data--gnomad/release/2.1.1/vcf/exomes/gnomad.exomes.r2.1.1.sites.vcf.bgz
wget https://storage.googleapis.com/gcp-public-data--gnomad/release/2.1.1/vcf/exomes/gnomad.exomes.r2.1.1.sites.vcf.bgz.tbi
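Because the exome VCF is a large download, it may be worth checking that the file arrived intact before going further. The following is a minimal sketch, not part of GenOtoScope: `check_download` is a hypothetical helper name, and it assumes `gzip` is installed. It relies on the fact that a `.bgz` (BGZF) file is a sequence of gzip blocks, so `gzip -t` can test it.

```shell
# check_download: hypothetical helper, not part of GenOtoScope.
# Returns 0 if the file exists, is non-empty and passes a gzip integrity test.
check_download() {
    file="$1"
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    # Read via stdin so that the unusual .bgz suffix does not matter to gzip.
    if ! gzip -t < "$file" 2>/dev/null; then
        echo "corrupt or truncated: $file" >&2
        return 1
    fi
    return 0
}

# Example call (run inside gnomAD_v2.1 after the download has finished):
# check_download gnomad.exomes.r2.1.1.sites.vcf.bgz
```

If the check fails because the transfer was interrupted, re-running the same `wget` command with the `-c` option should resume the partial download instead of starting over.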
-
Rename the .bgz extension to .gz (so that the PyVCF library can read the file without errors):
mv gnomad.exomes.r2.1.1.sites.vcf.bgz gnomad.exomes.r2.1.1.sites.vcf.gz
mv gnomad.exomes.r2.1.1.sites.vcf.bgz.tbi gnomad.exomes.r2.1.1.sites.vcf.gz.tbi
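The rename step can also be written as a small loop, so that the VCF and its index always end up with matching names. This is only a sketch under assumptions: `rename_bgz_to_gz` is a hypothetical helper name, and it renames every `*.vcf.bgz` and `*.vcf.bgz.tbi` file in the given directory.

```shell
# rename_bgz_to_gz: hypothetical helper, not part of GenOtoScope.
# Renames *.vcf.bgz -> *.vcf.gz and *.vcf.bgz.tbi -> *.vcf.gz.tbi in one directory.
rename_bgz_to_gz() {
    dir="$1"
    for f in "$dir"/*.vcf.bgz "$dir"/*.vcf.bgz.tbi; do
        # Skip patterns that matched nothing (the glob stays literal in that case).
        [ -e "$f" ] || continue
        case "$f" in
            *.vcf.bgz.tbi) new="${f%.vcf.bgz.tbi}.vcf.gz.tbi" ;;
            *)             new="${f%.vcf.bgz}.vcf.gz" ;;
        esac
        mv -- "$f" "$new"
    done
}

# Example call (run inside gnomAD_v2.1):
# rename_bgz_to_gz .
```

Running the helper a second time is harmless, because no `.bgz` files remain to be renamed.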
-
Navigate to the gnomAD downloads page
-
Select the gnomAD version for GRCh37
-
Change into the genotoscope_data subfolder for the respective gnomAD version:
cd /path2genotoscope_data/gnomAD/gnomAD_v2.1
-
Download all pLoF variants:
wget https://storage.googleapis.com/gcp-public-data--gnomad/papers/2019-flagship-lof/v1.0/gnomad.v2.1.1.all_lofs.txt.bgz
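Before using the pLoF table, a quick look at its first lines can confirm that the download is readable. A minimal sketch, assuming `gzip` and `head` are available; `peek_bgz` is a hypothetical helper name and not part of GenOtoScope.

```shell
# peek_bgz: hypothetical helper, not part of GenOtoScope.
# Prints the first N decompressed lines (default 5) of a gzip/BGZF-compressed file.
peek_bgz() {
    gzip -dc < "$1" | head -n "${2:-5}"
}

# Example call (run inside gnomAD_v2.1):
# peek_bgz gnomad.v2.1.1.all_lofs.txt.bgz 3
```

Reading through stdin keeps the original `.bgz` file name untouched, so no rename is needed just to inspect the file.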