---
layout: single_section
title: "Phase 3 structural variant dataset"
permalink: /phase-3-structural-variant-dataset/
tags: Phase 3 structural variant dataset
---

# The phase 3 structural variant dataset

The 1000 Genomes Project SV group produced an expanded dataset of structural variation for the individuals in phase 3 of the 1000 Genomes Project.

The VCF files for the SV dataset in GRCh37 coordinates can be found in ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/. This directory contains a README which explains the contents of the VCF files and supporting information, and provides a complete list of the differences between the 1000 Genomes Project Consortium Phase 3 paper and the Structural Variation Consortium Companion paper.
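As a rough illustration of working with these files, the sketch below parses a single VCF data line and pulls out the variant type and end coordinate. It assumes the records follow the general VCF convention of carrying `SVTYPE` and `END` in the INFO column (both are reserved INFO keys in the VCF specification); the README in the FTP directory remains the authority on which fields the files actually contain. The `SVRecord` type, the helper names, and the example record are hypothetical and exist only for this illustration.

```python
from typing import Dict, NamedTuple, Optional


class SVRecord(NamedTuple):
    chrom: str
    start: int               # POS column (1-based)
    end: Optional[int]       # INFO/END, if present
    svtype: Optional[str]    # INFO/SVTYPE, if present


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO string into a dict; flag-only keys map to ''."""
    fields: Dict[str, str] = {}
    if info == ".":
        return fields
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def parse_sv_line(line: str) -> Optional[SVRecord]:
    """Parse one VCF data line; return None for header or blank lines."""
    if line.startswith("#") or not line.strip():
        return None
    cols = line.rstrip("\n").split("\t")
    info = parse_info(cols[7])  # INFO is the eighth VCF column
    end = int(info["END"]) if "END" in info else None
    return SVRecord(cols[0], int(cols[1]), end, info.get("SVTYPE"))


# Hypothetical record for illustration only; not taken from the dataset.
example = "1\t10000\tsv_example\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=12000"
record = parse_sv_line(example)
print(record)
```

If the files are gzip-compressed, the lines could be read with `gzip.open(path, "rt")` from the Python standard library and passed to `parse_sv_line` one at a time.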

The 1000 Genomes structural variation dataset was built and validated using several raw datasets, listed in the table below.

{: .table .table-striped}
| Type | Archive accession | Data on the FTP site |
|---|---|---|
| Short-read Illumina WGS sequencing | * | phase3/20130502.phase3.analysis.sequence.index |
| Complete Genomics WGS sequencing | * | phase3/20130725.phase3.cg_sra.index |
| PCR-free Illumina WGS sequencing | SRP047053 | release/20130502/../high_.._alignments/20141118_high_coverage.alignment.index |
| Moleculo WGS NA12878 | | phase3/integrated_sv_map/supporting/NA12878/moleculo |
| PacBio sequencing NA12878 | SRX638310 | phase3/integrated_sv_map/supporting/NA12878/pacbio |
| PacBio sequencing CHM1 | SRX533609 | phase3/integrated_sv_map/supporting/CHM1 |
| Agilent 1M aCGH microarray | GSE70188 | phase3/integrated_sv_map/supporting/acgh/ |
| Illumina Omni2.5 microarray | | release/20130502/supporting/hd_genotype_chip/ |
| Affymetrix SNP Array 6.0 | | release/20130502/supporting/hd_genotype_chip/coriell_affy6_intensities/ |
| Targeted PacBio sequencing | ERS661321, ERS661355, ERS661356, ERS661358+ | |
| Targeted MinION sequencing | ERS661358, ERS661406+ | |
- Rows marked with '*' represent many different archive runs, experiments and accessions. The full set of archive accessions can be found in the index files that the FTP links point to.
- Rows marked with '+' were all submitted together under the same study, ERP009552.
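The FTP locations in the table are written as relative paths. A minimal sketch of turning them into full URLs follows, assuming they are relative to `ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/`, which is the prefix shared by the FTP links given elsewhere on this page. The `ftp_url` helper is hypothetical. The path in the PCR-free Illumina row contains elided segments and cannot be resolved this way.

```python
# Assumed FTP root, inferred from the full URLs quoted on this page.
FTP_ROOT = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/"


def ftp_url(relative_path: str, root: str = FTP_ROOT) -> str:
    """Join a relative path from the table onto the FTP root."""
    return root.rstrip("/") + "/" + relative_path.lstrip("/")


# Two paths copied from the table above.
paths = [
    "phase3/20130502.phase3.analysis.sequence.index",
    "phase3/integrated_sv_map/supporting/acgh/",
]
urls = [ftp_url(p) for p in paths]
for url in urls:
    print(url)
```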

The phase 3 structural variants are also available in GRCh38 coordinates in the FTP directory ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/supporting/GRCh38_positions/.