Pipeline for variant benchmarking of Chinese Quartet.

This pipeline benchmarks variant calls against the variant benchmarks of the Chinese Quartet genomes. It can be used to evaluate the performance of different sequencing technologies, variant calling algorithms, and pipelines.

Quick start

Run with snakemake

1. Prepare your environment

The following software/packages must be installed in the same environment:

  • python
  • snakemake
  • pysam
  • numpy
  • pandas

You can use conda to install all of these packages, for example:

 conda install package-name 

You also need to install the following software:

  • bedtools
  • bcftools
  • tabix
  • bgzip
  • hap.py (for small variant benchmarking)
  • truvari (for structural variant benchmarking)
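
Before running the pipeline, it can help to confirm that the command-line tools listed above are visible on your PATH. The following is a minimal sketch, not part of the pipeline itself. It assumes each tool is installed under the executable name shown in the list (for example `hap.py`), and the helper name `missing_tools` is hypothetical.

```python
import shutil

# Command-line tools named in the lists above. The executable names are an
# assumption: adjust them if your installation uses different names.
REQUIRED_TOOLS = ["snakemake", "bedtools", "bcftools", "tabix", "bgzip"]
# Only needed for small variant and structural variant benchmarking.
OPTIONAL_TOOLS = ["hap.py", "truvari"]


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for label, tools in (("required", REQUIRED_TOOLS), ("optional", OPTIONAL_TOOLS)):
        missing = missing_tools(tools)
        if missing:
            print(f"{label}: not found on PATH: {', '.join(missing)}")
        else:
            print(f"{label}: all tools found")
```

This only checks that the executables can be located. It does not verify versions or the Python packages (pysam, numpy, pandas).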

2. Prepare the benchmark regions and variants

Download the latest version of the Chinese Quartet variants and benchmark regions according to the instructions.

3. Configure your task

  • Create your own config.yaml based on the template.
  • List your VCF files in a TSV (tab-separated values) file, following the template.
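
A mistyped path in the TSV file is an easy mistake to make, so a quick check before launching the pipeline may save a failed run. The sketch below is an illustration only. It does not assume the template's column names; instead it treats any field ending in `.vcf` or `.vcf.gz` as a VCF path, which is an assumption about how the TSV is filled in. The function name `find_missing_vcfs` is hypothetical.

```python
import csv
import os
import sys


def find_missing_vcfs(tsv_path):
    """Return VCF paths mentioned in the TSV that do not exist on disk."""
    missing = []
    with open(tsv_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            for field in row:
                field = field.strip()
                # Assumption: any field that looks like a VCF path is an input file.
                if field.endswith((".vcf", ".vcf.gz")) and not os.path.exists(field):
                    missing.append(field)
    return missing


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python check_vcfs.py your_samples.tsv
    for path in find_missing_vcfs(sys.argv[1]):
        print(f"missing: {path}")
```

Relative paths are resolved against the directory the script is run from, so run it from the same directory you will run snakemake in.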

4. Run the pipeline

Run the pipeline with snakemake:

snakemake -s ./Snakefile -j 40 -k --ri # on a local computer
snakemake -s ./Snakefile -j 10 -k --ri --cluster 'qsub -l nodes=1:ppn=12 -l walltime=99:00:00' >sublog 2>&1 & # on a cluster

Run with docker

Under construction!

Citation

Jia P, Dong L, Yang X, Wang B, Wang T, Lin J, Wang S, Zhao X, Xu T, Che Y, et al: Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet. bioRxiv 2022:2022.09.08.504083.

Contact