bisulfiteBlast

These scripts make it possible to use BLAST with bisulfite-converted sequences (i.e. sequences in which most Cs have been converted to Ts). To this end, a BLAST database is created in which all Cs have been replaced by Ts, and all Cs in the query sequences are replaced by Ts as well. This turned out to work better than using an ambiguity code. The creation of this "converted" BLAST database is described in createDB. Alternatively, you can use the precreated blastDB [25GB].
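The C-to-T replacement described above can be illustrated with a minimal sketch. This is not the code used in createDB; the function names `convert_fasta_line` and `convert_fasta` are hypothetical, and the sketch assumes plain FASTA input in which header lines start with `>` and must stay unchanged.

```python
# Translation table: replace C with T, preserving case (assumption: lowercase
# bases should be converted the same way as uppercase ones).
_C_TO_T = str.maketrans({"C": "T", "c": "t"})


def convert_fasta_line(line: str) -> str:
    """Return one FASTA line with all Cs replaced by Ts.

    Header lines (starting with '>') are returned unchanged so that
    sequence identifiers are not altered.
    """
    if line.startswith(">"):
        return line
    return line.translate(_C_TO_T)


def convert_fasta(lines):
    """Apply the in-silico conversion to an iterable of FASTA lines."""
    return [convert_fasta_line(line) for line in lines]


if __name__ == "__main__":
    example = [">read1 Canis lupus", "ACGTCCGA", "ccgtta"]
    for converted in convert_fasta(example):
        print(converted)
```

Applying the same function to both the database sequences and the query reads keeps the two sides comparable, which is the point of converting both rather than only one of them.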

The intended application is to confirm the species annotation of samples by identifying the "best hits" in the BLAST nucleotide database for a number of randomly sampled reads from reduced representation bisulfite sequencing (RRBS) data. It is intended to work well with the RefFreeDMA pipeline.
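Drawing a fixed number of random reads from a large file can be sketched with reservoir sampling, which needs only one pass and does not load all reads into memory. This is an illustration under assumptions, not the sampling method used by these scripts; `sample_reads` and its parameters are hypothetical names.

```python
import random


def sample_reads(reads, k, seed=0):
    """Return up to k reads chosen uniformly at random from an iterable.

    Uses reservoir sampling (Algorithm R): the first k reads fill the
    reservoir, and each later read replaces a random slot with
    probability k / (number of reads seen so far).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    reservoir = []
    for i, read in enumerate(reads):
        if i < k:
            reservoir.append(read)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = read
    return reservoir


if __name__ == "__main__":
    all_reads = [f"read{i}" for i in range(1000)]
    subset = sample_reads(all_reads, k=5)
    print(subset)
```

The sampled reads would then be converted and used as BLAST queries against the converted database.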

Requirements:

  1. R/Rscript with the packages data.table and ggplot2
  2. blast
  3. samtools
  4. BLASTDB environment variable: export BLASTDB=$BLASTDB:"<your path>/blastDB/nt_conv":"<your path>/blastDB/taxdb"