
Detect low coverage regions

Andrea Telatin edited this page Aug 18, 2020 · 2 revisions

Producing a BED file of uncovered regions

A common task in target enrichment experiments is to find out which regions have not been covered. Using covtobed and bedtools we can produce BED files of the uncovered regions and compare them across samples.

  • Extract uncovered regions from a single BAM file:

Here we detect all the regions with coverage between 0 (inclusive) and 1 (exclusive), i.e. regions with no coverage at all, in a sample alignment, and merge adjacent intervals with bedtools, if needed:

covtobed -m 0 -x 1 sample1.bam | bedtools merge > sample1.0X.bed
  • Process multiple samples

If we have multiple experiments we can use a loop to produce a 0X.bed file for each sample:

for FILE in *.bam; do
  # Strip the trailing .bam and append .0X.bed (sample1.bam -> sample1.0X.bed)
  covtobed -m 0 -x 1 "$FILE" | bedtools merge > "${FILE%.bam}.0X.bed"
done
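One way to derive the output name from the BAM file name is a suffix-stripping parameter expansion. The sketch below uses a hypothetical helper, `zero_cov_name` (not part of covtobed or bedtools), to show that only the trailing `.bam` is replaced, even when the sample name itself contains the letters "bam":

```shell
# Hypothetical helper: derive the 0X BED file name from a BAM file name.
# ${1%.bam} removes only a trailing ".bam" (POSIX parameter expansion).
zero_cov_name() {
  printf '%s\n' "${1%.bam}.0X.bed"
}

zero_cov_name sample1.bam       # sample1.0X.bed
zero_cov_name bambino_01.bam    # bambino_01.0X.bed
```

Quoting the file name in the loop also keeps names with spaces intact.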
  • What target regions were uncovered in a specific sample?

We can then intersect the 0X coverage file with the desired target (e.g. exome):

bedtools intersect -a target.bed -b sample.0X.bed > sample.0X_target.bed
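To put a number on the result, the uncovered target bases can be compared with the total target size. This is a minimal sketch using awk only; the helper names `bed_total_bp` and `uncovered_pct` are hypothetical, and it assumes tab-separated BED files whose intervals do not overlap each other (otherwise bases would be counted twice):

```shell
# Sum interval lengths (end - start) in a BED file.
# BED coordinates are 0-based and half-open, so length = $3 - $2.
bed_total_bp() {
  awk -F'\t' 'NF >= 3 && $1 !~ /^#/ { sum += $3 - $2 } END { print sum + 0 }' "$1"
}

# Percentage of target bases with zero coverage.
# $1 = target BED, $2 = uncovered target regions (e.g. sample.0X_target.bed)
uncovered_pct() {
  total=$(bed_total_bp "$1")
  missed=$(bed_total_bp "$2")
  awk -v t="$total" -v m="$missed" \
    'BEGIN { if (t > 0) printf "%.2f\n", 100 * m / t; else print "NA" }'
}

# Usage (file names as in the command above):
#   uncovered_pct target.bed sample.0X_target.bed
```

If the target file contains overlapping intervals, running it through bedtools merge first avoids inflating the total.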
  • Which regions were systematically uncovered in all the samples?

To obtain a report of the uncovered regions and see how many samples share each region (the fourth column of the output reports the number of files containing the interval):

multiIntersectBed -i *.0X_target.bed > report_0X_target.bed
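The report can then be filtered to keep only the regions missing in every sample. The sketch below assumes the default multiIntersectBed output without a header line, where column 4 holds the number of input files containing the interval; the helper name `shared_regions` is hypothetical:

```shell
# Keep intervals found in at least $2 input files.
# $1 = multiIntersectBed report, $2 = minimum number of samples
# Prints chrom, start, end and the sample count (columns 1-4).
shared_regions() {
  awk -F'\t' -v min="$2" 'BEGIN { OFS = "\t" } $4 >= min { print $1, $2, $3, $4 }' "$1"
}

# Usage, assuming 3 samples were passed to multiIntersectBed:
#   shared_regions report_0X_target.bed 3 > uncovered_in_all.bed
```

Passing a lower threshold (e.g. 2) lists regions that are uncovered in most, but not necessarily all, samples.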