breseq-ext-cnv

breseq-ext-cnv is a copy number variation (CNV) extension for breseq. It takes the tab-delimited (.tab) coverage table produced by breseq BAM2COV, corrects the read-count biases introduced by differences in sequencing methods, and predicts copy number variations across the genome.

Installation:

Recommended: create a conda Python environment. Quote the version specifier so the shell does not treat ">" as a redirection.

conda create -n <env-name> "python>=3.9"
conda activate <env-name>

Install breseq-ext-cnv

pip install git+https://github.com/barricklab/breseq-ext-cnv.git

Run:

Run BAM2COV on the breseq output to get the coverage table (-t is the short form of --table; --resolution 0 requests single-base resolution):

breseq bam2cov -t --resolution 0 --region <reference:START-END> --output <filename>

With the coverage table as input, determine regions of copy number variation using:

breseq-ext-cnv -i <input file> [-o <output folder location>] [-w <window>] [-s <step size>]
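
For scripted runs, the invocation above can be assembled programmatically. The following is a minimal Python sketch and is not part of breseq-ext-cnv itself: the helper name build_cnv_command and the file name coverage.tab are hypothetical, and only the -i, -o, -w, and -s flags shown above are used.

```python
import shlex
from typing import List, Optional


def build_cnv_command(
    input_file: str,
    output_dir: Optional[str] = None,
    window: Optional[int] = None,
    step: Optional[int] = None,
) -> List[str]:
    """Assemble a breseq-ext-cnv argument list (hypothetical helper)."""
    # The step size is documented as being no larger than the window size.
    if window is not None and step is not None and step > window:
        raise ValueError("step size must be <= window size")
    cmd = ["breseq-ext-cnv", "-i", input_file]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if window is not None:
        cmd += ["-w", str(window)]
    if step is not None:
        cmd += ["-s", str(step)]
    return cmd


# "coverage.tab" is a placeholder for a bam2cov coverage table.
command = build_cnv_command("coverage.tab", window=500, step=250)
print(shlex.join(command))
```

Once breseq-ext-cnv is installed, the returned list can be passed to subprocess.run(command, check=True).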

Run examples:

# run with the default settings
breseq-ext-cnv -i <input file>
# calculate coverage with a window size of 500 and a step size of 250, with an average sequencing fragment length of 300 bp
breseq-ext-cnv -i <input file> -w 500 -s 250 -f 300
# output copy number predictions and coverage plots for a specific genomic segment
breseq-ext-cnv -i <input file> --region 3497890-3955678 -w 1000 -s 500
# print the help message
breseq-ext-cnv -h

usage: get_CNV.py [-h] -i I [-o O] [-w W] [-s S] [-ori ORI] [-ter TER] [-f F] [-e E]

The breseq-ext-cnv is python package extension to breseq that analyzes the sequencing coverage across the genome to determine specific regions that have undergone copy number variation (CNV)

options:
  -h , --help          show this help message and exit
  -i , --input        input .tab file address from breseq bam2cov.
  -o , --output       output file location preference. Defaults to the current folder.
  -w , --window       Define window length to parse through the genome and calculate coverage and GC statistics.
  -s , --step-size    Define step size (<= window size) for each progression of the window across the genome sequence. 
                      Set = window size if non-overlapping windows.
  --region            Set regions between which to display output coverage plots.
  -ori , --origin     Genomic coordinate for origin of replication.
  -ter , --terminus   Genomic coordinate for terminus of replication.
  -f , --frag_size    Average fragment size of the sequencing reads.
  -e , --error-rate   Error rate in sequencing read coverage, taken into account to accurately determine 0 copy coverage.
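
To make the interaction between the window and step-size options concrete, here is a minimal Python sketch of a sliding-window pass that reports mean coverage and GC fraction per window. It is an illustration under stated assumptions, not the code breseq-ext-cnv runs: the function name window_stats, the 0-based half-open coordinates, and the truncation of the last windows at the sequence end are all choices made for this example.

```python
from typing import List, Tuple


def window_stats(
    coverage: List[float], sequence: str, window: int, step: int
) -> List[Tuple[int, int, float, float]]:
    """Return (start, end, mean coverage, GC fraction) for each window.

    Coordinates are 0-based and half-open. Windows that run past the
    end of the sequence are truncated (an assumption of this sketch).
    """
    if len(coverage) != len(sequence):
        raise ValueError("coverage and sequence must have the same length")
    if window <= 0 or not 0 < step <= window:
        raise ValueError("require 0 < step <= window")
    stats = []
    for start in range(0, len(sequence), step):
        end = min(start + window, len(sequence))
        cov = coverage[start:end]
        seq = sequence[start:end].upper()
        mean_cov = sum(cov) / len(cov)
        gc_fraction = (seq.count("G") + seq.count("C")) / len(seq)
        stats.append((start, end, mean_cov, gc_fraction))
    return stats


# Toy data made up for the demonstration: a GC-rich half followed by an AT-rich half.
toy_coverage = [10.0] * 4 + [20.0] * 4
toy_sequence = "GGCCATAT"
for row in window_stats(toy_coverage, toy_sequence, window=4, step=2):
    print(row)
```

Setting step equal to window yields non-overlapping windows, while a smaller step makes consecutive windows overlap by window minus step bases.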

The input is the .tab file from breseq bam2cov. To generate it, run the command below from your breseq directory, which contains the 'data' and 'output' folders (--resolution 0 requests single-base resolution):
breseq bam2cov -t --resolution 0 --region <reference:START-END> --output <filename>
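
Before running the extension, it can help to inspect the coverage table directly. The sketch below reads a tab-delimited table with a header row and sums selected coverage columns per position. The column names unique_top_cov and unique_bot_cov are assumptions about the bam2cov header and should be checked against your own file; the function name read_total_coverage and the demo table are made up for illustration.

```python
import csv
import io
from typing import Iterable, List, Sequence


def read_total_coverage(
    lines: Iterable[str],
    coverage_columns: Sequence[str] = ("unique_top_cov", "unique_bot_cov"),
) -> List[float]:
    """Sum the chosen coverage columns for every row of a tab-delimited table.

    The default column names are assumed, not confirmed; pass the names
    found in your own table's header if they differ.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    totals = []
    for row in reader:
        totals.append(sum(float(row[name]) for name in coverage_columns))
    return totals


# Tiny made-up table standing in for a real coverage file.
demo_table = io.StringIO(
    "position\tunique_top_cov\tunique_bot_cov\n"
    "1\t5\t7\n"
    "2\t0\t3\n"
)
print(read_total_coverage(demo_table))
```

For a real file, pass an open file handle, for example open("<filename>", newline=""), in place of the in-memory demo table.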
