- multi-anchor filtering with a multi-threaded implementation for vcf filtering
- each `anchor` can include several `sub-anchors`.
- in a single anchor, a variant counts as `pass` only if all sub-anchors are satisfied.
- Python 3
- vcf
  - decomposed & normalized by vt or bcftools
  - annotated by annovar (.vcf, not .txt)
git clone https://github.com/shanghungshih/vcf-filter.git
- `-v`, `--vcfs`: the vcf files, separated by ','
- `-a`, `--anchors`: the anchors file, which holds the information of the counters
- `-t`, `--thread`: pool size for multi-thread importing (default: 1)
- `--write2file`: to be updated
- without write2file
python3 vcf-filter.py -v sample1.hg19_multianno.vcf,sample2.hg19_multianno.vcf -a anchors/anchors-basic.json -t 2
- with write2file
python3 vcf-filter.py -w true -v sample1.hg19_multianno.vcf,sample2.hg19_multianno.vcf -a anchors/anchors-PG-853variant.json
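The `-v` value is a comma-separated list and `-t` sets a pool size, so the per-file work can be spread over a pool. The following is a minimal sketch of that pattern, not the tool's actual code: it assumes a standard-library thread pool, and `count_variants` and `run_pool` are hypothetical names introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_variants(vcf_path):
    """Hypothetical per-file worker: count the non-header records in one vcf."""
    with open(vcf_path) as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith("#")
        )


def run_pool(vcfs_arg, threads=1):
    """Split the comma-separated -v value and process each vcf in a pool."""
    paths = [p for p in vcfs_arg.split(",") if p]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in the same order as the input paths.
        totals = list(pool.map(count_variants, paths))
    return dict(zip(paths, totals))
```

With a single vcf the pool has only one task, so a larger pool size would bring no benefit in this sketch.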
- in anchors, define every `anchor name`, which will show in results.
- for each anchor, please define:
  - `key`: a key present in the INFO column of the annovar-annotated vcf (ex. Func.refGene=TP53;AF=0.001;); for variant comparison, use `variant` as the key name.
  - `type`: operator to perform the comparison (valid types: `==`, `>=`, `<=`, `>`, `<`, `in`, `not in`)
  - `value`: operand to compare with the vcf
- a variant is counted only if it passes all sub-anchors.
- for input files, only .vcf is accepted.
- configure the anchors.json before you run the program, and make sure the key of each sub-anchor appears in your vcf annotation.
- multi-threading is for multiple vcfs.
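The key/type/value rule and the all-sub-anchors counting can be sketched as below. This is an illustration under stated assumptions, not the tool's implementation: the layout of anchors.json is not shown here, so the list-of-dicts form for an anchor is assumed; `parse_info`, `passes`, and `count_anchor` are hypothetical helpers; a missing or non-numeric value is treated as a failure; and the special `variant` key is not covered.

```python
import operator

# The operators listed as valid types.
OPS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "in": lambda a, b: a in b,
    "not in": lambda a, b: a not in b,
}


def parse_info(info):
    """Turn 'Func.refGene=TP53;AF=0.001;' into a dict of strings."""
    fields = {}
    for item in info.strip(";").split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
    return fields


def passes(sub_anchor, fields):
    """Check one sub-anchor; a missing key counts as a failure (assumption)."""
    raw = fields.get(sub_anchor["key"])
    if raw is None:
        return False
    target = sub_anchor["value"]
    if isinstance(target, (int, float)):
        try:
            raw = float(raw)
        except ValueError:
            return False  # e.g. AF=. cannot be compared numerically
    return OPS[sub_anchor["type"]](raw, target)


def count_anchor(sub_anchors, info_columns):
    """Return the total and pass_anchors counters for one anchor."""
    counts = {"total": 0, "pass_anchors": 0}
    for info in info_columns:
        fields = parse_info(info)
        counts["total"] += 1
        # A variant is counted only when every sub-anchor is satisfied.
        if all(passes(sub, fields) for sub in sub_anchors):
            counts["pass_anchors"] += 1
    return counts


# Assumed anchor layout: one dict per sub-anchor.
anchor = [
    {"key": "Func.refGene", "type": "in", "value": ["exonic", "splicing"]},
    {"key": "AF", "type": "<", "value": 0.01},
]
infos = [
    "Func.refGene=exonic;AF=0.001;",
    "Func.refGene=intronic;AF=0.001;",
    "Func.refGene=exonic;AF=0.2;",
]
print(count_anchor(anchor, infos))  # {'total': 3, 'pass_anchors': 1}
```

Only the first record satisfies both sub-anchors, so it is the only one added to `pass_anchors`.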
- `total`: # of total variants
- `pass_anchors`: # of variants that pass all sub-anchors in an anchor
INFO [vcf] : ['sample1.hg19_multianno.vcf', 'sample2.hg19_multianno.vcf']
INFO [anchors file] : [anchors/anchors-basic.json]
INFO [anchors] : ['anchors-PASS', 'anchors-AF<0.01', 'anchors-AF<0.05', 'anchors-PASS&AF<0.01', 'anchors-PASS&AF<0.05']
INFO [write2file] : [False]
INFO [threads] : [1]
INFO [anchors-PASS]-[sample1.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [PASS] : 0
INFO
INFO [anchors-PASS]-[sample2.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [PASS] : 0
INFO
INFO [anchors-AF<0.01]-[sample1.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [AF<0.01] : 0
INFO
INFO [anchors-AF<0.01]-[sample2.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [AF<0.01] : 0
INFO
INFO [anchors-AF<0.05]-[sample1.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [AF<0.05] : 0
INFO
INFO [anchors-AF<0.05]-[sample2.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [AF<0.05] : 0
INFO
INFO [anchors-PASS&AF<0.01]-[sample1.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [PASS] : 0
INFO [AF<0.01] : 0
INFO
INFO [anchors-PASS&AF<0.01]-[sample2.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [PASS] : 0
INFO [AF<0.01] : 0
INFO
INFO [anchors-PASS&AF<0.05]-[sample1.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [PASS] : 0
INFO [AF<0.05] : 0
INFO
INFO [anchors-PASS&AF<0.05]-[sample2.hg19_multianno.vcf] (time used: 0.00 min)
INFO [total] : 20
INFO [pass_anchors] : 0
INFO [PASS] : 0
INFO [AF<0.05] : 0
INFO