Could anyone kindly explain the differences between the three bcftools +setGT commands below? As expected, the number of "Filled" alleles changes significantly between them, but we are not sure which one actually does what we are aiming for.
The intended QC step is to set to no-call any genotype called as heterozygous whose ref:alt read ratio (or alt:ref) is more skewed than 3:1, as such calls are more likely to be artefacts.
Where we're tripping up is whether we should be using & or && and | or || in the filter expression (although we think it should be & and |).
The input is a test VCF for a single chromosome with 100 samples and 98,000 variants; it is germline WES data.
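To make the threshold concrete: a 3:1 ratio corresponds to an alt-read fraction of 1/4 or 3/4, which is where the 0.25 and 0.75 cut-offs in the commands below come from. The sketch below is only an illustration of that arithmetic; the function names `alt_fraction` and `is_skewed` are made up for this example and are not part of bcftools. Note that, like the commands as written (`<=` and `>=`), it also flags a ratio of exactly 3:1, not only ratios more skewed than that.

```python
from fractions import Fraction

def alt_fraction(ref_reads, alt_reads):
    """Fraction of reads supporting ALT, or None when there is no coverage."""
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return Fraction(alt_reads, total)

def is_skewed(ref_reads, alt_reads, low=Fraction(1, 4), high=Fraction(3, 4)):
    """True when the alt fraction is at or beyond a 3:1 skew in either direction."""
    frac = alt_fraction(ref_reads, alt_reads)
    if frac is None:
        # No reads at all: nothing to judge, mirrors the (AD0 + AD1 > 0) guard.
        return False
    return frac <= low or frac >= high

# (ref, alt) read counts: balanced, exactly 3:1, exactly 1:3, 4:1, mild skew, no coverage
for ref, alt in [(10, 10), (15, 5), (5, 15), (16, 4), (14, 6), (0, 0)]:
    print(ref, alt, alt_fraction(ref, alt), is_skewed(ref, alt))
```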
# 1. using && and ||
bcftools +setGT tmp.100.qc2.vcf.gz \
-- -t q -n . \
-i 'GT="het" && (FMT/AD[*:0] + FMT/AD[*:1] > 0) && (
FMT/AD[*:1] / (FMT/AD[*:0] + FMT/AD[*:1]) <= 0.25 ||
FMT/AD[*:1] / (FMT/AD[*:0] + FMT/AD[*:1]) >= 0.75
)'
#Filled 686012 alleles
# 2. using & and ||
bcftools +setGT tmp.100.qc2.vcf.gz \
-- -t q -n . \
-i 'GT="het" & (FMT/AD[*:0] + FMT/AD[*:1] > 0) & (
FMT/AD[*:1] / (FMT/AD[*:0] + FMT/AD[*:1]) <= 0.25 ||
FMT/AD[*:1] / (FMT/AD[*:0] + FMT/AD[*:1]) >= 0.75
)'
#Filled 94496 alleles
# 3. using & and |
bcftools +setGT tmp.100.qc2.vcf.gz \
-- -t q -n . \
-i 'GT="het" & (FMT/AD[*:0] + FMT/AD[*:1] > 0) & (
FMT/AD[*:1] / (FMT/AD[*:0] + FMT/AD[*:1]) <= 0.25 |
FMT/AD[*:1] / (FMT/AD[*:0] + FMT/AD[*:1]) >= 0.75
)'
#Filled 4410 alleles
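One way to reason about the three variants is a toy model based on our reading of the bcftools EXPRESSIONS documentation, which says that & requires both conditions to hold within the same sample while && lets them be satisfied in different samples, and that || selects all samples at a site once any sample matches while | selects only the matching samples. The Python below is a simplified sketch of that reading, not bcftools code: `select_samples` and the example site are hypothetical, and the real evaluation may differ in details (missing values, multiallelic sites).

```python
def skewed(ref, alt):
    """Alt fraction at or beyond 3:1 in either direction, given some coverage."""
    total = ref + alt
    return total > 0 and (alt / total <= 0.25 or alt / total >= 0.75)

def select_samples(site, variant):
    """Per-sample booleans; True means the genotype would be set to missing.

    ASSUMPTION: models the documented per-sample (&, |) versus
    per-site (&&, ||) behaviour for the three commands above only.
    """
    het = [s["het"] for s in site]
    covered = [s["ref"] + s["alt"] > 0 for s in site]
    skew = [skewed(s["ref"], s["alt"]) for s in site]
    if variant == 3:
        # '&' and '|': every condition is checked within the same sample.
        return [h and c and k for h, c, k in zip(het, covered, skew)]
    if variant == 2:
        # '&' and '||': the skew test passes for the whole site if any sample
        # (including hom-ref ones) is skewed; het and coverage stay per-sample.
        site_has_skew = any(skew)
        return [h and c and site_has_skew for h, c in zip(het, covered)]
    if variant == 1:
        # '&&' and '||': each condition may be met by a different sample,
        # and all samples at a passing site are selected.
        site_passes = any(het) and any(covered) and any(skew)
        return [site_passes] * len(site)
    raise ValueError("variant must be 1, 2 or 3")

site = [
    {"het": True, "ref": 10, "alt": 10},   # balanced het
    {"het": True, "ref": 18, "alt": 2},    # skewed het
    {"het": False, "ref": 20, "alt": 0},   # hom-ref, alt fraction 0
]
for variant in (1, 2, 3):
    print(variant, select_samples(site, variant))
```

Under this model only variant 3 resets the skewed het alone; variant 2 also resets the balanced het because another sample at the site is skewed, and variant 1 resets every sample at the site. That ordering is at least consistent with the counts above (686012 > 94496 > 4410), though the model has not been checked against the actual output.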