14. Heterozygosity Calculation
George Pacheco edited this page Aug 4, 2021 · 1 revision
Based on Dataset I and using ANGSD--v0.931, we calculate the percentage of heterozygous genotypes of each sample.
zcat ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.mafs.gz | cut -f1,2 | tail -n +2 | awk '{print $1"\t"$2-1"\t"$2}' | bedtools merge -i - > ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.bed
awk '{print $1"\t"($2+1)"\t"$3}' ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.bed > ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.pos
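The two commands above move between coordinate systems: each 1-based position from the .mafs file becomes a 0-based, half-open BED interval, adjacent intervals are merged, and the merged BED is converted back to 1-based coordinates. A minimal sketch of that round trip on invented toy positions follows. The file names and values are hypothetical, and the merge step is an awk stand-in for `bedtools merge` that assumes sorted input, so it is an illustration rather than the command used in this pipeline.

```shell
# Toy input: chromosome and 1-based position (invented values).
printf 'chr1\t100\nchr1\t101\nchr1\t102\nchr1\t250\nchr2\t7\n' > toy.sites

# 1-based position -> 0-based half-open interval, then merge intervals that
# overlap or touch (awk stand-in for `bedtools merge`; input must be sorted).
awk 'BEGIN { OFS = "\t" }
     { s = $2 - 1; e = $2 }
     $1 == chr && s <= hi { if (e > hi) hi = e; next }
     chr != "" { print chr, lo, hi }
     { chr = $1; lo = s; hi = e }
     END { if (chr != "") print chr, lo, hi }' toy.sites > toy.bed

# BED starts are 0-based, so add 1 to return to 1-based inclusive coordinates.
awk 'BEGIN { OFS = "\t" } { print $1, $2 + 1, $3 }' toy.bed > toy.pos

cat toy.pos
```

On this toy input the three consecutive positions on chr1 collapse into a single line (start 100, end 102), while isolated positions keep identical start and end values.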
angsd sites index ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.pos
parallel --plus --dryrun angsd -i {} -anc ~/data/Pigeons/Reference/DanishTumbler_Dovetail_ReRun.fasta -ref ~/data/Pigeons/Reference/DanishTumbler_Dovetail_ReRun.fasta -sites ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.pos -rf ~/data/Pigeons/Reference/DanishTumbler_Dovetail_ReRun_ChrGreater1kb.id -GL 1 -doSaf 1 -fold 1 -remove_bads 1 -uniqueOnly 1 -baq 1 -C 50 -minMapQ 30 -minQ 20 -out ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/{/...} :::: ~/data/Pigeons/PBGP/PBGP--Analyses/Lists/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.list | xsbatch -R --max-array-jobs 60 -c 1 --time 10-00 --mem-per-cpu 6024 -J HetCalc --
parallel --plus "realSFS {} > ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/{/..}.het" ::: ~/data/Pigeons/PBGP/PBGP--Analyses/Miscellaneous/HeterozygosityCalc/*.saf.idx
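With `--plus`, GNU parallel's `{/..}` replacement string expands to the input's basename with two extensions removed, which is how each `.saf.idx` file gets a matching `.het` output name. The sketch below reproduces that naming with plain shell parameter expansion, as an approximation of the behaviour rather than a call to parallel itself. The input path is hypothetical.

```shell
# Hypothetical input path; only the ".saf.idx" suffix comes from the command above.
f="/some/dir/Sample_01.saf.idx"

base="${f##*/}"      # drop the directory part   -> Sample_01.saf.idx
base="${base%.*}"    # drop the last extension   -> Sample_01.saf
base="${base%.*}"    # drop one more extension   -> Sample_01

echo "${base}.het"
```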
grep -F '.' *.het | tr ":" " " | awk '{print $1"\t"$3/($2+$3)*100}' | gawk '{match($1,/(GBS|WGS|WGS\-GBS)/,lol);print $1"\t"$2"\t"lol[1]}' | sort -k 1,1gr | awk '{split($0,a,"_"); print $1"\t"a[1]"\t"$2"\t"$3}' > ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--Miscellaneous/HeterozygosityCalc/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.Heterozygosity.txt
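The percentage in the command above comes from the two values that the awk expression reads from each `.het` file: the first is treated as the expected number of homozygous sites and the second as the expected number of heterozygous sites, so heterozygosity is second / (first + second) × 100. In the original pipeline the file name is prepended by grep, which is why those values appear there as `$2` and `$3`. A minimal sketch with invented counts follows. The sample names and numbers are hypothetical and are not results from this dataset.

```shell
# Invented realSFS-style outputs: one line with two expected site counts each.
printf '9900.0 100.0\n' > SampleA_GBS.het
printf '4975.0 25.0\n'  > SampleB_WGS.het

# Heterozygosity (%) = heterozygous sites / all sites * 100, one row per sample.
for f in SampleA_GBS.het SampleB_WGS.het; do
  awk -v name="${f%.het}" \
      '{ printf "%s\t%.2f\n", name, $2 / ($1 + $2) * 100 }' "$f"
done > toy.Heterozygosity.txt

cat toy.Heterozygosity.txt
```

With these toy counts, SampleA_GBS comes out at 1.00 and SampleB_WGS at 0.50.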