Skip to content

Commit

Permalink
Deployed b792654 with MkDocs version: 1.5.3
Browse files Browse the repository at this point in the history
  • Loading branch information
Cloufield committed Oct 29, 2024
1 parent b119d09 commit 471e936
Show file tree
Hide file tree
Showing 11 changed files with 334 additions and 127 deletions.
20 changes: 10 additions & 10 deletions 03_Data_formats/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -2735,7 +2735,7 @@ <h3 id="ped-map">ped / map</h3>
<div class="admonition example">
<p class="admonition-title"><code>.ped</code></p>
<div class="highlight"><pre><span></span><code># check the first 16 rows and 16 columns of the ped file
cut -d &quot; &quot; -f 1-16 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.ped | head
cut -d &quot; &quot; -f 1-16 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.ped | head
0 HG00403 0 0 0 -9 G G T T A A G A C C
0 HG00404 0 0 0 -9 G G T T A A G A T C
0 HG00406 0 0 0 -9 G G T T A A G A T C
Expand Down Expand Up @@ -2764,7 +2764,7 @@ <h3 id="ped-map">ped / map</h3>
</ul>
<div class="admonition example">
<p class="admonition-title"><code>.map</code></p>
<div class="highlight"><pre><span></span><code>head 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.map
<div class="highlight"><pre><span></span><code>head 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.map
1 1:13273:G:C 0 13273
1 1:14599:T:A 0 14599
1 1:14604:A:G 0 14604
Expand All @@ -2781,15 +2781,15 @@ <h3 id="ped-map">ped / map</h3>
<h3 id="bed-fam-bim">bed / fam /bim</h3>
<p>bed/fam/bim formats are the binary implementation of ped/map formats.
bed/bim/fam files contain the same information as ped/map but are much smaller in size.</p>
<div class="highlight"><pre><span></span><code>-rw-r----- 1 yunye yunye 135M Dec 23 11:45 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.bed
-rw-r----- 1 yunye yunye 36M Dec 23 11:46 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.bim
-rw-r----- 1 yunye yunye 9.4K Dec 23 11:46 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.fam
-rw-r--r-- 1 yunye yunye 32M Dec 27 17:51 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.map
-rw-r--r-- 1 yunye yunye 2.2G Dec 27 17:51 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.ped
<div class="highlight"><pre><span></span><code>-rw-r----- 1 yunye yunye 135M Dec 23 11:45 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.bed
-rw-r----- 1 yunye yunye 36M Dec 23 11:46 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.bim
-rw-r----- 1 yunye yunye 9.4K Dec 23 11:46 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.fam
-rw-r--r-- 1 yunye yunye 32M Dec 27 17:51 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.map
-rw-r--r-- 1 yunye yunye 2.2G Dec 27 17:51 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.ped
</code></pre></div>
<div class="admonition example">
<p class="admonition-title"><code>.fam</code></p>
<div class="highlight"><pre><span></span><code>head 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.fam
<div class="highlight"><pre><span></span><code>head 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.fam
0 HG00403 0 0 0 -9
0 HG00404 0 0 0 -9
0 HG00406 0 0 0 -9
Expand All @@ -2804,7 +2804,7 @@ <h3 id="bed-fam-bim">bed / fam /bim</h3>
</div>
<div class="admonition example">
<p class="admonition-title"><code>.bim</code></p>
<div class="highlight"><pre><span></span><code>head 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.bim
<div class="highlight"><pre><span></span><code>head 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.bim
1 1:13273:G:C 0 13273 C G
1 1:14599:T:A 0 14599 A T
1 1:14604:A:G 0 14604 G A
Expand All @@ -2822,7 +2822,7 @@ <h3 id="bed-fam-bim">bed / fam /bim</h3>
<p>"Primary representation of genotype calls at biallelic variants
The first three bytes should be 0x6c, 0x1b, and 0x01 in that order.
The rest of the file is a sequence of V blocks of N/4 (rounded up) bytes each, where V is the number of variants and N is the number of samples. The first block corresponds to the first marker in the .bim file, etc."
<div class="highlight"><pre><span></span><code>hexdump -C 1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020.bed | head
<div class="highlight"><pre><span></span><code>hexdump -C 1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing.bed | head
00000000 6c 1b 01 ff ff bf bf ff ff ff ef fb ff ff ff fe |l...............|
00000010 ff ff ff ff fb ff bb ff ff fb af ff ff fe fb ff |................|
00000020 ff ff ff fe ff ff ff ff ff bf ff ff ef ff ff ef |................|
Expand Down
103 changes: 62 additions & 41 deletions 04_Data_QC/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -3017,47 +3017,68 @@ <h3 id="hardy-weinberg-equilibrium-exact-test">Hardy-Weinberg equilibrium exact
<div class="admonition example">
<p class="admonition-title">Calculate the Hardy-Weinberg equilibrium exact test statistics for a single SNP using Python</p>
<p>This code is converted from <a href="https://github.com/jeremymcrae/snphwe/blob/master/src/snp_hwe.cpp">here</a> (Jeremy McRae) to python. Orginal citation: Wigginton, JE, Cutler, DJ, and Abecasis, GR (2005) A Note on Exact Tests of Hardy-Weinberg Equilibrium. AJHG 76: 887-893
<div class="highlight"><pre><span></span><code>def snphwe(obs_hets, obs_hom1, obs_hom2):
obs_homr = min(obs_hom1, obs_hom2)
obs_homc = max(obs_hom1, obs_hom2)

rare = 2 * obs_homr + obs_hets
genotypes = obs_hets + obs_homc + obs_homr

probs = [0.0 for i in range(rare +1)]

mid = rare * (2 * genotypes - rare) // (2 * genotypes)
if mid % 2 != rare%2:
mid += 1

probs[mid] = 1.0
sum_p = 1 #probs[mid]

curr_homr = (rare - mid) // 2
curr_homc = genotypes - mid - curr_homr

for curr_hets in range(mid, 1, -2):
probs[curr_hets - 2] = probs[curr_hets] * curr_hets * (curr_hets - 1.0)/ (4.0 * (curr_homr + 1.0) * (curr_homc + 1.0))
sum_p+= probs[curr_hets - 2]
curr_homr += 1
curr_homc += 1

curr_homr = (rare - mid) // 2
curr_homc = genotypes - mid - curr_homr

for curr_hets in range(mid, rare-1, 2):
probs[curr_hets + 2] = probs[curr_hets] * 4.0 * curr_homr * curr_homc/ ((curr_hets + 2.0) * (curr_hets + 1.0))
sum_p += probs[curr_hets + 2]
curr_homr -= 1
curr_homc -= 1

target = probs[obs_hets]
p_hwe = 0.0
for p in probs:
if p &lt;= target :
p_hwe += p / sum_p

return min(p_hwe,1)
<div class="highlight"><pre><span></span><code>def snphwe(obs_hets: int, obs_hom1: int, obs_hom2: int) -&gt; float:
&quot;&quot;&quot;Calculate Hardy-Weinberg equilibrium exact test for a variant

Args:
obs_hets (int): observed heterozygous number
obs_hom1 (int): first observed homozygous number
obs_hom2 (int): second observed homozygous number

Returns:
float: Hardy-Weinberg equilibrium exact test p-value.

- P &gt; 0.05: No significant deviation from Hardy-Weinberg equilibrium (HWE).
- P ≤ 0.05: Significant deviation from HWE, which could be due to factors such as population stratification, inbreeding, genotyping errors, or natural selection.
&quot;&quot;&quot;
obs_minor_homs = min(obs_hom1, obs_hom2)
obs_mijor_homs = max(obs_hom1, obs_hom2)

rare = 2 * obs_minor_homs + obs_hets
genotypes = obs_hets + obs_mijor_homs + obs_minor_homs

probs = [0.0 for i in range(rare +1)]

mid = rare * (2 * genotypes - rare) // (2 * genotypes)
if mid % 2 != rare%2:
mid += 1

probs[mid] = 1.0
sum_p = 1 #probs[mid]

curr_homr = (rare - mid) // 2
curr_homc = genotypes - mid - curr_homr

for curr_hets in range(mid, 1, -2):
probs[curr_hets - 2] = probs[curr_hets] * curr_hets * (curr_hets - 1.0)/ (4.0 * (curr_homr + 1.0) * (curr_homc + 1.0))
sum_p+= probs[curr_hets - 2]
curr_homr += 1
curr_homc += 1

curr_homr = (rare - mid) // 2
curr_homc = genotypes - mid - curr_homr

for curr_hets in range(mid, rare-1, 2):
probs[curr_hets + 2] = probs[curr_hets] * 4.0 * curr_homr * curr_homc/ ((curr_hets + 2.0) * (curr_hets + 1.0))
sum_p += probs[curr_hets + 2]
curr_homr -= 1
curr_homc -= 1

target = probs[obs_hets]
p_hwe = 0.0
for p in probs:
if p &lt;= target :
p_hwe += p / sum_p

return min(p_hwe,1)

if __name__ == &#39;__main__&#39;:
# For an example, we had 502 samples.
# At position 1:14930:A:G, we count:
# + 407 heterozygous genotypes (signed 0|1 or 1|0).
# + 4 homozygous genotypes AA (signed 0|0)
# + 91 homozygous genotypes GG (signed 1|1)
print(snphwe(407,4,91))
</code></pre></div></p>
</div>
<div class="admonition example">
Expand Down
4 changes: 2 additions & 2 deletions 05_PCA/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -2508,8 +2508,8 @@ <h2 id="sample-codes">Sample codes</h2>
--bfile ${plinkFile} \
--threads ${threadnum} \
--read-freq ${outPrefix}.acount \
--score ${outPrefix}.eigenvec.allele 2 5 header-read no-mean-imputation variance-standardize \
--score-col-nums 6-15 \
--score ${outPrefix}.eigenvec.allele 2 6 header-read no-mean-imputation variance-standardize \
--score-col-nums 7-16 \
--out ${outPrefix}_projected
</code></pre></div>
</div>
Expand Down
2 changes: 1 addition & 1 deletion 07_Annotation/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -2335,7 +2335,7 @@ <h3 id="format-input-file">Format input file</h3>
<p>We will only use the first 100000 variants as an example.</p>
<div class="admonition example">
<p class="admonition-title">annovar_input</p>
<div class="highlight"><pre><span></span><code>awk<span class="w"> </span><span class="s1">&#39;NR&gt;1 &amp;&amp; NR&lt;100000 {print $1,$2,$2,$4,$5}&#39;</span><span class="w"> </span>../06_Association_tests/1kgeas.B1.glm.logistic.<span class="w"> </span>hybrid<span class="w"> </span>&gt;<span class="w"> </span>annovar_input.txt
<div class="highlight"><pre><span></span><code>awk<span class="w"> </span><span class="s1">&#39;NR&gt;1 &amp;&amp; NR&lt;100000 {print $1,$2,$2,$4,$5}&#39;</span><span class="w"> </span>../06_Association_tests/1kgeas.B1.glm.firth<span class="w"> </span>&gt;<span class="w"> </span>annovar_input.txt
</code></pre></div>
<div class="highlight"><pre><span></span><code>head annovar_input.txt
1 13273 13273 G C
Expand Down
2 changes: 1 addition & 1 deletion 12_fine_mapping/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -2367,7 +2367,7 @@ <h2 id="ld-matrix-calculation">LD Matrix Calculation</h2>
<p class="admonition-title">Example</p>
<p><div class="highlight"><pre><span></span><code>#!/bin/bash

plinkFile=&quot;../01_Dataset/1KG.EAS.auto.snp.norm.nodup.split.maf005.thinp020&quot;
plinkFile=&quot;../01_Dataset/1KG.EAS.auto.snp.norm.nodup.split.rare002.common015.missing&quot;

# LD r matrix
plink \
Expand Down
Loading

0 comments on commit 471e936

Please sign in to comment.