
5. Estimating functional enrichment using S-LDSC

Omer Weissbrod edited this page Jan 29, 2022 · 2 revisions

PolyFun is not directly suitable for estimating functional enrichment because it uses L2 regularization, which reduces estimation variance but introduces bias. However, the PolyFun code base includes a fully functional version of stratified LD-score regression (S-LDSC), upgraded to Python 3, that you can use to estimate functional enrichment. S-LDSC and PolyFun use the same input files. To estimate functional enrichment from the PolyFun code base, type:

python ldsc.py \
--h2 <sumstats_file> \
--ref-ld-chr <prefix to LD-score files> \
--w-ld-chr <prefix to SNP weights files> \
--out <output_file> \
--overlap-annot \
--not-M-5-50

The argument --not-M-5-50 tells S-LDSC to estimate functional enrichment of the heritability causally explained by all SNPs, rather than only by common SNPs (MAF>5%). If you omit this flag, you must instead provide minor allele frequency (MAF) files via the --frqfile-chr argument. Please see the S-LDSC wiki for details.
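If you script these runs, the choice between the two modes can be made explicit in a small helper. The following is only a sketch: the function name build_ldsc_command and the MAF prefix path/to/frq. are hypothetical placeholders, and the only flags used are the ones named on this page.

```python
import shlex


def build_ldsc_command(sumstats, ref_ld_prefix, weights_prefix, out_prefix,
                       frqfile_prefix=None):
    """Assemble an ldsc.py argument list (hypothetical helper, not part of PolyFun).

    With no MAF prefix, --not-M-5-50 is added so enrichment refers to all SNPs.
    With a MAF prefix, --frqfile-chr is passed instead of --not-M-5-50.
    """
    cmd = [
        "python", "ldsc.py",
        "--h2", sumstats,
        "--ref-ld-chr", ref_ld_prefix,
        "--w-ld-chr", weights_prefix,
        "--out", out_prefix,
        "--overlap-annot",
    ]
    if frqfile_prefix is None:
        cmd.append("--not-M-5-50")
    else:
        cmd += ["--frqfile-chr", frqfile_prefix]
    return cmd


# Enrichment over all SNPs (no MAF files needed).
all_snps_cmd = build_ldsc_command(
    "example_data/sumstats.parquet",
    "example_data/annotations.",
    "example_data/weights.",
    "temp/enrichment",
)

# Enrichment over common SNPs only; "path/to/frq." is a placeholder prefix.
common_cmd = build_ldsc_command(
    "example_data/sumstats.parquet",
    "example_data/annotations.",
    "example_data/weights.",
    "temp/enrichment_common",
    frqfile_prefix="path/to/frq.",
)

print(shlex.join(all_snps_cmd))
print(shlex.join(common_cmd))
```

To actually launch a run from the PolyFun directory, the returned list can be passed to subprocess.run(cmd, check=True).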

Here is an example you can try:

mkdir -p temp

python ldsc.py \
--h2 example_data/sumstats.parquet \
--ref-ld-chr example_data/annotations. \
--w-ld-chr example_data/weights. \
--out temp/enrichment \
--overlap-annot \
--not-M-5-50

To see the output, type cat temp/enrichment.results. It should look like this:

Category                         Prop._SNPs  Prop._h2     Prop._h2_std_error  Enrichment   Enrichment_std_error  Enrichment_p
Coding_UCSC_common_0             5.2671e-03  3.0045e-01   4.7188e-02          5.7043e+01   8.9590e+00            3.1728e-09
Coding_UCSC_lowfreq_0            1.1581e-02  -3.3554e-03  4.0509e-02          -2.8973e-01  3.4979e+00            7.0952e-01
Conserved_LindbladToh_common_0   8.8296e-03  4.3129e-01   4.7615e-02          4.8846e+01   5.3927e+00            3.0678e-16
Conserved_LindbladToh_lowfreq_0  2.0367e-02  5.5681e-02   5.4985e-02          2.7339e+00   2.6997e+00            5.2303e-01
Repressed_Hoffman_common_0       1.7496e-01  -9.9906e-03  3.2406e-02          -5.7102e-02  1.8522e-01            1.9314e-07
Repressed_Hoffman_lowfreq_0      2.7381e-01  -4.0496e-01  5.7435e-02          -1.4790e+00  2.0977e-01            6.3585e-23
base_0                           1.0000e+00  1.0000e+00   5.4703e-09          1.0000e+00   5.4703e-09            NA
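For downstream analysis it can be convenient to load this table programmatically. The sketch below uses only the Python standard library and assumes the results file is whitespace-delimited with the header shown above. The function name read_results and the 0.05 p-value cutoff are illustrative choices, not part of S-LDSC. It also checks that, in this example output, the Enrichment column matches Prop._h2 divided by Prop._SNPs.

```python
import io

# The example output shown above, copied verbatim.
RESULTS_TEXT = """\
Category                         Prop._SNPs  Prop._h2     Prop._h2_std_error  Enrichment   Enrichment_std_error  Enrichment_p
Coding_UCSC_common_0             5.2671e-03  3.0045e-01   4.7188e-02          5.7043e+01   8.9590e+00            3.1728e-09
Coding_UCSC_lowfreq_0            1.1581e-02  -3.3554e-03  4.0509e-02          -2.8973e-01  3.4979e+00            7.0952e-01
Conserved_LindbladToh_common_0   8.8296e-03  4.3129e-01   4.7615e-02          4.8846e+01   5.3927e+00            3.0678e-16
Conserved_LindbladToh_lowfreq_0  2.0367e-02  5.5681e-02   5.4985e-02          2.7339e+00   2.6997e+00            5.2303e-01
Repressed_Hoffman_common_0       1.7496e-01  -9.9906e-03  3.2406e-02          -5.7102e-02  1.8522e-01            1.9314e-07
Repressed_Hoffman_lowfreq_0      2.7381e-01  -4.0496e-01  5.7435e-02          -1.4790e+00  2.0977e-01            6.3585e-23
base_0                           1.0000e+00  1.0000e+00   5.4703e-09          1.0000e+00   5.4703e-09            NA
"""


def read_results(handle):
    """Parse a whitespace-delimited .results table into a list of dicts."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = {header[0]: fields[0]}
        for name, value in zip(header[1:], fields[1:]):
            # "NA" (e.g. the p-value of the base annotation) becomes None.
            row[name] = None if value == "NA" else float(value)
        rows.append(row)
    return rows


rows = read_results(io.StringIO(RESULTS_TEXT))

# Sanity check: Enrichment should equal Prop._h2 / Prop._SNPs up to rounding.
for row in rows:
    ratio = row["Prop._h2"] / row["Prop._SNPs"]
    print(f'{row["Category"]}: ratio={ratio:.4g}, reported={row["Enrichment"]:.4g}')

# Categories whose enrichment p-value is below an illustrative 0.05 cutoff.
P_CUTOFF = 0.05  # assumption for this sketch; choose your own threshold
significant = [
    row["Category"]
    for row in rows
    if row["Enrichment_p"] is not None and row["Enrichment_p"] < P_CUTOFF
]
print(significant)
```

To read the file produced by the example run instead of the embedded text, open it and pass the handle, e.g. with open("temp/enrichment.results") as f: rows = read_results(f).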