3. Real example for GSUB
Here we provide a concrete example of running GSUB. We use Alzheimer's disease (AD) GWAS summary statistics from Kunkle et al. 2019 and the Alzheimer's disease proxy GWAS (GWAX) in the UK Biobank as inputs. The main goal is to estimate genetic associations with the non-disease factor ($\text{F}_\text{non}$) underlying parental disease history.

Ensure you've installed R and downloaded the reference SNP file as described in "1. Preparation for GSUB".
Make a folder for the example data:
mkdir example
cd example
Get Kunkle et al. 2019 AD GWAS data:
wget ftp://ftp.biostat.wisc.edu/pub/lu_group/Projects/GSUB/ref/Kunkle_etal_Stage1_results.txt.gz
Get AD proxy GWAS (GWAX) data from the UK Biobank:
wget ftp://ftp.biostat.wisc.edu/pub/lu_group/Projects/GSUB/ref/ParentalAD.annotated.regenie.logistic.gz
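Before running GSUB, it can be worth confirming that both downloads completed. The following is a minimal sketch and not part of GSUB; the helper name `check_sumstats` is hypothetical. It uses `gzip -t` to test the archive and then prints the header fields, numbered, so you can see which columns each file provides.

```shell
# check_sumstats (hypothetical helper, not part of GSUB):
# verify that a gzipped summary-statistics file is intact, then list its header fields.
check_sumstats() {
  f=$1
  # -s is true only if the file exists and is non-empty
  if [ ! -s "$f" ]; then
    echo "missing or empty: $f" >&2
    return 1
  fi
  # gzip -t exits non-zero when the archive is truncated or corrupt
  if ! gzip -t "$f" 2>/dev/null; then
    echo "corrupt archive: $f" >&2
    return 1
  fi
  # Print one header field per line with its position (whitespace-delimited)
  gzip -dc "$f" | awk 'NR == 1 { for (i = 1; i <= NF; i++) print i, $i; exit }'
}

# Usage, from inside the example folder:
#   check_sumstats Kunkle_etal_Stage1_results.txt.gz
#   check_sumstats ParentalAD.annotated.regenie.logistic.gz
```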
Go back into the GSUB directory:
cd ..
Make a folder for the output:
mkdir output
Run GSUB:
NOTE: Ensure your sumstat_files are ordered correctly. In this example the order is AD first, then AD family history, because we are regressing AD out of AD family history.
Rscript GSUB.R \
--sumstat_files ./example/Kunkle_etal_Stage1_results.txt.gz,./example/ParentalAD.annotated.regenie.logistic.gz \
--output_path ./output \
--correction standard \
--N NA,NA \
--info.filter 0.9 \
--maf.filter 0.01 \
--sample.prev 0.5,0.5 \
--population.prev 0.05,0.05 \
--se.logit TRUE,TRUE \
--OLS FALSE,FALSE \
--linprob FALSE,FALSE \
--keep.indel FALSE
Your final output file will be written to ./output/gsub_analytical_all.txt.gz. You can decompress this file by running:
gzip -d ./output/gsub_analytical_all.txt.gz
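Note that `gzip -d` replaces the `.gz` file with the decompressed one. If you would rather keep the compressed output, one alternative is to stream it with `gzip -dc`, which writes to standard output and leaves the file on disk untouched. This is a minimal sketch; the function name `peek_gz` is hypothetical.

```shell
# peek_gz (hypothetical helper): print the first N lines (default 10)
# of a gzipped file without decompressing it on disk.
peek_gz() {
  gzip -dc "$1" | head -n "${2:-10}"
}

# Usage, once GSUB has finished and the output is still compressed:
#   peek_gz ./output/gsub_analytical_all.txt.gz 5
```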
The first lines of the results look like this:
head ./output/gsub_analytical_all.txt
SNP CHR BP MAF A1 A2 beta.1 se.1 beta.2 se.2 lambda11 lambda12 lambda22 gamma1 se.gamma1 z.gamma1 p.gamma1 gamma2 se.gamma2 z.gamma2 p.gamma2 n.gamma1 n.gamma2
rs2073813 1 753541 0.12326 G A 0.0273988230917198 0.0167269369043677 -0.00814744556061092 0.0121313760917577 0.272401142013132 0.247074016559996 0.0923095237748176 0.100582629313643 0.0614055314920866 1.63800600482718 0.101420441152563 -0.357479904909877 0.209771076607855 -1.70414296713626 0.0883543810972295 2259 143
rs3131969 1 754182 0.128231 A G -0.021114844271081 0.0158293416307772 0.00885892881217973 0.0120786084939729 0.272401142013132 0.247074016559996 0.0923095237748176 -0.0775137876259823 0.0581104084725682 -1.33390539945307 0.182234861794909 0.303441839089534 0.202605128584048 1.49770068117331 0.134211034799065 2259 143
rs3131968 1 754192 0.128231 A G -0.0210597196031535 0.0158293457450882 0.00876749885596263 0.012083623743108 0.272401142013132 0.247074016559996 0.0923095237748176 -0.0773114218520357 0.0581104235764369 -1.33042261773126 0.183379066581056 0.301909718946163 0.202639981332277 1.48988228759807 0.136255189577119 2259 143
rs3131967 1 754334 0.128231 T C -0.0244222415420914 0.0164465212864346 0.00892911112528705 0.0120798600659029 0.272401142013132 0.247074016559996 0.0923095237748176 -0.0896554300822796 0.0603761098976662 -1.48494876921087 0.13755739564583 0.336700234939449 0.207280687365238 1.62436857586336 0.10429716953725 2259 143
rs3115858 1 755890 0.127237 A T -0.0233749267615418 0.0157169064689343 0.00987063083974022 0.0120057795272785 0.272401142013132 0.247074016559996 0.0923095237748176 -0.0858106782842158 0.0576976526338374 -1.48724730326189 0.13694950679366 0.33660903573674 0.201256068348843 1.67254104931279 0.0944176803599004 2259 143
rs3131962 1 756604 0.130219 A G -0.0181931714710897 0.0149875404900784 0.0088865044922004 0.0119693456761731 0.272401142013132 0.247074016559996 0.0923095237748176 -0.0667881615203091 0.0550201088707472 -1.21388639337678 0.224791109487728 0.275032551133222 0.195581670782613 1.40622866157493 0.159656235843049 2259 143
rs6699990 1 756912 0.0238569 A G 0.0766242375784983 0.0651640729026957 0.0088865044922004 0.0119693456761731 0.272401142013132 0.247074016559996 0.0923095237748176 0.281291910203535 0.239220997463933 1.17586630431949 0.239648305686455 -0.656632329026881 0.65246566761244 -1.0063860240029 0.314229913967701 NA NA
rs3115853 1 757640 0.129225 G A -0.0238710104132891 0.016446563317669 0.00999843486818235 0.0119661269475782 0.272401142013132 0.247074016559996 0.0923095237748176 -0.0876318294294756 0.060376264196705 -1.45142848096683 0.146660583797552 0.342868012416911 0.206511631861101 1.66028426257134 0.0968572801449506 2259 143
rs4951929 1 757734 0.127237 C T -0.0205636021088313 0.0157171220859619 0.00912605042453765 0.0119808471082676 0.272401142013132 0.247074016559996 0.0923095237748176 -0.0754901464687692 0.0576984441761415 -1.30835670782237 0.190752349230356 0.300919157497124 0.20108459284654 1.49648042765153 0.134528504562711 2259 143
The output columns:
- beta.1, se.1: the effect size and standard error from the 1st input summary statistics.
- beta.2, se.2: the effect size and standard error from the 2nd input summary statistics.
- gamma1, se.gamma1, z.gamma1, p.gamma1: the summary statistics for $\text{F}_\text{AD}$. These are essentially the summary statistics of the 1st input.
- gamma2, se.gamma2, z.gamma2, p.gamma2: the summary statistics for $\text{F}_\text{non}$. This is what we're interested in.
- n.gamma1, n.gamma2: the effective sample sizes. They were computed according to GenomicSEM recommendations.
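As a rough sketch of how these columns might be used downstream, the helpers below work on the decompressed output file and locate columns by name from the header. Both function names are hypothetical and not part of GSUB. `select_by_p` keeps rows whose `p.gamma2` falls below a threshold; the default of 5e-8 is the conventional genome-wide significance level and is an assumption here, not a GSUB setting. `max_z_diff` is a consistency check based on the rows shown above, where `z.gamma2` matches `gamma2 / se.gamma2`.

```shell
# select_by_p (hypothetical helper): print the header plus rows whose
# p.gamma2 is below a threshold (default 5e-8, an assumed convention).
select_by_p() {
  awk -v thr="${2:-5e-8}" '
    NR == 1 {
      # Find the p.gamma2 column by name rather than by position
      for (i = 1; i <= NF; i++) if ($i == "p.gamma2") col = i
      if (!col) { print "p.gamma2 column not found" > "/dev/stderr"; exit 1 }
      print; next
    }
    # Skip NA rows; force numeric comparison with + 0
    $col != "NA" && ($col + 0) < (thr + 0)
  ' "$1"
}

# max_z_diff (hypothetical helper): largest |gamma2 / se.gamma2 - z.gamma2|
# over all rows with non-missing values; should be close to zero.
max_z_diff() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) idx[$i] = i; next }
    $(idx["gamma2"]) != "NA" && ($(idx["se.gamma2"]) + 0) != 0 {
      d = $(idx["gamma2"]) / $(idx["se.gamma2"]) - $(idx["z.gamma2"])
      if (d < 0) d = -d
      if (d > max) max = d
    }
    END { printf "%.3g\n", max + 0 }
  ' "$1"
}

# Usage:
#   select_by_p ./output/gsub_analytical_all.txt > ./output/fnon_hits.txt
#   max_z_diff ./output/gsub_analytical_all.txt
```

Looking columns up by name keeps the helpers working even if the column order of the output changes.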