Clumping and pairwise LD for specified SNP lists #24

explodecomputer · 2018-09-24T09:34:13Z

I was alerted to your package recently and it looks extremely valuable, congratulations!

I did have a couple of feature requests; apologies if these are already implemented, I didn't see them in the documentation.

1. Clumping - where SNPs are ordered based on their p-value in GWAS and are iteratively filtered by removing any SNPs in LD with the SNP with the lowest p-value
2. Creating an LD matrix for a list of SNPs (e.g. rather than a region)
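
To make the second request concrete, here is a minimal sketch of an LD matrix computed for an explicit SNP list rather than a region. It is not Tomahawk's API; the names `r_squared`, `ld_matrix`, and the toy `genotypes` dictionary are hypothetical. It also assumes unphased genotype dosages coded 0/1/2, so the value is the squared genotype correlation, which only approximates haplotype-based r².

```python
from typing import Dict, List, Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) * (a - mean_x) for a in x)
    var_y = sum((b - mean_y) * (b - mean_y) for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic SNP has no defined correlation.
        return float("nan")
    return (cov * cov) / (var_x * var_y)


def ld_matrix(
    genotypes: Dict[str, Sequence[float]], snp_ids: List[str]
) -> List[List[float]]:
    """Pairwise r^2 for exactly the SNPs in snp_ids, in that order."""
    return [
        [r_squared(genotypes[a], genotypes[b]) for b in snp_ids]
        for a in snp_ids
    ]


# Toy data (hypothetical): one dosage per individual, six individuals.
genotypes = {
    "rs1": [0, 1, 2, 1, 0, 2],
    "rs2": [0, 1, 2, 1, 0, 2],  # identical to rs1
    "rs3": [2, 1, 0, 1, 2, 0],  # perfectly anti-correlated with rs1
    "rs4": [0, 0, 1, 2, 1, 2],
}
matrix = ld_matrix(genotypes, ["rs1", "rs3", "rs4"])
```

The caller passes only the SNP identifiers of interest, so the cost grows with the square of the list length, not with the size of any surrounding region.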

mklarqvist · 2018-09-25T09:00:05Z

Thanks for these suggestions @explodecomputer. These features are not yet implemented in Tomahawk.

There is a big update coming to Tomahawk in the next few weeks and I will be sure to implement your suggestions.

explodecomputer · 2020-01-17T12:55:29Z

Hi @mklarqvist I just wanted to follow up on this. We have a service that performs LD calculations on the fly, currently using plink 1.9. This is the service: https://gwas-api.mrcieu.ac.uk/
The number of operations is typically pretty small. There are, say, 5000 SNPs that reach genome-wide significance, and we need to clump them, meaning:

1. Rank the p-values from lowest to highest
2. Any SNP that is in LD at some threshold and within a physical distance window with the top hit is removed
3. Return to (2) with the new remaining top hit
4. Once no more SNPs are left, return each independent top hit
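
The steps above can be sketched as follows. This is a minimal illustration, not plink's or Tomahawk's implementation: `Snp`, `clump`, and the `r2` callback are hypothetical names, the default threshold and window are illustrative values only, and the LD lookup is assumed to be supplied by whatever backend computes r² against the reference panel.

```python
from typing import Callable, List, NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int
    pval: float


def clump(
    snps: List[Snp],
    r2: Callable[[str, str], float],
    r2_threshold: float = 0.1,   # illustrative default, not from the thread
    window_bp: int = 250_000,    # illustrative default, not from the thread
) -> List[Snp]:
    """Greedy clumping: keep the top hit, drop its LD neighbours, repeat."""
    # Step 1: rank by p-value, lowest first.
    remaining = sorted(snps, key=lambda s: s.pval)
    index_snps: List[Snp] = []
    while remaining:
        top = remaining[0]
        index_snps.append(top)
        # Step 2: remove SNPs within the window that are in LD with the top hit.
        remaining = [
            s
            for s in remaining[1:]
            if not (
                s.chrom == top.chrom
                and abs(s.pos - top.pos) <= window_bp
                and r2(top.rsid, s.rsid) >= r2_threshold
            )
        ]
        # Step 3: loop again with the new top hit.
    # Step 4: nothing left, so return the independent top hits.
    return index_snps


# Toy LD lookup (hypothetical values); unlisted pairs are treated as r^2 = 0.
toy_r2 = {frozenset(("rs1", "rs2")): 0.9, frozenset(("rs1", "rs3")): 0.02}


def lookup(a: str, b: str) -> float:
    return toy_r2.get(frozenset((a, b)), 0.0)


snps = [
    Snp("rs1", "1", 1000, 1e-12),
    Snp("rs2", "1", 1500, 1e-9),
    Snp("rs3", "1", 2000, 1e-8),
    Snp("rs4", "2", 1000, 1e-10),
]
print([s.rsid for s in clump(snps, lookup)])  # prints ['rs1', 'rs4', 'rs3']
```

Because `r2` is only called for SNPs inside the window of the current top hit, the LD backend never needs to compute a full pairwise matrix for the input list.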

It's quite a simple algorithm, and plink 1.9 provides good performance on the LD reference panel that we're using, which is ~500 European individuals from the 1000 Genomes data, retaining only SNPs with MAF > 0.01.

Running clumping on, say, 2000 SNPs in this reference dataset in plink takes around 5 seconds.

The next thing that we want to do is increase the sample size of this reference dataset so that more precise estimates of LD can be obtained. Tomahawk looks like a potentially good choice, but I just wanted to get your advice on this before I explore further.

If clumping isn't implemented I'm happy to try implementing it in a fork and create a pull request.
Also - do you have plans to allow indels to be included?
Thanks!
