gwas-tools

This repository is a fork of hclimente/gwas-tools. For now it contains only the GWAS analysis part, in preparation for its adaptation to another project from Chloé-Agathe Azencott's team at the CBIO.

gwas-tools contains pipelines for common use-cases when dealing with GWAS datasets, from data preprocessing to biomarker discovery.

Installation

gwas-tools core

Clone the repository, and add the bin folder to your path:

git clone git@github.com:kumquatum/gwas-tools.git
export PATH=$PATH:$PWD/gwas-tools/bin
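
The export above appends the bin folder every time it is run, so sourcing it repeatedly from a shell profile makes PATH grow. A minimal sketch of a guard follows; the function name add_to_path is a hypothetical helper and is not provided by gwas-tools.

```shell
# Hypothetical helper: not part of gwas-tools.
# Appends a directory to PATH only if it is not already listed.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                         # already present, nothing to do
    *) PATH="$PATH:$1"; export PATH ;;   # append and export
  esac
}
```

It could then be called as add_to_path "$PWD/gwas-tools/bin" from the folder where the repository was cloned.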

Dependencies

To run the pipelines

Nextflow
Docker (optional)

Pipeline itself

Install all the tools listed below, or build your own Docker image from the provided Dockerfile (some tools are under copyright, which prevents us from distributing a prebuilt image):

# From within the gwas-tools folder
docker build -t <name_of_your_image> .

The Docker image can then be used in Nextflow by adding the parameter -with-docker <name_of_your_image>.

Tool	License
BEDOPS	GPLv2
HotNet2	Copyright
IMPUTE	Copyright
PLINK 1.90	GPLv3
VEGAS2v02	GPLv3
R::biglasso	GPLv3
R::bigmemory	LGPLv3
R::CASMAP	GPLv2
R::BioNet	GPLv2
R::dmGWASv3	GPLv2
R::igraph	GPLv3
R::LEANR	GPLv3
R::martini	MIT
R::ranger	GPLv3
R::SigModv2	?
R::SKAT	GPLv3
R::snpStats	GPLv3

Test files

A minimal, partial set of files is available in test/data to demonstrate the use of gwas-tools. For the SConES tool to work, the PPI file needs to be downloaded and prepared as in bin/templates/dbs/biogrid.sh.

Functions

Data preprocessing

SNP/association

With PLINK

vegas2.nf \
  --bfile test/data/example \
  --gencode 31 \
  --genome 37 \
  --buffer 50000 \
  --vegas_params '-top 10' \
  -with-docker <name_of_your_image>

With regenie

# Extract the phenotype from the .fam file before running regenie
format_conversion.nf \
  --file_to_convert test/data/example.fam \
  --conversion_type "fam2phenotype" \
  -with-docker <name_of_your_image>

vegas2_regenie.nf \
  --bfile test/data/example \
  --phenotype example.tsv \
  --regenie_params_s1 "\-\-cc12 \-\-exclude test/data/snplist_rm.txt" \
  --regenie_params_s2 "\-\-cc12 \-\-exclude test/data/snplist_rm.txt" \
  --gencode 31 --genome 37 \
  --buffer 50000 \
  --vegas_params '-top 10' \
  -with-docker <name_of_your_image>
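
For orientation, the fam2phenotype step used above can be approximated by hand. The sketch below is not the pipeline's implementation: it assumes a standard six-column PLINK .fam file (phenotype in the sixth column), writes a tab-separated table with a FID, IID, PHENO header, and maps the PLINK missing codes -9 and 0 to NA. The function name fam_to_phenotype and the column name PHENO are hypothetical.

```shell
# Hypothetical helper: not part of gwas-tools.
# Reads a six-column PLINK .fam file and prints a phenotype table
# (FID, IID, PHENO) on standard output, tab-separated.
fam_to_phenotype() {
  awk 'BEGIN { OFS = "\t"; print "FID", "IID", "PHENO" }
       {
         pheno = $6
         # PLINK marks a missing phenotype as -9 or 0; emit NA instead
         if (pheno == "-9" || pheno == "0") pheno = "NA"
         print $1, $2, pheno
       }' "$1"
}
```

A call such as fam_to_phenotype test/data/example.fam would print the table, which can be redirected to a file of your choice. Case/control values are left in the 1/2 coding, which matches the --cc12 option passed to regenie above.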

SNPs id to genes id mapping

Several reference systems exist for gene IDs: Ensembl, HGNC (also known as gene symbol), and Entrez. Depending on the interaction file you provide (a protein-protein interaction network or another kind), you may have to convert your IDs from one system to another. This command generates a table of equivalences, based on GENCODE and HGNC, from the SNPs in a bim file.

snp2gene.nf \
  --bim test/data/example.bim \
  --genome GRCh38 \
  -with-docker <name_of_your_image>

This table can then be used to convert IDs from one reference to another, depending on the one used by your interaction file. For example, to convert the VEGAS pipeline output to HGNC (conversion to Ensembl is also possible with vegas2ensembl):

format_conversion.nf \
  --file_to_convert scored_genes.vegas.txt \
  --conversion_type vegas2hgnc \
  --additional_file snp2hgnc.tsv

Note: if you provide a reference file other than the one produced by the snp2gene.nf pipeline, its header must contain the 3 columns named snp, ensembl_gene_id, and hgnc_gene_id.
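
The header requirement in the note above can be checked before launching the conversion. The following is a minimal sketch, assuming the reference file is tab-separated (as the .tsv extension suggests) and has its header on the first line; check_snp2gene_header is a hypothetical helper, not a gwas-tools command.

```shell
# Hypothetical helper: not part of gwas-tools.
# Succeeds (exit 0) if the first line of a tab-separated file contains
# the columns snp, ensembl_gene_id and hgnc_gene_id, in any order.
check_snp2gene_header() {
  header=$(head -n 1 "$1")
  for col in snp ensembl_gene_id hgnc_gene_id; do
    # Split the header into one column name per line and look for an exact match
    if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qx "$col"; then
      echo "missing column: $col" >&2
      return 1
    fi
  done
}
```

For instance, check_snp2gene_header snp2hgnc.tsv returns a non-zero status and names the first missing column if the file cannot be used as is.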

Network-guided GWAS

Multiple algorithms were adapted and benchmarked for the detection of SNPs associated with a phenotype. If you use any of the following algorithms, please cite the following article:

Climente-González H, Lonjou C, Lesueur F, GENESIS study group, Stoppa-Lyonnet D, et al. (2021) Boosting GWAS using biological networks: A study on susceptibility to familial breast cancer. PLOS Computational Biology 17(3): e1008819. https://doi.org/10.1371/journal.pcbi.1008819

Gene-based methods

dmGWAS:

dmgwas.nf \
  --vegas test/data/scored_genes.vegas.txt \
  --tab2 test/data/tab2 \
  -with-docker <name_of_your_image>

heinz:

heinz.nf \
  --vegas test/data/scored_genes.vegas.txt \
  --tab2 test/data/tab2 \
  --fdr 0.5 \
  -with-docker <name_of_your_image>

HotNet2:

hotnet2.nf \
  --scores test/data/scored_genes.vegas.txt \
  --tab2 test/data/tab2 \
  --hotnet2_path hotnet2 \
  --lfdr_cutoff 0.125 \
  -with-docker <name_of_your_image>

LEAN:

lean.nf \
  --vegas test/data/scored_genes.vegas.txt \
  --tab2 test/data/tab2 \
  -with-docker <name_of_your_image>

SigMod:

# With docker
sigmod.nf \
  --vegas test/data/scored_genes.vegas.txt \
  --tab2 test/data/tab2 \
  -with-docker <name_of_your_image>
# Without docker
sigmod.nf \
  --sigmod <path_to_your_SigMod_v2_folder> \
  --vegas test/data/scored_genes.vegas.txt \
  --tab2 test/data/tab2

SNP-based methods

SConES:

old_scones.nf \
  --bfile test/data/example \
  --network gi \
  --snp2gene test/data/snp2gene.tsv \
  --tab2 test/data/tab2 \
  -with-docker <name_of_your_image>

Troubleshooting

Common mistakes:

Using the wrong number of dashes for a parameter:
- a single dash (-) is for Nextflow options, such as -with-docker
- a double dash (--) is for pipeline parameters, such as --bfile
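
The dash rule above can be written as a small shell check, which may help when reviewing a long command line. This is only an illustration: classify_flag is a hypothetical helper and is not part of Nextflow or gwas-tools.

```shell
# Hypothetical helper: not part of gwas-tools or Nextflow.
# Prints which side a command-line argument is addressed to,
# based on its leading dashes.
classify_flag() {
  case "$1" in
    --*) echo "pipeline" ;;   # two dashes: pipeline parameter
    -*)  echo "nextflow" ;;   # one dash: Nextflow option
    *)   echo "value" ;;      # no dash: a value, not a flag
  esac
}
```

For example, classify_flag --bfile reports a pipeline parameter, while classify_flag -with-docker reports a Nextflow option.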
