Flanders : Finemapping coLocalization AND plEiotRopy Solver

Flanders is a modular pipeline and toolkit for scalable fine-mapping and colocalization of genetic association signals across large-scale datasets and multiple traits. Implemented with Nextflow and, for the most part, R, it separates computationally intensive fine-mapping from downstream colocalization to improve reusability and performance.

✅ Requirements

Before running the pipeline, ensure you have the following installed:

Nextflow (v24.04+)
For environment management, one of:
- Docker
- Singularity
- Conda

▶️ Running the pipeline

Example: Fine-Mapping + Colocalization

nextflow run Biostatistics-Unit-HT/Flanders -r 1.0 -profile [docker|singularity|conda] --summarystats_input /path/to/input_table.tsv --run_colocalization true --finemap_id my_finemap_run --coloc_id my_coloc_run -w ./work -resume

Example: Run Only Colocalization (with existing `.h5ad`)

nextflow run Biostatistics-Unit-HT/Flanders -r 1.0 -profile [docker|singularity|conda] --coloc_h5ad_input /path/to/finemapping_output.h5ad --run_colocalization true --coloc_id my_coloc_run -w ./work -resume

Quick run with example dataset

nextflow run Biostatistics-Unit-HT/Flanders -r 1.0 -profile test,singularity -w ./work

🧠 Pipeline overview

Flanders separates the fine-mapping and colocalization process into two distinct steps:

Step 1: Fine-mapping

Required Inputs

| Input | Description |
|---|---|
| GWAS summary statistics | `.tsv`/`.csv` (optionally gzipped) |
| LD reference panel | PLINK-format reference panel (`.bed/.bim/.fam`), preferably from the same sample population used in the GWAS |
| Metadata and GWAS-specific parameters table | `.tsv` file listing GWAS summary statistics paths and trait-specific parameters |

Steps

1. Munging of GWAS summary statistics

Format harmonization and imputation of missing information (e.g. missing allele frequencies are calculated from the LD reference panel)
Optional liftover to GRCh38
Optional restriction of the analysis to a listed subset of chromosomes
⚠️ In addition to autosomes, chromosomes X and Y are also accepted.
Alphabetical ordering of alleles, so that the effect allele is always the first one in alphabetical order (effect sizes and allele frequencies are flipped/inverted where needed)
Conversion of SNP IDs to the Flanders internal coding "chr"CHR:POS:EA:NEA, where EA is the first allele in alphabetical order
⚠️ This differs from common REF/ALT conventions and allows robust variant matching between multiple GWAS summary statistics and the LD reference panel.
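
The allele ordering and ID conversion described above can be illustrated with a minimal Python sketch. This is not the pipeline's own code (which is mostly R); the function name `harmonize_variant` and its argument names are hypothetical, and the chromosome is assumed to be passed without a `chr` prefix.

```python
def harmonize_variant(chrom, pos, effect_allele, other_allele, beta, eaf):
    """Return (snp_id, beta, eaf) with the alphabetically first allele as EA.

    Hypothetical helper for illustration only; names are not taken from Flanders.
    """
    ea, nea = effect_allele.upper(), other_allele.upper()
    if ea > nea:
        # The reported effect allele is alphabetically second: swap the alleles,
        # flip the sign of the effect size and invert the allele frequency.
        ea, nea = nea, ea
        beta = -beta
        eaf = 1.0 - eaf
    snp_id = f"chr{chrom}:{pos}:{ea}:{nea}"
    return snp_id, beta, eaf
```

For example, a variant reported with effect allele `T` and other allele `C` would be re-expressed relative to `C`, so the same variant gets the same ID in every dataset regardless of which allele each study reported as the effect allele.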

2. Identification of significantly associated genomic regions

Genomic regions containing significantly associated SNPs are identified with Locusbreaker, an algorithm developed in-house that delimits each association peak based on the distance between the end of one peak and the start of the next.

Locusbreaker first selects all SNPs below a given p-value threshold (suggested value 1x10^-6, set in the `p_thresh2` column of the Metadata and GWAS-specific parameters table) and identifies groups of SNPs that are positionally close to each other.
If two consecutive selected SNPs are closer to each other than a set distance threshold (suggested value 250 kb, set in the `hole` column of the same table), they are grouped into the same locus; if they are further apart, the gap between them marks the boundary between two peaks.
Loci with at least one significant SNP (suggested value 5x10^-8, set in the `p_thresh1` column of the same table) are retained, and their boundaries are enlarged by 100 kb to fully capture the shape of the association peak.
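
The three steps above can be sketched in Python as follows. This is a simplified reading of the description, not the actual Locusbreaker implementation: the function name `find_loci` and the `flank` argument are hypothetical, the input is assumed to cover a single chromosome, the 100 kb enlargement is assumed to apply to both sides of the locus, and a gap exactly equal to `hole` is assumed to keep two SNPs in the same locus.

```python
def find_loci(snps, p_thresh1=5e-8, p_thresh2=1e-6, hole=250_000, flank=100_000):
    """Group SNPs into loci. `snps` is an iterable of (position, p_value)
    tuples on ONE chromosome. Returns (start, end, sentinel_position) tuples."""
    # Step 1: keep SNPs below the lenient threshold, sorted by position.
    candidates = sorted((pos, p) for pos, p in snps if p < p_thresh2)

    # Step 2: a gap larger than `hole` closes the current peak and opens a new one.
    groups, current = [], []
    for pos, p in candidates:
        if current and pos - current[-1][0] > hole:
            groups.append(current)
            current = []
        current.append((pos, p))
    if current:
        groups.append(current)

    # Step 3: retain peaks whose strongest SNP passes the strict threshold,
    # then enlarge the boundaries by `flank` on each side (an assumption).
    loci = []
    for group in groups:
        sentinel_pos, sentinel_p = min(group, key=lambda s: s[1])
        if sentinel_p < p_thresh1:
            start = max(0, group[0][0] - flank)
            end = group[-1][0] + flank
            loci.append((start, end, sentinel_pos))
    return loci
```

Using two thresholds means that a peak is delimited by its suggestive shoulders (`p_thresh2`) but only reported if it contains at least one genome-wide significant SNP (`p_thresh1`).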

3. Fine-Mapping with SuSiE-RSS

For each genomic region, fine-mapping is performed using SuSiE-RSS, with LD calculated from the input PLINK files.
⚠️ Whenever possible, in-sample LD is strongly recommended (especially for molecular omics phenotypes, where the explained variance can be very large).
⚠️ Be aware that only SNPs in common between the GWAS summary statistics and the LD reference panel are taken into account for fine-mapping, while all other SNPs are discarded (loci for which no SNP overlap is found between the GWAS summary statistics and the LD reference panel are reported in NOT_FINEMAPPED_no_variants_from_locus_in_LD_ref.tsv).
⚠️ Be aware that loci fully or partially overlapping the HLA region (GRCh38: chr6:28,510,120-33,480,577) are excluded from fine-mapping. The HLA region is characterized by extremely high variant density, long-range linkage disequilibrium and complex haplotype patterns, which can bias statistical fine-mapping methods and reduce confidence in inferred causal variants.
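
As an illustration of the HLA exclusion rule, the following Python sketch flags loci that fully or partially overlap the region quoted above. The function name `overlaps_hla` is hypothetical, and treating both boundaries as inclusive is an assumption; only the GRCh38 coordinates come from this README.

```python
# GRCh38 coordinates of the HLA region as stated in this README.
HLA_CHROM, HLA_START, HLA_END = "6", 28_510_120, 33_480_577


def overlaps_hla(chrom, start, end):
    """Return True if the locus [start, end] overlaps the HLA region at all."""
    chrom = str(chrom)
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    # Two closed intervals overlap when each one starts before the other ends.
    return chrom == HLA_CHROM and start <= HLA_END and end >= HLA_START
```

A check of this kind could be used before submitting jobs to anticipate which loci will be skipped.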

4. Saving fine-mapping results to AnnData object

Log approximate Bayes factors (lABFs) and metadata for the 99% credible sets are stored in an AnnData object (.h5ad).
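
To make the relationship between lABFs and a 99% credible set concrete, here is a minimal Python sketch. It assumes a single causal variant per signal and a uniform prior across SNPs, so that posterior probabilities are proportional to exp(lABF). SuSiE computes its own credible sets, so this only illustrates the concept; the function name `credible_set` is hypothetical.

```python
import math


def credible_set(labf, coverage=0.99):
    """Build a credible set from lABFs of one signal.

    `labf` maps SNP id -> log approximate Bayes factor.
    Returns (list of SNP ids in the set, dict of posterior probabilities).
    """
    max_l = max(labf.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {snp: math.exp(value - max_l) for snp, value in labf.items()}
    total = sum(weights.values())
    posterior = {snp: w / total for snp, w in weights.items()}

    # Add SNPs in decreasing order of posterior until the coverage is reached.
    selected, cumulative = [], 0.0
    for snp, prob in sorted(posterior.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return selected, posterior
```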

Step 2: Colocalization analysis

Inputs

| Input | File description |
|---|---|
| Fine-mapping AnnData | An `.h5ad` file containing lABFs and metadata of credible sets (output from the fine-mapping step) |

Steps

1. Generation of colocalization guide table

Lists all pairs of credible sets that share at least one SNP (credible sets cannot colocalize without sharing at least one SNP).
⚠️ If no credible sets share at least one SNP, no colocalization is performed and an empty guide table is produced.
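
The pairing rule can be sketched in Python as below. This is an illustrative all-pairs comparison under the assumption that each credible set is available as a set of SNP IDs; the function name `build_guide_table` and the output layout are hypothetical and do not reflect the actual columns of `coloc_guide_table.csv`.

```python
from itertools import combinations


def build_guide_table(credible_sets):
    """List pairs of credible sets sharing at least one SNP.

    `credible_sets` maps a credible-set id -> set of SNP ids.
    Returns (id_a, id_b, number_of_shared_snps) tuples; empty if nothing overlaps.
    """
    pairs = []
    for id_a, id_b in combinations(sorted(credible_sets), 2):
        shared = credible_sets[id_a] & credible_sets[id_b]
        if shared:
            pairs.append((id_a, id_b, len(shared)))
    return pairs
```

Restricting the tests to overlapping pairs avoids running colocalization on combinations that could never support a shared causal variant.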

2. Colocalization with iCOLOC

Performs pairwise colocalization for each pair of credible sets listed in the guide table using iCOLOC, a framework that extends traditional Bayes factor-based colocalization analysis by imputing the lABFs of SNPs outside the credible sets to the minimum lABF value in the locus.

The iCOLOC approach:

Significantly reduces storage requirements, since only the exact lABF values of credible set SNPs are saved in the AnnData object
Improves colocalization accuracy compared to traditional coloc by reducing false positives caused by two causal SNPs being in strong LD
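
The imputation idea can be sketched in Python by combining two sets of lABFs with the standard coloc hypothesis formulas. This is an assumption-laden illustration, not the iCOLOC implementation: the function names are hypothetical, the per-trait floor values (`floor1`, `floor2`) stand in for the minimum lABF in the locus and are passed in by the caller, only the union of the two credible sets is considered, and the priors mirror the commonly used defaults of the R coloc package, which Flanders may not share.

```python
import math


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if a <= b:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def coloc_from_labf(labf1, labf2, floor1, floor2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Return posterior probabilities [PPH0, PPH1, PPH2, PPH3, PPH4].

    `labf1`/`labf2` map SNP id -> lABF for the credible-set SNPs of each trait.
    """
    snps = sorted(set(labf1) | set(labf2))
    # Impute SNPs missing from a credible set with that trait's floor value.
    l1 = [labf1.get(s, floor1) for s in snps]
    l2 = [labf2.get(s, floor2) for s in snps]
    ls1, ls2 = logsumexp(l1), logsumexp(l2)
    ls12 = logsumexp([a + b for a, b in zip(l1, l2)])
    log_h = [
        0.0,                                                        # H0: no signal
        math.log(p1) + ls1,                                         # H1: trait 1 only
        math.log(p2) + ls2,                                         # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiffexp(ls1 + ls2, ls12),  # H3: distinct SNPs
        math.log(p12) + ls12,                                       # H4: shared SNP
    ]
    norm = logsumexp(log_h)
    return [math.exp(h - norm) for h in log_h]
```

Because SNPs outside a credible set receive the lowest lABF in the locus, they contribute very little to the shared-variant term, which is the intuition behind the reduction in false positives described above.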

📁 Output

| Output Type | Description |
|---|---|
| `gwas_and_loci_tables/*_dataset_aligned.tsv.gz` | Harmonized (and optionally lifted) GWAS summary statistics |
| `gwas_and_loci_tables/*_loci.tsv` | Boundaries of identified association regions and GWAS summary statistics for the sentinel SNP |
| `finemapping_exceptions/` | Multiple tables reporting information about loci that were not fine-mapped with the standard procedure or at all |
| `finemapping/*_susie_finemap.rds` (optional) | Individual RDS files for each fine-mapped locus |
| `anndata/*.h5ad` | AnnData object with lABFs, CS metadata and SNP annotations resulting from fine-mapping |
| `coloc/coloc_guide_table.csv` | Colocalization analysis guide table, listing all colocalization tests performed |
| `coloc/*_colocalization.table.*.tsv` | Colocalization analysis results (all, filtered by PPH4 threshold and filtered by PPH3 threshold) |

👩‍🔬 Credits

Developed by the Biostatistics and Genome Analysis Units at Human Technopole.
