genomic-medicine-sweden/nallo: An analysis pipeline for long reads from both PacBio and Oxford Nanopore Technologies (ONT), written in Nextflow.

Introduction

genomic-medicine-sweden/nallo is a bioinformatics analysis pipeline for long-read data from both PacBio and (targeted) ONT sequencing, focused on rare disease. It is heavily influenced by best-practice pipelines such as nf-core/sarek, nf-core/raredisease, nf-core/nanoseq, PacBio Human WGS Workflow, epi2me-labs/wf-human-variation and brentp/rare-disease-wf.

Pipeline summary

QC

Read QC with FastQC, cramino, mosdepth and peddy

Alignment & assembly

Assemble genomes with hifiasm
Align reads and assemblies to reference with minimap2

Variant calling

Call SNVs and perform joint genotyping with DeepVariant and GLnexus
Call SVs with Severus or Sniffles
Call CNVs with HiFiCNV
Call tandem repeats with TRGT (HiFi only) or STRdust
Call paralogous genes with Paraphase

Phasing and methylation

Phase and haplotag reads with LongPhase, WhatsHap or HiPhase
Create methylation pileups with modkit

Annotation

Annotate SNVs and INDELs with databases of choice, e.g. gnomAD, ClinVar, CADD with echtvar and VEP
Annotate repeat expansions with stranger (TRGT only)
Annotate SVs with SVDB and VEP

Ranking

Rank SNVs, INDELs, SVs and CNVs with GENMOD

Filtering

Filter SNVs, INDELs, SVs and CNVs with filter_vep and bcftools

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

Prepare a samplesheet with input data:

samplesheet.csv

project,sample,file,family_id,paternal_id,maternal_id,sex,phenotype
NIST,HG002,/path/to/HG002.fastq.gz,FAM1,HG003,HG004,1,2
NIST,HG005,/path/to/HG005.bam,FAM1,HG003,HG004,2,1
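
Before launching a run, it can help to confirm that the samplesheet has the expected shape. The following is a minimal sketch, not the pipeline's own validation: `check_samplesheet` is a hypothetical helper, and the rule that no field may be blank is an assumption made for illustration. The column names are copied from the example above.

```python
import csv
import io

# Column names copied from the samplesheet.csv example above.
EXPECTED_COLUMNS = [
    "project", "sample", "file", "family_id",
    "paternal_id", "maternal_id", "sex", "phenotype",
]


def check_samplesheet(text):
    """Return a list of problems found in samplesheet text (empty if none).

    Hypothetical helper: it checks only the header, the number of fields
    per row, and that no field is blank (an assumed rule).
    """
    problems = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        problems.append(f"unexpected header: {header}")
        return problems
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(EXPECTED_COLUMNS):
            problems.append(
                f"line {line_no}: expected {len(EXPECTED_COLUMNS)} fields, got {len(row)}"
            )
        elif any(not field.strip() for field in row):
            problems.append(f"line {line_no}: empty field")
    return problems


EXAMPLE = """\
project,sample,file,family_id,paternal_id,maternal_id,sex,phenotype
NIST,HG002,/path/to/HG002.fastq.gz,FAM1,HG003,HG004,1,2
NIST,HG005,/path/to/HG005.bam,FAM1,HG003,HG004,2,1
"""

if __name__ == "__main__":
    print(check_samplesheet(EXAMPLE) or "samplesheet looks well-formed")
```

To check a file on disk, read it first, for example `check_samplesheet(open("samplesheet.csv").read())`. The pipeline performs its own input validation, so a check like this is only a quick local sanity test.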

Supply a reference genome with --fasta and choose a matching --preset for your data (revio, pacbio, ONT_R10). Now, you can run the pipeline using:

nextflow run genomic-medicine-sweden/nallo \
    -profile <docker/singularity/.../institute> \
    --input samplesheet.csv \
    --preset <revio/pacbio/ONT_R10> \
    --fasta <reference.fasta> \
    --outdir <OUTDIR>
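
When the pipeline is launched from a script rather than typed by hand, the same command can be assembled programmatically. This is a rough sketch under stated assumptions: `build_nallo_command` is a hypothetical helper, the example values (`docker`, `reference.fasta`, `results`) are placeholders, and the preset check uses only the three values listed above. It builds the argument list and does not run anything.

```python
import shlex

# Preset values listed in the usage text above.
PRESETS = {"revio", "pacbio", "ONT_R10"}


def build_nallo_command(profile, samplesheet, preset, fasta, outdir):
    """Assemble the argument list for the run command shown above.

    Hypothetical helper; it only builds the command and does not run it.
    """
    if preset not in PRESETS:
        raise ValueError(
            f"unknown preset {preset!r}; expected one of {sorted(PRESETS)}"
        )
    return [
        "nextflow", "run", "genomic-medicine-sweden/nallo",
        "-profile", profile,
        "--input", samplesheet,
        "--preset", preset,
        "--fasta", fasta,
        "--outdir", outdir,
    ]


# Placeholder values for illustration only.
cmd = build_nallo_command(
    "docker", "samplesheet.csv", "revio", "reference.fasta", "results"
)
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems with paths that contain spaces; the list could then be passed to `subprocess.run(cmd, check=True)` to start the run.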

For more details and further functionality, please refer to the documentation.

Credits

genomic-medicine-sweden/nallo was originally written by Felix Lenner.

We thank the following people for their extensive assistance in the development of this pipeline: Anders Jemt, Annick Renevey, Daniel Schmitz, Lucía Peña-Pérez, Peter Pruisscher & Ramprasad Neethiraj.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citations

If you use genomic-medicine-sweden/nallo for your analysis, please cite it using the following DOI: 10.5281/zenodo.13748210.

This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

An extensive list of references for the tools used by the pipeline can be found in the docs/CITATIONS.md file.
