CAMPneu

Comprehensive Analysis of Mycoplasma Pneumoniae

CAMPneu is a Nextflow bioinformatics pipeline that is reproducible, scalable, and suitable for a wide range of computing environments. While extensible, early drafts of CAMPneu target Illumina paired-end sequence data, with the following objectives:

Determining whether the specimen belongs to the M. pneumoniae species
Classifying the M. pneumoniae subtype (type 1 or type 2)
Identifying known SNPs conferring macrolide resistance that are present in the sample

System Requirements:

CAMPneu requires systems to have the following installed/available:

Conda
Singularity
Nextflow (to be used with the singularity profile)

CAMPneu is designed to work with both Conda and Singularity containers, offering flexibility and reproducibility across computational environments.

CONDA:

Conda excels at managing dependencies and creating isolated environments. Conda is also easy to use across different operating systems and is ideal for setting up reproducible environments on local machines.

Installation using Conda

conda install -n campneu -c bioconda -c conda-forge -c appliedbinf campneu 
conda activate campneu

Run command

CAMPneu.nf --input <fastq_reads_dir> --output <output_dir> -profile conda

Help message

CAMPneu.nf --help

SINGULARITY:

Singularity ensures consistency and portability across systems and is tailored for high-performance computing (HPC) environments, where it offers enhanced efficiency.

The Conda-installed version of CAMPneu can also be run with Singularity. Users who do not have access to Conda can clone the Git repository instead (Nextflow is required for this approach).

Git installation

git clone https://github.com/appliedbinf/CAMPneu.git

Run the program from the project repository:

nextflow run CAMPneu.nf --input <fastq_reads_dir> --output <output_dir> -profile singularity

Help message

nextflow run CAMPneu.nf --help

Script Input Requirements

Required arguments:

  --input     Path to the Paired Fastq Reads directory  
  --output    Directory where process outputs are saved

Optional arguments:

  --snpFile   Path to the custom SNP bed file
  --help      Print this message and exit
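
The layout expected for the --snpFile BED file is not spelled out here. As a rough illustration only, assuming a standard BED layout (tab-separated chromosome, 0-based start, exclusive end, optional name), a custom SNP file could be read in Python as sketched below. The function name, contig name, coordinates, and SNP labels are placeholders made up for this example, not real resistance positions and not part of CAMPneu.

```python
import io

def read_snp_bed(handle):
    """Parse BED lines into (chrom, 1-based position, name) tuples."""
    snps = []
    for line in handle:
        line = line.strip()
        # Skip blank lines and the comment/header lines BED allows.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else "."
        # BED intervals are 0-based and half-open; VCF positions are 1-based.
        for pos in range(start + 1, end + 1):
            snps.append((chrom, pos, name))
    return snps

# Placeholder content: contig, coordinates and labels are hypothetical.
example_bed = io.StringIO(
    "# chrom\tstart\tend\tname\n"
    "example_contig\t99\t100\tplaceholder_snp_A\n"
    "example_contig\t249\t250\tplaceholder_snp_B\n"
)
snp_sites = read_snp_bed(example_bed)
print(snp_sites)
```

Converting to 1-based positions up front makes the sites directly comparable with the POS column of a VCF file.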

Nextflow script step-by-step workflow:

Kraken2 Taxonomic Classification: Classifies input sequences based on a pre-built database.
Quality Control with Fastp: Profiles and filters reads to ensure high-quality data.
Coverage Assessment with Samtools: Calculates mean depth to evaluate sequencing coverage.
De Novo Assembly with Unicycler: Reconstructs microbial genomes without a reference.
ANI Calculation: Determines the best match by comparing the assembled genomes to reference genomes.
Alignment with Minimap2: Aligns reads to the best-matched reference genome.
Variant Calling with FreeBayes: Identifies SNPs and genetic variations against a type 1 reference.
Macrolide-Resistant SNP Identification: Detects SNPs associated with macrolide resistance.
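
The last two steps can be pictured as a filter over the VCF produced by variant calling. The Python sketch below is not the logic of 23SsnpAnalysis.py; it is a minimal illustration that assumes a plain-text VCF with a DP entry in the INFO column, and keeps only records at positions of interest whose call quality exceeds 100 and whose depth exceeds 10. The function name, contig name, positions, and records are all made up for the example.

```python
def filter_resistance_snps(vcf_lines, sites, min_qual=100.0, min_depth=10):
    """Return VCF records at known sites with QUAL and DP above the cut-offs.

    sites is a set of (chrom, 1-based position) pairs.
    """
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):  # meta-information and column-name lines
            continue
        chrom, pos, _id, ref, alt, qual, _filter, info = line.rstrip("\n").split("\t")[:8]
        if (chrom, int(pos)) not in sites:
            continue
        # INFO is a semicolon-separated list of KEY=VALUE pairs and bare flags.
        info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
        depth = int(info_map.get("DP", 0))
        if qual != "." and float(qual) > min_qual and depth > min_depth:
            hits.append({"chrom": chrom, "pos": int(pos), "ref": ref,
                         "alt": alt, "qual": float(qual), "depth": depth})
    return hits

# Hypothetical records: one good call, one low-quality call, one off-target call.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "example_contig\t100\t.\tA\tG\t850.5\t.\tDP=42;AF=1",
    "example_contig\t250\t.\tC\tG\t35.0\t.\tDP=6;AF=0.5",
    "example_contig\t400\t.\tT\tC\t900.0\t.\tDP=50;AF=1",
]
known_sites = {("example_contig", 100), ("example_contig", 250)}
resistance_hits = filter_resistance_snps(example_vcf, known_sites)
print(resistance_hits)
```

Only the record at position 100 survives: the call at 250 is at a known site but fails both cut-offs, and the call at 400 is high quality but not at a site of interest.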

Cut Off Thresholds

The pipeline sets specific thresholds for input paired reads/samples. Any reads or samples that do not meet these thresholds are marked as failed.

Kraken2 percentage of reads assigned to M. pneumoniae > 90%
Average Q score > 30
Coverage > 10x
ANI to reference > 95%
SNP call quality > 100; depth > 10
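
To make the pass/fail rule concrete, the sample-level cut-offs can be written as a small lookup table. This Python sketch is an illustration, not the pipeline's own code: the dictionary keys, function name, and sample values are hypothetical, and the pipeline applies its checks step by step rather than all at once as done here.

```python
# Cut-offs copied from the list above; the key names are hypothetical.
THRESHOLDS = {
    "kraken_pct_mpneumoniae": 90.0,  # % of reads assigned by Kraken2
    "mean_q_score": 30.0,            # average Q score after fastp
    "mean_coverage": 10.0,           # mean depth (x)
    "ani_to_reference": 95.0,        # % ANI to the best reference
}

def failed_checks(metrics):
    """Return the names of metrics that do not exceed their cut-off.

    A missing metric is treated as a failure.
    """
    return [name for name, cutoff in THRESHOLDS.items()
            if metrics.get(name, 0.0) <= cutoff]

# Made-up values for two samples.
good_sample = {"kraken_pct_mpneumoniae": 98.7, "mean_q_score": 35.2,
               "mean_coverage": 84.0, "ani_to_reference": 99.6}
shallow_sample = dict(good_sample, mean_coverage=7.5)

print(failed_checks(good_sample))
print(failed_checks(shallow_sample))
```

A sample passes when the returned list is empty. The comparisons are strict, so a value exactly at the cut-off counts as a failure, matching the ">" in the list.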

Required inputs:

Illumina paired-end sequences
23SsnpAnalysis.py: Python script for VCF manipulation and analysis

Outputs:

The script generates an output directory for each process, containing the files that process produces.

Process Outputs:

Kraken: Kraken2 reports and summaries for all the paired-end reads
fastp: fastp reports and quality-filtered paired-end reads
Coverage_check: samtools coverage report and coverage-filtered paired-end reads
assembly: assembled FASTA files for samples that passed QC, and empty FASTA files for failed samples
fastANI: fastANI report
bestReference: fastANI report with only the subtyped reference for the sample

Summary

Sample_reports: Reports for each sample summarizing QC and type information
Summary: Report for the entire run summarizing which samples passed or failed QC and which macrolide-resistance SNPs were identified
