
AggressiveHayBale/anno_bench


Prokaryotic annotation tools benchmarking

This repository contains the code used in the following work:

TODO  [Insert publication link or details here]

Short Description

This Nextflow workflow benchmarks the performance of four open-source prokaryotic genome annotation tools: Prokka, Bakta, EggNOG-mapper, and PGAP. We evaluated these tools on 180,504 diverse genomes to provide guidance on tool selection based on genome characteristics and the metrics investigated. For further details, please see the publication.

Installation

Dependencies

Tools & Databases

The tools and their respective databases are pulled automatically.

Disclaimer

As of July 8, 2025 (the date last checked), the corresponding version of the PGAP database (https://s3.amazonaws.com/pgap/) is no longer available for download.

Execution

To run the workflow, use the following command:

nextflow run AggressiveHayBale/anno_bench --csv sample_list.csv -profile local,docker

The sample_list.csv file should be formatted as follows:

Accession,Species,Path,Noise,Noise2,Noise3
GCA_000016605.1,Metallosphaera sedula,[path...]/GCA_000016605.1.fasta,0.005,0.01,0.02

Where:

  • Accession - NCBI GenBank Assembly Accession.

  • Species - NCBI-compliant taxonomy (genus or species name).

  • Path - Absolute path to a FASTA genome file.

  • Noise - Percentage of introduced frameshifts. The example above supplies three noise levels, in the columns Noise, Noise2, and Noise3.
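Because a malformed sample sheet is easy to overlook before a long run, the column rules above can be checked up front. The following is a minimal sketch and not part of the workflow itself: the function name `validate_sample_sheet`, the rule that noise values must be non-negative numbers, and the example path `/data/genomes/...` are assumptions made for illustration.

```python
import csv
import io
import os

# Column order taken from the example header in this README.
EXPECTED_HEADER = ["Accession", "Species", "Path", "Noise", "Noise2", "Noise3"]
NOISE_COLUMNS = ("Noise", "Noise2", "Noise3")


def validate_sample_sheet(handle, check_files=False):
    """Return a list of human-readable problems found in a sample sheet.

    `handle` is any open text file or file-like object. With
    `check_files=True` the FASTA paths are also checked on disk.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_HEADER:
        return [f"unexpected header: {reader.fieldnames}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):
        # DictReader stores surplus fields under the key None and
        # fills missing fields with the value None.
        if None in row or any(value is None for value in row.values()):
            problems.append(f"line {line_no}: wrong number of columns")
            continue
        for column in ("Accession", "Species"):
            if not row[column].strip():
                problems.append(f"line {line_no}: {column} is empty")
        if not os.path.isabs(row["Path"]):
            problems.append(f"line {line_no}: Path is not absolute")
        elif check_files and not os.path.isfile(row["Path"]):
            problems.append(f"line {line_no}: Path does not exist")
        for column in NOISE_COLUMNS:
            try:
                value = float(row[column])
            except ValueError:
                problems.append(f"line {line_no}: {column} is not a number")
                continue
            # Assumption: a negative frameshift rate is never meaningful.
            if value < 0:
                problems.append(f"line {line_no}: {column} is negative")
    return problems


# Hypothetical one-row sheet; the path is a placeholder, not a real location.
example = io.StringIO(
    "Accession,Species,Path,Noise,Noise2,Noise3\n"
    "GCA_000016605.1,Metallosphaera sedula,"
    "/data/genomes/GCA_000016605.1.fasta,0.005,0.01,0.02\n"
)
print(validate_sample_sheet(example))
```

To check a real file, open it with `open("sample_list.csv", newline="")` and pass the handle to the function, optionally with `check_files=True` so that missing FASTA files are reported as well.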

Results

The pipeline output will be stored in the "results" folder.
