Workflow to reconstruct multiple metabolic graphs directly from FASTA sequence files.
This workflow is licensed under the GNU GPL-3.0-or-later; see the LICENSE file for details.
These tools are needed:
- Prokka
- EggNOG-mapper
- Bakta
- Pathway Tools (which needs Blast)
And some Python packages:
To run annotation-based reconstruction, you need to install Pathway Tools, which is available from the Pathway Tools website.
You should also install the MetaCyc reference file, metacyc_XX.X.padmet (where XX.X is the MetaCyc version number), and then update the config.txt file of each study. To generate this file, first download the MetaCyc flat files in DAT format from the https://biocyc.org/download.shtml webpage. Next, put all the downloaded DAT files in a single directory (named FLAT_DIR here). Finally, run this command:
padmet pgdb_to_padmet --pgdb=FLAT_DIR --output=metacyc_XX.X.padmet --version=XX.X --db=Metacyc -v
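The same command can be assembled from Python, which makes it easy to check that FLAT_DIR actually contains DAT files before launching the conversion. This is only a minimal sketch: the helper name `build_padmet_command` is hypothetical, and the flags are copied verbatim from the command above.

```python
from pathlib import Path


def build_padmet_command(flat_dir, version):
    """Return the argument list for the pgdb_to_padmet call shown above.

    Hypothetical helper: it only mirrors the documented command line.
    """
    flat_dir = Path(flat_dir)
    # The MetaCyc flat files are expected in DAT format, so refuse a
    # directory that holds no .dat file at all.
    has_dat = any(p.suffix.lower() == ".dat" for p in flat_dir.iterdir())
    if not has_dat:
        raise FileNotFoundError(f"no .dat files found in {flat_dir}")
    return [
        "padmet",
        "pgdb_to_padmet",
        f"--pgdb={flat_dir}",
        f"--output=metacyc_{version}.padmet",
        f"--version={version}",
        "--db=Metacyc",
        "-v",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, with `version` set to the MetaCyc version that was downloaded.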
If you have installed all the dependencies, you can just install MeReco with:
pip install mereco
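Before a first run, it can help to confirm that the external tools listed above are reachable on the PATH. The sketch below is one possible check, not part of MeReco itself. The executable names (`prokka`, `emapper.py`, `bakta`, `pathway-tools`, `blastp`) are assumptions about a typical installation and may differ on your system.

```python
import shutil

# Assumed executable names for each required tool; adjust them to match
# the local installation.
EXECUTABLES = {
    "Prokka": "prokka",
    "EggNOG-mapper": "emapper.py",
    "Bakta": "bakta",
    "Pathway Tools": "pathway-tools",
    "Blast": "blastp",
}


def missing_tools(executables=EXECUTABLES, which=shutil.which):
    """Return the sorted names of tools whose executable is not on the PATH."""
    return sorted(
        name for name, exe in executables.items() if which(exe) is None
    )


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools were found.")
```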
metabolic_reconstruction.py [-h] -i INPUT -o OUTPUT --tax TAXFILE --padmet_ref PATH_TO_PADMET_REF --ptsc PTSC --ptsi PTSI [--annot ANNOT] [--egg_path EGG_PATH] [--bak_path BAK_PATH]
[-c CPUS] [-k TO_KEEP] [-q]
The -k flag can be used to keep some of the intermediary files produced by Prokka and Bakta (listed below). To keep specific files, give their extensions separated by commas (","), following the structure below:
- Prokka: .ecn,.err,.ffn,.fixed*,.fsa,.gff,.log,.sqn,.tbl,.val,.faa
- Bakta: .embl,.faa,.ffn,.fna,.gff3,.hypotheticals.faa,.hypotheticals.tsv,.json,.log,.png,.svg,.tsv
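To avoid typos in the value given to -k, the comma-separated string can be built from a list and checked against the extensions above. This is a minimal sketch: the helper `keep_argument` and the `"prokka"` / `"bakta"` keys are hypothetical names used only for this example, and whether MeReco itself validates the extensions is not stated here.

```python
# Extension lists copied from the Prokka and Bakta lines above.
PROKKA_EXTS = [
    ".ecn", ".err", ".ffn", ".fixed*", ".fsa", ".gff",
    ".log", ".sqn", ".tbl", ".val", ".faa",
]
BAKTA_EXTS = [
    ".embl", ".faa", ".ffn", ".fna", ".gff3", ".hypotheticals.faa",
    ".hypotheticals.tsv", ".json", ".log", ".png", ".svg", ".tsv",
]


def keep_argument(extensions, annot_tool):
    """Join extensions with commas after checking them for one tool.

    annot_tool is "prokka" or "bakta" (labels local to this sketch).
    """
    allowed = {"prokka": PROKKA_EXTS, "bakta": BAKTA_EXTS}[annot_tool]
    unknown = [ext for ext in extensions if ext not in allowed]
    if unknown:
        raise ValueError(
            f"not listed for {annot_tool}: {', '.join(unknown)}"
        )
    return ",".join(extensions)
```

For example, `keep_argument([".gff", ".faa"], "prokka")` gives the string `.gff,.faa`, which can then be passed as the TO_KEEP value of -k.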