
VirDetect-AI is a deep learning model developed for identifying partial virus protein sequences in metagenomic data.


alyzart22/VirDetect-AI


This repository contains a deep learning model for identifying partial virus protein sequences in metagenomic data. It provides the data and the environment needed to run the query application.

This work was published in Briefings in Bioinformatics (2025): VirDetect-AI: a residual and convolutional neural network–based metagenomic tool for eukaryotic viral protein identification. Zárate A, Díaz-González L, Taboada B. https://doi.org/10.1093/bib/bbaf001

Download extra supplementary data for VirDetect-AI: https://zenodo.org/doi/10.5281/zenodo.13328820


Methodology of VirDetect-AI

API consult VirDetect-AI

There are two options to test the VirDetect-AI tool: through a Google Colab notebook, or locally by installing a predefined environment.

Option 1 - Execute Notebook

1.- Download the notebook Notebook_api_VirDetect-AI.ipynb located in the Notebook_VirDetect-AI folder in this repository.

2.- Run the notebook Notebook_api_VirDetect-AI.ipynb on Google Colab (GPU) or Jupyter. Only FASTA input is accepted. The output is generated and saved in the outputs folder, which is a temporary folder in Google Drive [content], so remember to download your results.

Option 2 - Install API consult

  1. Clone the repository to your local machine (or download the whole repository manually)
    git clone https://github.com/alyzart22/VirDetect-AI.git
    

If you have an NVIDIA GTX or RTX GPU (the NVIDIA drivers should be up to date)

  1. Create the environment

    conda env create --file ./API_VirDetect-AI/enviroments/virdetect-ai_gpu.yml 

    Activate your environment

    conda activate virdetect-ai_gpu 

    Run this line in the console

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/ 

    Run this line to check that the GPU is working

    python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

    Example of the expected output: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
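
If TensorFlow reports an empty list, one thing worth checking is whether the export line above took effect in the current shell. The following is a minimal sketch, not part of VirDetect-AI: the helper name `conda_lib_on_path` is hypothetical, and it only inspects the two environment variables used in that export line.

```python
import os
import posixpath


def conda_lib_on_path(environ=None):
    """Return True if $CONDA_PREFIX/lib is listed in LD_LIBRARY_PATH.

    Hypothetical helper: it mirrors the export line from the install
    steps and does not verify that the CUDA libraries are present.
    """
    env = os.environ if environ is None else environ
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        return False  # no conda environment is active
    wanted = posixpath.normpath(posixpath.join(prefix, "lib"))
    entries = env.get("LD_LIBRARY_PATH", "").split(":")
    # normpath drops the trailing slash, so ".../lib/" and ".../lib" match
    return any(e and posixpath.normpath(e) == wanted for e in entries)
```

Calling `conda_lib_on_path()` with no arguments inspects the current process environment; a `False` result suggests re-running the export line after activating the environment.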

If you don't have a GPU

  1. Create the environment
    conda env create --file ./VirDetect-AI/enviroments/virdetect-ai_cpu.yml 
    Activate your environment
    conda activate virdetect-ai_cpu 

Download the VirDetect-AI model

  1. Download the VirDetect-AI model.h5 from the following link and place it inside the /API_VirDetect-AI/ folder. Link to download model.h5
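
A browser download can occasionally be truncated or saved as an HTML page instead of the model file. The sketch below is an optional sanity check and not part of VirDetect-AI: it tests only for the 8-byte HDF5 signature at the start of the file (assuming the file has no user block), and the function names are hypothetical.

```python
from pathlib import Path

# First 8 bytes of an HDF5 file whose superblock starts at offset 0.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(first_bytes):
    """Return True if the given bytes start with the HDF5 signature."""
    return first_bytes[:8] == HDF5_SIGNATURE


def looks_like_hdf5(path):
    """Return True if `path` is an existing file that starts like HDF5."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return has_hdf5_signature(fh.read(8))
```

For example, `looks_like_hdf5("./API_VirDetect-AI/model.h5")` can be called once the file is in place; the file name `model.h5` is assumed here from the step above.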

Execute API consult VirDetect-AI

  1. In this section you can try the tool with your own metagenomic data. In the command below, replace the hepadna.fasta file with your own FASTA file. The command accepts 3 arguments:
  • hepadna.fasta – the query containing the amino acid sequences.
  • 40 – the kmer_stride (recommended range: 20–60).
  • 0 – the execution mode:
    • Mode 0 (default): Allows input sequences ≥ 300 amino acids.
    • Mode 1: Allows input sequences > 255 amino acids.
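
Before running the tool on a large FASTA file, it can be useful to see which sequences meet the length requirement of the chosen mode. The following is a minimal sketch using only the thresholds listed above; it is not the tool's own parser, the function names are hypothetical, and how VirDetect-AI itself handles shorter sequences is not described here.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def passes_length_filter(sequence, mode=0):
    """Mode 0: length >= 300 amino acids; mode 1: length > 255."""
    if mode == 0:
        return len(sequence) >= 300
    if mode == 1:
        return len(sequence) > 255
    raise ValueError("mode must be 0 or 1")


def split_by_length(lines, mode=0):
    """Return (kept_headers, skipped_headers) for the given mode."""
    kept, skipped = [], []
    for header, seq in read_fasta(lines):
        (kept if passes_length_filter(seq, mode) else skipped).append(header)
    return kept, skipped
```

With a file on disk, `split_by_length(open("hepadna.fasta"), mode=0)` returns the headers that satisfy the mode 0 threshold and those that do not.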

Remember to run this command while you are inside the /VirDetect-AI/API_VirDetect-AI/ directory.

python ./api_virdetect-ai.py ./hepadna.fasta 40 0
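
When the tool is called from another script, the working-directory requirement can be handled with the `cwd` argument of `subprocess.run`. This is a sketch under the assumption that the script is invoked exactly as in the command above; `build_command` and `run_virdetect` are hypothetical names, not functions provided by the repository.

```python
import subprocess
import sys
from pathlib import Path


def build_command(fasta, kmer_stride=40, mode=0):
    """Build the argument list: script, FASTA file, kmer_stride, mode."""
    if mode not in (0, 1):
        raise ValueError("mode must be 0 or 1")
    return [
        sys.executable,            # the Python of the active environment
        "./api_virdetect-ai.py",
        str(fasta),
        str(kmer_stride),
        str(mode),
    ]


def run_virdetect(api_dir, fasta, kmer_stride=40, mode=0):
    """Run the command with `api_dir` as the working directory.

    `fasta` is resolved relative to `api_dir`, because the child
    process starts there. Raises CalledProcessError on failure.
    """
    return subprocess.run(
        build_command(fasta, kmer_stride, mode),
        cwd=Path(api_dir),
        check=True,
    )
```

For example, `run_virdetect("./VirDetect-AI/API_VirDetect-AI", "./hepadna.fasta", 40, 0)` is equivalent to changing into that directory and running the command shown above.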

Output Api consult VirDetect-AI

  1. The output consists of 6 pie charts and 3 CSV files: a report with the predictions by k-mer, the predictions by sequence, and the unknown sequences.
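
For a quick look at the CSV reports without opening them one by one, the files can be summarized with the standard `csv` module. The sketch below makes two assumptions that are not confirmed by this README: that the reports are written to a single folder passed in by the caller, and that the first row of each file is a header. The function name is hypothetical.

```python
import csv
from pathlib import Path


def summarize_csv_reports(output_dir):
    """Map each CSV file name to (first row, number of remaining rows)."""
    summary = {}
    for path in sorted(Path(output_dir).glob("*.csv")):
        with path.open(newline="") as fh:
            rows = list(csv.reader(fh))
        header = rows[0] if rows else []  # assumed to be a header row
        summary[path.name] = (header, max(len(rows) - 1, 0))
    return summary
```

Printing the result of `summarize_csv_reports("outputs")` shows the column names and row counts of each report, which is enough to tell the three files apart.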

Reference and citation

If you use VirDetect-AI, please cite this paper: Alida Zárate, Lorena Díaz-González, Blanca Taboada, VirDetect-AI: a residual and convolutional neural network–based metagenomic tool for eukaryotic viral protein identification, Briefings in Bioinformatics, Volume 26, Issue 1, January 2025, bbaf001, https://doi.org/10.1093/bib/bbaf001

Contact

Ali Zárate - [email protected]

Project Link: https://github.com/alyzart22/VirDetect-AI

Authors


Alida Zárate | Blanca Taboada | Lorena Díaz

Funding

This research was partially supported by grant PAPIIT-DGAPA-IN230523, awarded to Blanca Taboada.
