This script has been altered from the original code by Mathieu Seppey available at https://gitlab.com/ezlab/busco/-/tree/master.
Its purpose is to create figures displaying the BUSCO lineage used in processing genomic, transcriptomic, or proteomic files.
Moreover, this module enables the generation of figures in 'jpeg', 'png', or 'tiff' formats.
Lastly, various pieces of code originally spread across different files have been consolidated into one file for simplicity and convenience.
usage: Busco_Plot.v1.0.0.py -wd PATH [-l {Eukaryotic Lineage,Metazoan Lineage}] [-f {jpeg,png,tiff}] [-rt RUN_TYPE] [--no_r] [-q] [-h]
####################################################################################################
ARAMAYO_LAB
BUSCO_Plot
Original Code Link: https://gitlab.com/ezlab/busco/
Original Code Version: 4.0.0
Licensed under the MIT license.
This program was modified from code initially generated by Evgeny Zdobnov ([email protected]),
and as such it inherits its original MIT License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See LICENSE.md file for details.
You should have received a copy of the MIT License along with this program. If not,
see: https://opensource.org/license/MIT
Author current release: Rodolfo Aramayo
WORK_EMAIL: [email protected]
PERSONAL_EMAIL: [email protected]
Author original release: Mathieu Seppey
MODULE_NAME: Busco_Plot.v1.0.0.py
MODULE_VERSION: 1.0.0
MODULE_SYNOPSIS: This module produces a graphic summary for BUSCO runs based on short summary files
This tool reads the short summary files produced by BUSCO (https://busco.ezlab.org/ and https://gitlab.com/ezlab/busco/-/tree/master),
then writes and runs an R script that uses ggplot2 (2.2.0+) (https://ggplot2.tidyverse.org/) to draw the figure in one of three formats:
'jpeg' (default), 'png', or 'tiff'.
This tool assumes your system is able to run R. It also uses the following R libraries:
dplyr (https://cran.r-project.org/web/packages/dplyr/readme/README.html)
tidyr (https://tidyr.tidyverse.org/)
forcats (https://forcats.tidyverse.org/)
Cairo (https://cran.r-project.org/web/packages/Cairo/index.html)
If these libraries are not installed, the tool will attempt to install them.
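The install-if-missing step can be pictured with the short Python sketch below, which builds the R lines that such a generated script might start with. It is an illustration only: the helper name r_install_block, the package list constant, and the CRAN mirror URL are assumptions, and the R code that Busco_Plot actually writes may differ.

```python
# Hypothetical helper; not taken from Busco_Plot itself.
R_PACKAGES = ["ggplot2", "dplyr", "tidyr", "forcats", "Cairo"]


def r_install_block(packages, cran_mirror="https://cloud.r-project.org"):
    """Return R source that installs each missing package, then loads it.

    The mirror URL is an assumption; a non-interactive R session needs
    some repository to be set before install.packages() can run.
    """
    lines = []
    for pkg in packages:
        # requireNamespace() returns FALSE when the package is not installed.
        lines.append(
            f'if (!requireNamespace("{pkg}", quietly = TRUE)) '
            f'install.packages("{pkg}", repos = "{cran_mirror}")'
        )
        lines.append(f"library({pkg})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(r_install_block(R_PACKAGES))
```

Running the sketch prints two R lines per package, which could be pasted at the top of an R script to check the environment by hand.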
To use this module, place all BUSCO short summary files (short_summary.[generic|specific].dataset.label.txt) in a single directory and pass
the PATH of that directory to this module.
The resulting plots will be written in the same directory where the short summary files are present.
You can find both the resulting R script for customisation (if so desired) and the resulting figure in the format requested in the specified directory.
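To check which files in a directory follow the short_summary.[generic|specific].dataset.label.txt naming scheme before running the module, a sketch along these lines may help. The function name find_short_summaries and the returned dictionary keys are hypothetical, and the sketch assumes the dataset name itself contains no dots; it is not the file discovery code used by Busco_Plot.

```python
import glob
import os


def find_short_summaries(working_directory, run_type):
    """List short summary files of one run type ('generic' or 'specific').

    Hypothetical helper: assumes names of the form
    short_summary.<run_type>.<dataset>.<label>.txt, with no dots in <dataset>.
    """
    prefix = f"short_summary.{run_type}."
    suffix = ".txt"
    pattern = os.path.join(working_directory, prefix + "*" + suffix)
    found = []
    for path in sorted(glob.glob(pattern)):
        name = os.path.basename(path)
        # Keep only the "<dataset>.<label>" part between prefix and suffix.
        middle = name[len(prefix):-len(suffix)]
        dataset, _, label = middle.partition(".")
        found.append({"path": path, "dataset": dataset, "label": label})
    return found


if __name__ == "__main__":
    for entry in find_short_summaries(".", "specific"):
        print(entry["dataset"], entry["label"], entry["path"])
```

An empty result is a hint that the files are in a different directory or were produced with the other run type.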
MAIN DEPENDENCY: R (https://www.r-project.org/)
####################################################################################################
required arguments:
-wd PATH, --working_directory PATH
Define the location of the working directory where the Busco files are located
-l {Eukaryotic Lineage,Metazoan Lineage}, --lineage {Eukaryotic Lineage,Metazoan Lineage}
Define the lineage used when running Busco. Default is 'Eukaryotic Lineage'. Choose between 'Eukaryotic Lineage' or 'Metazoan Lineage'.
optional arguments:
-f {jpeg,png,tiff}, --file_type {jpeg,png,tiff}
Select the desired output file format: 'jpeg' (default), 'png', or 'tiff'
-rt RUN_TYPE, --run_type RUN_TYPE
Type of summary to use: `generic` or `specific`
--no_r Do not run R; only write the R script file to the working directory
-q, --quiet Disable info logs and display only errors
-h, --help Show this help message and exit
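The options above map naturally onto Python's argparse module. The sketch below reconstructs a parser from the help text alone; the function name build_parser is hypothetical, the 'jpeg' default comes from the synopsis, no default is assumed for --run_type, and the real script may define its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (a reconstruction)."""
    parser = argparse.ArgumentParser(prog="Busco_Plot.v1.0.0.py", add_help=False)

    required = parser.add_argument_group("required arguments")
    required.add_argument("-wd", "--working_directory", metavar="PATH",
                          required=True,
                          help="Directory holding the BUSCO short summary files")
    required.add_argument("-l", "--lineage",
                          choices=["Eukaryotic Lineage", "Metazoan Lineage"],
                          default="Eukaryotic Lineage",
                          help="Lineage used when running BUSCO")

    optional = parser.add_argument_group("optional arguments")
    optional.add_argument("-f", "--file_type", choices=["jpeg", "png", "tiff"],
                          default="jpeg", help="Output file format")
    optional.add_argument("-rt", "--run_type", metavar="RUN_TYPE",
                          help="Type of summary to use: generic or specific")
    optional.add_argument("--no_r", action="store_true",
                          help="Only write the R script; do not run R")
    optional.add_argument("-q", "--quiet", action="store_true",
                          help="Display only errors")
    optional.add_argument("-h", "--help", action="help",
                          help="Show this help message and exit")
    return parser


if __name__ == "__main__":
    # Example argument list; the path is a placeholder.
    args = build_parser().parse_args(
        ["-wd", "busco_summaries", "-l", "Metazoan Lineage", "-f", "png", "--no_r"]
    )
    print(args)
```

Because the lineage values contain a space, they must be quoted on a shell command line, for example -l "Metazoan Lineage".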
Distributor ID: Apple, Inc.
Description: Apple M1 Max
Release: 14.4.1
Codename: Sonoma
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.
R - ggplot2 (https://github.com/tidyverse/ggplot2)
Package: ggplot2
Version: 3.5.0
Title: Create Elegant Data Visualisations Using the Grammar of Graphics
Type: Package
R - dplyr (https://cran.r-project.org/web/packages/dplyr/readme/README.html)
Package: dplyr
Title: A Grammar of Data Manipulation
Version: 1.1.4
R - tidyr (https://tidyr.tidyverse.org/)
Package: tidyr
Title: Tidy Messy Data
Version: 1.3.1
R - Cairo (https://cran.r-project.org/web/packages/Cairo/index.html)
Package: Cairo
Version: 1.6-2
Title: R Graphics Device using Cairo Graphics Library for Creating
High-Quality Bitmap (PNG, JPEG, TIFF), Vector (PDF, SVG,
PostScript) and Display (X11 and Win32) Output
R - forcats (https://forcats.tidyverse.org/)
Package: forcats
Title: Tools for Working with Categorical Variables (Factors)
Version: 1.0.0