A list of R environment based tools for marker gene microbiome data exploration, statistical analysis and visualization

joshualiuxu/Tools-Microbiome-Analysis

 
 


A list of R environment based tools for 16S rRNA gene data exploration, statistical analysis and visualization

For a beginner, the entire process from sample collection to the analysis of sequencing data is a daunting task. In particular, the downstream processing of raw reads is the most time-consuming and mentally draining stage. It is vital to understand the basic concepts of microbial ecology and then to use the various tools at one's disposal to address specific research questions. Thankfully, many young researchers, supported by their experienced principal investigators and supervisors, are creating tools for the analysis and interpretation of microbial community data. A major achievement of the scientific community is the open science initiative, which has led to the sharing of knowledge worldwide.

For microbial community analysis, several tools have been created in R, a free-to-use programming language (GNU General Public License) (R Core Team, 2000). The power of R lies in its accessibility to people who lack programming skills and in the easy sharing of analysis scripts, code and packages, which aids reproducibility. Using tools such as QIIME (or the newer QIIME2) (Caporaso, Kuczynski, Stombaugh et al., 2010), Mothur (Schloss, Westcott, Ryabin et al., 2009) and DADA2 (Callahan, McMurdie, Rosen et al., 2016), one can get from raw reads to a species × samples table of OTUs or, as suggested more recently, ASVs (amplicon sequence variants) (Callahan, McMurdie & Holmes, 2017).

This post lists numerous resources that can be helpful for the analysis of microbiome data. The list may not include every package, as this tool development space is ever growing. Feel free to add packages or links to web tutorials related to microbiome data; a Google Docs spreadsheet listing the tools is available at this link and can be edited to include more. The tools listed are mostly for improving statistical analysis and visualisation. They provide convenient options for data analysis and include several steps where the user has to make decisions. The works by McMurdie PJ and Holmes S, Weiss S, and Tsilimigras MC and Fodor AA are useful resources for understanding the data typical of a microbiome census. It can be tricky and frustrating in the beginning, but patience and perseverance will be fruitful in the end (personal experience).
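To make the species × samples table mentioned above more concrete, here is a minimal sketch in base R. The taxon names, sample names and counts are invented for illustration; in practice the table would come from a pipeline such as DADA2, QIIME or Mothur, and the packages listed below offer tested implementations of these summaries.

```r
# Toy species x samples count table (rows = taxa, columns = samples).
# All names and counts are made up for illustration only.
counts <- matrix(
  c(10,  0,  5,
    20, 30,  5,
    70, 70, 90),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("ASV1", "ASV2", "ASV3"),
    c("sampleA", "sampleB", "sampleC")
  )
)

# Library size: total number of reads per sample.
lib_size <- colSums(counts)

# Relative abundance: divide each column by its library size.
rel_abund <- sweep(counts, 2, lib_size, "/")

# Shannon diversity per sample, H = -sum(p * log(p)), skipping absent taxa.
shannon <- apply(rel_abund, 2, function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
})

# Observed richness: number of taxa with at least one read in each sample.
richness <- colSums(counts > 0)

print(round(rel_abund, 2))
print(shannon)
print(richness)
```

Many of the decisions the tools ask the user to make, such as how to normalise for uneven library sizes, start from a table of this shape.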

Tools:

  1. Ampvis2: Tools for visualising amplicon sequencing data
  2. CCREPE: Compositionality Corrected by REnormalization and PErmutation
  3. DADA2: Divisive Amplicon Denoising Algorithm
  4. DESeq2: Differential expression analysis for sequence count data
  5. edgeR: Empirical analysis of digital gene expression (DGE) data in R
  6. mare: Microbiota Analysis in R Easily
  7. Metacoder: An R package for visualization and manipulation of community taxonomic diversity data
  8. metagenomeSeq: Differential abundance analysis for microbial marker-gene surveys
  9. microbiome R package: Tools for microbiome analysis in R
  10. MINT: Multivariate INTegrative method
  11. mixDIABLO: Data Integration Analysis for Biomarker discovery using Latent variable approaches for 'Omics studies
  12. mixMC: Multivariate Statistical Framework to Gain Insight into Microbial Communities
  13. MMinte: Methodology for the large-scale assessment of microbial metabolic interactions (MMinte) from 16S rDNA data
  14. pathostat: Statistical Microbiome Analysis on metagenomics results from sequencing data samples
  15. phylofactor: Phylogenetic factorization of compositional data
  16. phylogeo: Geographic analysis and visualization of microbiome data
  17. Phyloseq: Import, share, and analyze microbiome census data using R
  18. qiimer: R tools that complement QIIME
  19. RAM: R for Amplicon-Sequencing-Based Microbial-Ecology
  20. ShinyPhyloseq: Web tool with a user interface for Phyloseq
  21. SigTree: Identify and Visualize Significantly Responsive Branches in a Phylogenetic Tree
  22. SPIEC-EASI: Sparse and Compositionally Robust Inference of Microbial Ecological Networks
  23. structSSI: Simultaneous and Selective Inference for Grouped or Hierarchically Structured Data
  24. Tax4Fun: Predicting functional profiles from metagenomic 16S rRNA gene data
  25. taxize: Taxonomic Information from Around the Web
  26. labdsv: Ordination and Multivariate Analysis for Ecology
  27. Vegan: R package for community ecologists
  28. igraph: Network Analysis and Visualization in R
  29. MicrobiomeHD: A standardized database of case-control studies of the human gut microbiome in health and disease
  30. Rhea: A pipeline with modular R scripts
  31. microbiomeutilities: Extending and supporting package based on the microbiome and phyloseq R packages
  32. breakaway: Species Richness Estimation and Modeling

Google doc link

Useful resources are provided by:

  1. Ben J. Callahan and colleagues: Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses.
  2. Comeau AM and colleagues: Microbiome Helper: a Custom and Streamlined Workflow for Microbiome Research
  3. Shetty SA, Lahti L, et al.: Tutorial from the microbiome data analysis spring school 2018, Wageningen University and Research

Note: It is good practice to use R Markdown to document your results and share them with your collaborators and supervisors. See also an introduction to RStudio and an RStudio overview.

View this website repository on GitHub
Follow me on Twitter
Google Scholar
ORCID ID: 0000-0001-7280-9915

References:

  1. Callahan, B. J., McMurdie, P. J. & Holmes, S. P. (2017). Exact sequence variants should replace operational taxonomic units in marker gene data analysis. bioRxiv, 113597.
  2. Callahan, B. J., McMurdie, P. J., Rosen, M. J., Han, A. W., Johnson, A. J. A. & Holmes, S. P. (2016). DADA2: high-resolution sample inference from Illumina amplicon data. Nature Methods 13, 581-583.
  3. Caporaso, J. G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F. D., Costello, E. K., Fierer, N., Peña, A. G., Goodrich, J. K. & Gordon, J. I. (2010). QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7, 335-336.
  4. Schloss, P. D., Westcott, S. L., Ryabin, T., Hall, J. R., Hartmann, M., Hollister, E. B., Lesniewski, R. A., Oakley, B. B., Parks, D. H. & Robinson, C. J. (2009). Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology 75, 7537-7541.
  5. R Core Team (2000). R language definition. Vienna, Austria: R Foundation for Statistical Computing.

Was this website/resource useful for you? Then please share it with others too!

You can cite this resource as:
Shetty SA and Lahti L (2018). A list of R environment based tools for 16S rRNA gene data exploration, statistical analysis and visualization. DOI
