jyyulab/bulkRNAseq_quantification_pipeline

Bulk RNA-seq Quantification Pipeline 2025

Overview

[Figure: overview of the pipeline]

This pipeline is designed to accurately quantify gene and transcript abundance from bulk RNA-seq data. By integrating both alignment-free and alignment-based methods, it enables cross-validation to ensure robust and reliable quantification results.

As illustrated above, the pipeline consists of three stages:

1. Preprocessing

The pipeline accepts raw input files in various formats (e.g., FASTQ, BAM/SAM) and converts them into FASTQ files that are standardized in format and cleaned in sequence. These preprocessed files are ready for downstream quantification.
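As a rough illustration of the first decision in this stage, the sketch below guesses the input format from a file name and reports whether conversion to FASTQ would be needed. The suffix table and the function names `detect_input_format` and `needs_conversion` are hypothetical; this README does not describe how the pipeline itself detects formats.

```python
from pathlib import Path

# Hypothetical suffix-to-format table (an assumption for illustration only;
# the pipeline's real detection logic is not documented in this README).
FORMAT_BY_SUFFIX = {
    ".fastq": "FASTQ",
    ".fq": "FASTQ",
    ".bam": "BAM",
    ".sam": "SAM",
}


def detect_input_format(filename: str) -> str:
    """Guess the input format from the file name, ignoring a trailing .gz."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]  # look through gzip compression
    if not suffixes or suffixes[-1] not in FORMAT_BY_SUFFIX:
        raise ValueError(f"cannot determine input format of {filename!r}")
    return FORMAT_BY_SUFFIX[suffixes[-1]]


def needs_conversion(filename: str) -> bool:
    """Return True when the input must be converted to FASTQ first."""
    return detect_input_format(filename) != "FASTQ"


if __name__ == "__main__":
    for name in ["sample1_R1.fastq.gz", "sample2.bam", "sample3.sam"]:
        print(name, detect_input_format(name), needs_conversion(name))
```

A name-based check is only a first pass; a stricter version would inspect the file contents rather than trust the extension.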

2. Quantification

In this stage, the pipeline quantifies the abundance of both genes and transcripts. It supports three well-established and widely used quantifiers:

  • Salmon: An alignment-free quantifier known for its wicked-fast speed and comparable accuracy.

  • RSEM: An alignment-based quantifier with exceptional accuracy. It has been used as a gold standard in many benchmarking studies.

  • STAR: An alignment-based quantifier built on splice-aware alignment. It is the tool used by the GDC mRNA quantification analysis pipeline.
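The cross-validation idea behind running several quantifiers can be sketched as a correlation between two sets of abundance estimates for the same sample. The example below is a minimal sketch, assuming each quantifier's output has already been parsed into a dictionary mapping gene IDs to TPM values. The function names, gene IDs, and numbers are made up for illustration, and the choice of Pearson correlation on log2(TPM + 1) is an assumption, not necessarily the metric the pipeline reports.

```python
import math


def log2p1(values):
    """Apply log2(x + 1) so highly expressed genes do not dominate."""
    return [math.log2(v + 1.0) for v in values]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def cross_validate(tpm_a, tpm_b):
    """Correlate log-scaled TPMs over the genes both quantifiers report."""
    shared = sorted(set(tpm_a) & set(tpm_b))
    return pearson(
        log2p1([tpm_a[g] for g in shared]),
        log2p1([tpm_b[g] for g in shared]),
    )


if __name__ == "__main__":
    # Toy values only; not real quantifier output.
    quantifier_a = {"geneA": 0.0, "geneB": 10.0, "geneC": 100.0}
    quantifier_b = {"geneA": 0.5, "geneB": 12.0, "geneC": 90.0}
    print(round(cross_validate(quantifier_a, quantifier_b), 3))
```

A correlation near 1 suggests the two methods agree for that sample; a low value would flag the sample for closer inspection.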

3. Summarization

The pipeline generates a comprehensive HTML report for each sample, detailing quantification results, alignment statistics, correlation analyses, gene body coverage visualizations, and more. For multiple samples, it produces a unified summary report and a master gene expression matrix covering all samples, which can be used directly in downstream analyses such as NetBID.
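The master gene expression matrix can be pictured as a genes-by-samples table assembled from per-sample results. The sketch below shows one way such a merge could work, assuming each sample's quantification is available as a dictionary of gene ID to expression value. The function name `build_expression_matrix`, the `gene_id` column label, and the choice to fill missing genes with 0.0 are all assumptions for illustration, not the pipeline's documented behavior.

```python
def build_expression_matrix(per_sample, fill=0.0):
    """Merge per-sample {gene: value} dicts into a header and matrix rows.

    Rows are genes and columns are samples, both in sorted order. Genes
    absent from a sample receive `fill` (an assumption of this sketch).
    """
    samples = sorted(per_sample)
    genes = sorted({g for values in per_sample.values() for g in values})
    header = ["gene_id"] + samples
    rows = [
        [gene] + [per_sample[s].get(gene, fill) for s in samples]
        for gene in genes
    ]
    return header, rows


if __name__ == "__main__":
    # Toy values only; sample and gene names are hypothetical.
    results = {
        "sample1": {"geneA": 5.0, "geneB": 0.0},
        "sample2": {"geneA": 7.5, "geneC": 2.0},
    }
    header, rows = build_expression_matrix(results)
    print("\t".join(header))
    for row in rows:
        print("\t".join(str(cell) for cell in row))
```

Sorting samples and genes keeps the output deterministic, which makes matrices from repeated runs easy to compare.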

Tutorial

A detailed tutorial on setting up and running this pipeline is available at https://jyyulab.github.io/bulkRNAseq_quantification_pipeline/.
