The pipeline for dimensionality reduction and resampling of TCGA data, for the research on "Evaluation of machine learning-based cancer classification techniques using multi-omics data."

YytRecg/MultiOmics-Cancer-Classification

MultiOmics-Cancer-Classification

The pipeline for dimensionality reduction and resampling of TCGA data, for the research on "Evaluation of machine learning-based cancer classification techniques using multi-omics data."

Description

The provided R Markdown files and Python scripts form a pipeline for transforming and preparing multi-omics data sets that combine genomic (copy number), transcriptomic (gene expression), and epigenomic (DNA methylation) data. The repository contains:

  • R Markdown files that apply PCA to the cleaned TCGA data for dimensionality reduction
  • R Markdown files that apply Tomek Links and Near Miss undersampling to address class imbalance
  • An alternative Python script for Tomek Links
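
To illustrate the resampling step listed above, here is a minimal sketch of Tomek Links undersampling in plain Python. It is not the repository's script: the function names (`nearest_neighbor`, `tomek_links`, `remove_tomek_links`), the toy coordinates, and the `"normal"`/`"tumor"` labels are all hypothetical, and it assumes Euclidean distance and a single known majority class. A Tomek link is a pair of samples from different classes that are each other's nearest neighbor; the sketch drops the majority-class member of each such pair.

```python
from math import dist


def nearest_neighbor(i, X):
    """Index of the sample closest to X[i] (Euclidean distance), excluding i."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: dist(X[i], X[j]))


def tomek_links(X, y):
    """Return pairs (i, j), i < j, that are mutual nearest neighbors
    and carry different class labels."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def remove_tomek_links(X, y, majority_label):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for pair in tomek_links(X, y) for k in pair
            if y[k] == majority_label}
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


# Hypothetical toy data: four "normal" samples (majority), two "tumor" samples.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.3), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0)]
y = ["normal", "normal", "normal", "normal", "tumor", "tumor"]

links = tomek_links(X, y)
X_res, y_res = remove_tomek_links(X, y, majority_label="normal")
print(links)   # index pairs forming Tomek links
print(y_res)   # labels remaining after undersampling
```

In this toy set, the only cross-class mutual nearest neighbors are the samples at `(1.0, 1.0)` and `(1.05, 1.0)`, so the `"normal"` sample at `(1.0, 1.0)` is the one removed. The brute-force neighbor search is quadratic in the number of samples; for data at TCGA scale, a library implementation such as `TomekLinks` in imbalanced-learn would likely be a better fit.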

Dataset

The cleaned TCGA data is available here or as a Hugging Face Dataset.
