MultiOmics-Cancer-Classification

A pipeline for dimensionality reduction and resampling of TCGA data, developed for the study "Evaluation of machine learning-based cancer classification techniques using multi-omics data."

Description

The provided R Markdown files and Python scripts offer a comprehensive pipeline for transforming and preparing multi-omics data sets, including genomic (copy number), transcriptomic (gene expression), and epigenomic (DNA methylation) data.

R Markdown files that apply PCA to the cleaned TCGA data
R Markdown files that apply Tomek Links and Near Miss undersampling to address class imbalance
An alternative Python script for Tomek Links
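
For readers unfamiliar with Tomek Links, the idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the script shipped in this repository: the function names (`nearest_neighbor`, `tomek_link_undersample`) and the toy data are hypothetical, distances are assumed to be Euclidean, and only members of the single most frequent class are removed. A Tomek link is a pair of samples from different classes that are each other's nearest neighbor; dropping the majority-class member of each pair cleans up the class boundary.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbor(i, X):
    """Return the index of the sample closest to X[i], excluding i itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: dist(X[i], X[j]))


def tomek_link_undersample(X, y):
    """Remove the majority-class member of every Tomek link.

    Assumption for this sketch: "majority" means the single most
    frequent label in y; all other classes are left untouched.
    """
    majority = Counter(y).most_common(1)[0][0]
    nn = [nearest_neighbor(i, X) for i in range(len(X))]

    drop = set()
    for i, j in enumerate(nn):
        # Mutual nearest neighbors with different labels form a Tomek link.
        if nn[j] == i and y[i] != y[j] and y[i] == majority:
            drop.add(i)

    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (hypothetical): four "A" samples, two "B" samples.
# The "A" sample at (3.0, 3.0) sits right next to a "B" sample.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.2, 3.0), (6.0, 6.0)]
y = ["A", "A", "A", "A", "B", "B"]

X_res, y_res = tomek_link_undersample(X, y)
print(Counter(y_res))
```

The pairwise search above is quadratic in the number of samples, which is fine for illustration but slow on full TCGA matrices; a nearest-neighbor index or an established resampling library is the more practical choice at that scale.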

Dataset

The cleaned TCGA data is available here or as a Hugging Face Dataset.
