DEIB-GECO/BioGAN

BioGAN: Enhancing Transcriptomic Data Generation with Biological Knowledge

BioGAN is a novel generative framework that integrates Graph Neural Networks (GNNs) into a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) to generate biologically realistic synthetic transcriptomics data. The model leverages known gene-gene relationships to guide the generation process, aiming to improve the biological plausibility, fairness, and privacy of synthetic omics data.
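To make the graph-aware idea concrete, here is a minimal NumPy sketch of a single graph convolution over a gene-gene graph, using the standard symmetric normalization of Kipf and Welling's GCN. It is an illustration only, not the code in `src/BioGAN_GCN/`: the toy adjacency matrix, the embedding sizes, and the function names `normalized_adjacency` and `gcn_layer` are all assumptions made for this example.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric GCN normalization."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so each gene keeps its own signal
    deg = a_hat.sum(axis=1)              # degrees are >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution: mix neighbor embeddings, project with w, apply ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

# Hypothetical 4-gene chain graph (gene 0 - gene 1 - gene 2 - gene 3).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))   # one 8-dimensional embedding per gene
w = rng.normal(size=(8, 3))   # layer weights, random here instead of learned

a_norm = normalized_adjacency(adj)
out = gcn_layer(a_norm, h, w)
print(out.shape)
```

In a generator, a layer like this would let each gene's representation depend on its known interaction partners before expression values are produced; how BioGAN wires this into the WGAN-GP generator is defined by the repository code, not by this sketch.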

This project accompanies the manuscript:

"BioGAN: Enhancing Transcriptomic Data Generation with Biological Knowledge", by Francesca Pia Panaccione, Sofia Mongardi, Marco Masseroli, and Pietro Pinoli (under review)


🧬 Motivation

Modern AI tools in genomics are limited by data scarcity, heterogeneity, acquisition costs, and privacy regulations (e.g., GDPR, HIPAA). BioGAN addresses these issues by:

  • Synthesizing realistic transcriptomic profiles conditioned on biological structure.
  • Incorporating prior biological knowledge through GNNs to improve fidelity.
  • Providing a versatile architecture adaptable to various omics settings.

Key Features

  • Graph-aware generator: Integrates GNN layers to model regulatory interactions.
  • High-dimensional support: Designed for transcriptomic data with tens of thousands of features.
  • Robust validation: Multi-faceted evaluation pipeline including:
    • Classification accuracy on downstream tasks
    • Distributional similarity metrics (e.g., Wasserstein distance)
    • Feature-level consistency checks
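As an illustration of a distributional similarity check, the sketch below computes a per-gene 1-D Wasserstein-1 distance between a real and a synthetic expression matrix. It assumes both matrices have the same number of samples, in which case the distance reduces to the mean absolute difference between sorted values. The function name `per_gene_wasserstein` and the toy data are hypothetical and are not taken from `src/metrics/`.

```python
import numpy as np

def per_gene_wasserstein(real, synth):
    """Wasserstein-1 distance per gene (column) for equally sized sample sets.

    For two empirical 1-D distributions with the same number of points,
    W1 equals the mean absolute difference between the sorted samples.
    """
    if real.shape != synth.shape:
        raise ValueError("this sketch assumes matrices of identical shape")
    return np.abs(np.sort(real, axis=0) - np.sort(synth, axis=0)).mean(axis=0)

rng = np.random.default_rng(42)
real = rng.normal(size=(200, 5))            # 200 samples x 5 genes (toy data)

# Toy "synthetic" data: the real samples in shuffled order,
# with gene 0 shifted by 0.5 to simulate a distribution mismatch.
synth = real[rng.permutation(200)].copy()
synth[:, 0] += 0.5

dist = per_gene_wasserstein(real, synth)
print(np.round(dist, 3))
```

Gene 0 should show a distance of about 0.5, while the other genes, whose values were only reordered, should be at zero. Averaging the per-gene distances gives one simple summary score.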

📁 Repository Structure

src/
├── BIOGAN_H2GCN/          # Experimental variant of BioGAN using the H2GCN architecture
├── BioGAN_GCN/            # Core BioGAN model implementation based on GCN
├── wpgan/                 # Implementation of Wasserstein GAN with Gradient Penalty
├── metrics/               # Evaluation metrics for synthetic data quality
├── utils/                 # Utility functions (e.g., graph processing, logging)
├── data_loader.py         # Scripts for preprocessing transcriptomic data
├── losses.py              # Custom loss functions (Wasserstein, gradient penalty, etc.)
├── train_model.py         # Main training script for the BioGAN model
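For readers unfamiliar with the losses named above, here is a minimal sketch of the standard WGAN-GP critic objective. It assumes PyTorch, which this page does not confirm as the project's framework, and the names `gradient_penalty`, `critic_loss`, and `linear_critic` are hypothetical rather than the contents of `losses.py`.

```python
import torch

def gradient_penalty(critic, real, fake):
    """Penalize deviation of the critic's gradient norm from 1 on interpolated samples."""
    eps = torch.rand(real.size(0), 1, device=real.device)   # one mixing weight per sample
    interp = (eps * real.detach() + (1 - eps) * fake.detach()).requires_grad_(True)
    scores = critic(interp)
    (grads,) = torch.autograd.grad(scores.sum(), interp, create_graph=True)
    return ((grads.norm(2, dim=1) - 1.0) ** 2).mean()

def critic_loss(critic, real, fake, lam=10.0):
    """Wasserstein critic loss plus the gradient penalty weighted by lam."""
    wasserstein = critic(fake).mean() - critic(real).mean()
    return wasserstein + lam * gradient_penalty(critic, real, fake)

# Toy check with a fixed linear critic D(x) = x @ w (a stand-in, not a trained network).
# Its gradient with respect to x is w everywhere, so the gradient norm is ||w|| = 5.
w = torch.tensor([3.0, 4.0])

def linear_critic(x):
    return x @ w

torch.manual_seed(0)
real = torch.randn(16, 2)
fake = torch.randn(16, 2)

gp = gradient_penalty(linear_critic, real, fake)
loss = critic_loss(linear_critic, real, fake)
print(gp.item(), loss.item())
```

With this linear critic the penalty should be close to (5 - 1)^2 = 16 regardless of the inputs, which makes it a convenient sanity check before plugging in a real critic network.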
