Hist2Cell: Fine-grained Cell Type Prediction from Histology Images 🧬🔬

Hist2Cell is a Vision Graph-Transformer framework that predicts fine-grained cell type abundances directly from histology images, enabling cost-efficient, high-resolution cellular mapping of tissues.

Overview

Predicting cellular compositions from histology images using deep learning


📖 What is Hist2Cell?

Hist2Cell is a computational framework for spatial biology analysis. Instead of requiring expensive spatial transcriptomics sequencing, our framework can predict cellular compositions directly from standard histology images.

🎯 Key Innovation

  • 💰 Cost-Effective: Eliminates need for expensive spatial sequencing
  • 🔬 High Resolution: Achieves finer spatial detail than traditional methods
  • ⚡ Fast Analysis: Real-time prediction from histology images
  • 🌍 Broad Applicability: Works across different tissue types and diseases

🧠 How It Works

Hist2Cell combines three established AI approaches:

  1. 🖼️ Computer Vision (ResNet18): Analyzes tissue morphology from histology images
  2. 🕸️ Graph Neural Networks (GAT): Models spatial relationships between tissue regions
  3. 🔄 Vision Transformers: Captures global tissue context and patterns
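To make the graph component concrete, here is a minimal sketch of how tissue spots might be connected into a spatial neighbor graph before a graph attention layer is applied. It is an illustration under stated assumptions, not Hist2Cell's actual graph construction: the k-nearest-neighbor rule, the function name `knn_edges`, and the toy coordinates are all hypothetical. The resulting directed edge list is the kind of input a GAT layer consumes, with each spot's image features (for example, from ResNet18) attached as node features.

```python
import math


def knn_edges(coords, k=2):
    """Connect each spot to its k nearest spots (Euclidean distance).

    coords: list of (x, y) spot centers (hypothetical input format).
    Returns a sorted list of directed edges (spot, neighbor).
    """
    edges = []
    for i, (xi, yi) in enumerate(coords):
        # Distances from spot i to every other spot.
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        ]
        # Keep the k closest; ties are broken by spot index.
        for _, j in sorted(dists)[:k]:
            edges.append((i, j))
    return sorted(edges)


if __name__ == "__main__":
    # Toy example: four spots on a line (made-up coordinates).
    spots = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    print(knn_edges(spots, k=1))
```

The loop compares every pair of spots, which is fine for a toy example but scales quadratically; for slides with thousands of spots a spatial index would be the more practical choice.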

🚀 Quick Start Guide

Prerequisites

  • Operating System: Ubuntu 22.04.4 LTS (recommended) or similar Linux distribution
  • Hardware: GPU with 8GB+ VRAM (16GB+ recommended)
  • Python: 3.11
  • CUDA: NVIDIA GPU with CUDA support (the PyTorch install command in Step 3 uses CUDA 11.8 wheels)

🔧 Installation

Step 1: Create Conda Environment

# Create a new conda environment
conda create -n Hist2Cell python=3.11
conda activate Hist2Cell

Step 2: Clone Repository

git clone https://github.com/Weiqin-Zhao/Hist2Cell.git
cd Hist2Cell

Step 3: Install Dependencies

# Install basic requirements
pip install -r requirements.txt

# Install PyTorch with CUDA support
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

# Install PyTorch Geometric
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.0.0+cu118.html

Step 4: Verify Installation

python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
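For a slightly fuller check than the one-liner above, the following sketch reports whether the interpreter is Python 3.11 and whether the packages from Step 3 can be found by the import system. It uses only the standard library; the helper names `check_python` and `find_missing` are illustrative and not part of Hist2Cell, and the module list is an assumption based on the install commands (`torch`, `torchvision`, `torch_geometric`).

```python
import importlib.util
import sys

# Import names of packages installed in Step 3 (assumed from the pip commands).
REQUIRED_MODULES = ["torch", "torchvision", "torch_geometric"]


def check_python(expected=(3, 11), version_info=None):
    """Return True if the interpreter matches the expected major.minor version."""
    info = sys.version_info if version_info is None else version_info
    return (info[0], info[1]) == tuple(expected)


def find_missing(modules):
    """Return the top-level modules that the import system cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python 3.11:", check_python())
    missing = find_missing(REQUIRED_MODULES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

This only confirms that the packages are discoverable; it does not import them, so the CUDA check from Step 4 is still worth running.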

📚 Comprehensive Tutorial System

We provide step-by-step tutorials for different skill levels and use cases.

πŸŽ“ Learning Path for Beginners

graph TD
    A[📖 Start Here: Read README] --> B[🔧 Environment Setup]
    B --> C[📊 Data Preparation Tutorial]
    C --> D[🧠 Understanding Hist2Cell Architecture]
    D --> E[📈 Choose Your Path]
    E --> F[🔬 Research Path]
    E --> G[💻 Development Path]
    F --> I[🎨 Visualization Tutorials]
    G --> J[🚀 Training Tutorial]

📋 Tutorial Overview

| Tutorial | Purpose | Skill Level | Time Required | Key Outcomes |
|---|---|---|---|---|
| 🔧 Data Preparation | Learn to process your own spatial transcriptomics data | Beginner | 1-2 hours | Understand data pipeline, prepare custom datasets |
| 🚀 Model Training | Train Hist2Cell on your own data | Intermediate | 2-4 hours | Custom model training, hyperparameter tuning |
| 🎨 Cell Abundance Visualization | Create publication-quality spatial plots | Beginner | 30-60 min | Generate visualization of cell distributions |
| 🕸️ Cell Colocalization Analysis | Analyze spatial relationships between cell types | Advanced | 1-2 hours | Spatial statistics, colocalization patterns |
| 🔍 Super-Resolution Prediction | Generate enhanced resolution cell maps | Advanced | 1-2 hours | 2× resolution enhancement |

🗂️ Project Structure

Hist2Cell/
├── 📁 tutorial_data_preparation/          # Data processing tutorials
│   └── data_preparation_tutorial.ipynb    # Complete data pipeline guide
├── 📁 tutorial_training/                  # Model training resources
│   └── training_tutorial.ipynb            # Comprehensive training guide
├── 📁 tutorial_analysis_evaluation/       # Analysis and evaluation tutorials
│   ├── cell_abundance_visulization_tutorial.ipynb      # Spatial visualization
│   ├── cell_colocalization_tutorial.ipynb              # Spatial relationships
│   └── super_resovled_cell_abundance_tutorial.ipynb    # Super-resolution analysis
├── 📁 model_weights/                      # Pre-trained model checkpoints
├── 📁 example_data/                       # Example datasets and demonstrations
│   ├── humanlung_cell2location/          # Standard resolution data
│   ├── humanlung_cell2location_2x/       # Super-resolution data
│   └── example_raw_data/                 # Raw data examples
├── 📁 model/                             # Core model architecture
├── 📄 requirements.txt                   # Python dependencies
└── 📖 README.md                          # This comprehensive guide

📊 Datasets and Resources

🗄️ Supported Datasets

| Dataset | Tissue Type | Use Case | Availability | Tutorial Coverage |
|---|---|---|---|---|
| Human Lung | Healthy lung tissue | Primary examples, tutorials | ✅ Provided | All tutorials |
| HER2ST | Breast cancer | Disease applications | 🔗 External | Advanced usage |
| STNet | Various tissues | Method validation | 🔗 External | Custom training |
| TCGA | Cancer samples | Clinical applications | 🔗 External | Research projects |
| HEST-1k | Multiple organs | Large-scale analysis | 🔗 External | Scalability studies |

📥 Data Download and Setup

We provide processed example data for tutorials and demonstrations. The original datasets are from the published sources listed above, but we've prepared processed versions for direct use with Hist2Cell.

# Download the processed data archive from our OneDrive link
# (the link is listed in example_data/README.md)

# After downloading, extract the archive, replacing the placeholder with the actual file name:
tar -xzvf [downloaded_file.tar.gz]

# Verify example data structure
ls example_data/
# Should show: humanlung_cell2location/, humanlung_cell2location_2x/, example_raw_data/

We provide:

  • Processed example data of the healthy lung dataset in ./example_data/humanlung_cell2location
  • Super-resolved cell abundance data in ./example_data/humanlung_cell2location_2x
  • Example raw data in ./example_data/example_raw_data
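The expected layout can also be checked programmatically. The sketch below, using only `pathlib`, reports which of the three example-data folders listed above are missing. `missing_example_dirs` is a hypothetical helper, not part of the repository, and it checks only that the directories exist, not what they contain.

```python
from pathlib import Path

# Subdirectories this README lists under example_data/.
EXPECTED_DIRS = (
    "humanlung_cell2location",
    "humanlung_cell2location_2x",
    "example_raw_data",
)


def missing_example_dirs(root="example_data", expected=EXPECTED_DIRS):
    """Return the expected subdirectories that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_example_dirs()
    if missing:
        print("Missing example data folders:", ", ".join(missing))
    else:
        print("All example data folders found.")
```

Run it from the repository root so that the relative `example_data` path resolves correctly.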

For users who want to process their own datasets, we provide detailed tutorials in ./tutorial_data_preparation/data_preparation_tutorial.ipynb.


⚑ Quick Demo

Want to see Hist2Cell in action immediately? Run this quick demonstration:

# Navigate to visualization tutorial
cd tutorial_analysis_evaluation/

# Launch Jupyter notebook
jupyter notebook cell_abundance_visulization_tutorial.ipynb

# Follow the step-by-step guide to generate your first spatial cell map!

This will generate publication-quality visualizations in under an hour.


📄 Citation

If you use Hist2Cell in your research, please cite our work:

@article{zhao2024hist2cell,
  title={Hist2Cell: Deciphering Fine-grained Cellular Architectures from Histology Images},
  author={Zhao, Weiqin and Liang, Zhuo and Huang, Xianjie and Huang, Yuanhua and Yu, Lequan},
  journal={bioRxiv},
  pages={2024--02},
  year={2024},
  publisher={Cold Spring Harbor Laboratory}
}

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • 🏫 Institutions: University collaborations and support
  • 💰 Funding: Grant agencies and foundations
  • 👥 Community: Contributors and early adopters
  • 🔬 Datasets: Original data providers and consortiums

🚀 Start Your Spatial Biology Journey Today!

Ready to enhance your tissue analysis?

📖 Read the Paper • 📚 Tutorials


Supporting spatial biology research through computational methods
