Graph perceiver network for lung tumor and premalignant lesion stratification from histopathology

This work is published in the American Journal of Pathology. If you use this code, please cite:
```
@article{Gindra2024,
	author  = {Rushin H. Gindra and Yi Zheng and Emily J. Green and Mary E. Reid and Sarah A. Mazzilli and Daniel T. Merrick and Eric J. Burks and Vijaya B. Kolachalama and Jennifer E. Beane},
	title   = {Graph perceiver network for lung tumor and bronchial premalignant lesion stratification from histopathology},
	year    = {2024},
	journal = {American Journal of Pathology},
	doi     = {10.1016/j.ajpath.2024.03.009},
	url     = {https://doi.org/10.1016/j.ajpath.2024.03.009}
}
```
Graph Perceiver Network schematic
Key Ideas & Main Findings

We hypothesize that computational methods can help capture tissue heterogeneity in histology whole slide images (WSIs) and stratify premalignant lesions (PMLs) by histologic severity or by their ability to progress to invasive carcinoma, providing an informative pipeline for PML assessment. The Graph Perceiver Network is a generalized architecture that integrates a graph module with the perceiver architecture, combining sparse graph computations on visual tokens with the computational efficiency of the perceiver. The architecture significantly reduces the computational footprint compared to state-of-the-art WSI analysis architectures, allowing extremely large WSIs to be processed efficiently.

As a bonus, the architecture is explainable and can be trained with large batch sizes without extreme computational overhead, making it a suitable candidate for academic research lab projects. Built on PyTorch and PyTorch Geometric.

Table of Contents

Updates / TODOs

Please watch this GitHub repository for updates.

  • [ ] Remove dead code in repository
  • [ ] Pre-requisites and installations (Conda env & Docker container)
  • [ ] Add data download + preprocessing steps (Python file)
  • [ ] Add data tree structure (for easy understanding)
  • [ ] Add pretrained model weights + instructions for training & evaluation (Python files)
  • [ ] Add code for K-NN evaluation (Jupyter notebook)
  • [ ] Add code for visualization (Jupyter notebook)
  • [ ] Explanatory heatmaps
  • Contact for Issues
  • Acknowledgements, License & Usage

Pre-requisites and Installations

Conda installation instructions, and potentially a Docker container at some point.

Data Download and Preprocessing

Data Download

Resections

Biopsies

  • UCL: Lung biopsy samples from University College London. To download the biopsy WSIs (formatted as .ndpi) and associated clinical metadata, please refer to the Image Data Resource repository, IDR 0082. WSIs can be downloaded from the repository using the Aspera protocol.
  • Roswell: Lung biopsy samples from Roswell Park Comprehensive Cancer Center.
Example Directory

```
└── TCGA_ROOT_DIR/
	├── TCGA_train.txt
	├── TCGA_test.txt
	├── TCGA_plot.txt
	├── WSIs/
	│   ├── slide_1.svs
	│   ├── slide_2.svs
	│   └── ...
	├── ctranspath_pt_features/
	│   ├── slide_1/
	│   │   ├── adj_s_ei.pt
	│   │   ├── adj_s.pt
	│   │   ├── c_idx.txt
	│   │   ├── edge_attr.pt
	│   │   └── features.pt
	│   ├── slide_2/
	│   └── ...
	└── patches256/
		├── slide_1/
		│   └── 20.0/
		│       ├── x_y.png
		│       └── ...
		├── slide_2/
		└── ...
└── CPTAC_ROOT_DIR/
	├── CPTAC_test.txt
	├── CPTAC_plot.txt
	├── WSIs/
	├── ctranspath_pt_features/
	└── patches256/
└── ...
```
Each data cohort is organized as its own folder in [TCGA|CPTAC|UCL|Roswell]_ROOT_DIR.

For preprocessing (patching, feature extraction and graph construction), see preprocessing/graph_construction.py
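As a rough illustration of the graph-construction step: the repository's pipeline extracts a CTransPath feature per patch and links patches into a slide graph. The sketch below is an assumption-laden, dependency-free version that only shows the spatial part, parsing the `x_y.png` patch naming from the directory tree above and connecting patches whose grid coordinates are within one step of each other (8-neighborhood); the actual adjacency rule lives in preprocessing/graph_construction.py.

```python
# Minimal sketch: build a spatial 8-neighborhood edge list from patch
# coordinates parsed out of "x_y.png" file names. Illustrative only; the
# repository's graph_construction.py defines the actual adjacency rule.

def patch_coords(filenames, patch_size=256):
    """Parse 'x_y.png' names into integer grid coordinates."""
    coords = []
    for name in filenames:
        x, y = name[: -len(".png")].split("_")
        coords.append((int(x) // patch_size, int(y) // patch_size))
    return coords

def spatial_edges(coords):
    """Connect patches whose grid coordinates differ by at most 1."""
    index = {c: i for i, c in enumerate(coords)}
    edges = []
    for (gx, gy), i in index.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx, dy) == (0, 0):
                    continue
                j = index.get((gx + dx, gy + dy))
                if j is not None:
                    edges.append((i, j))  # directed edge i -> j
    return edges

coords = patch_coords(["0_0.png", "256_0.png", "512_512.png"])
print(spatial_edges(coords))  # adjacent patches are linked both ways
```

The resulting edge list is what would be stored, per slide, alongside the patch feature matrix (e.g. in the `adj_s_ei.pt` / `features.pt` files shown above).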

You can train your model on a multi-centric dataset with the following k-fold cross-validation (k=5) scheme, where `--` marks training folds, `**` the validation fold, and `##` the test fold:

```
[--|--|--|**|##]
[--|--|**|##|--]
[--|**|##|--|--]
[**|##|--|--|--]
[##|--|--|--|**]
```
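Assuming the five folds rotate exactly as drawn (three training folds, then one validation fold, then one test fold, shifting left each round), the split assignment can be sketched in plain Python; the function name and rotation arithmetic are illustrative, not taken from the repository code:

```python
# Rotating 5-fold split assignment, mirroring the scheme drawn above.
# For rotation r, fold (k-2-r) mod k is validation ('**') and fold
# (k-1-r) mod k is test ('##'); the rest are training folds ('--').

def kfold_splits(k=5):
    splits = []
    for r in range(k):
        val = (k - 2 - r) % k   # '**' column
        test = (k - 1 - r) % k  # '##' column
        train = [f for f in range(k) if f not in (val, test)]
        splits.append({"train": train, "val": val, "test": test})
    return splits

for s in kfold_splits():
    row = ["--"] * 5
    row[s["val"]], row[s["test"]] = "**", "##"
    print("[" + "|".join(row) + "]")  # reproduces the diagram above
```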

Pretrained Model Weights and Training Instructions

Models were trained for 30 epochs with a batch size of 8 using 5-fold cross-validation. These models were evaluated using an internal TCGA test set per fold and the CPTAC external test set.

GPU Hardware used for training: Nvidia GeForce RTX 2080ti - 11GB.

Note: longer training with larger batch sizes would likely yield further gains in model performance.

  • Links to download pretrained model weights:

| Arch | SSL Method | Dataset | Epochs | Cross-Attn Nodes | Performance (Acc) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| Graph Perceiver Network | CTransPath | TCGA | 30 | 200 | N/A | N/A |
| Graph Perceiver Network | SimCLR-Lung | NLST-TMA | 30 | 200 | N/A | N/A |

  • Instructions for training and evaluating the models (Python files to be added).

Evaluation and Testing

  • TCGA (internal test set)
  • CPTAC (external test set)
  • K-NN evaluation: description of the K-NN evaluation process, with a link to or embedded Jupyter notebook for K-NN evaluation.
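Pending the notebook, a K-NN evaluation of this kind typically fits a nearest-neighbor classifier on slide-level embeddings from the training set and predicts labels for held-out slides. The sketch below is a hedged, dependency-free illustration (Euclidean distance, majority vote; all names and the toy labels are made up, and the real notebook may use a different metric or library):

```python
# Hypothetical K-NN evaluation sketch: classify each test embedding by
# majority vote over its k nearest training embeddings (Euclidean distance).
from collections import Counter

def knn_predict(train_X, train_y, test_X, k=3):
    preds = []
    for q in test_X:
        # squared Euclidean distance to every training embedding
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, q)), y)
            for x, y in zip(train_X, train_y)
        )
        votes = Counter(y for _, y in dists[:k])
        preds.append(votes.most_common(1)[0][0])
    return preds

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy 2-D "embeddings" with illustrative labels
train_X = [(0, 0), (0, 1), (5, 5), (6, 5)]
train_y = ["benign", "benign", "tumor", "tumor"]
print(knn_predict(train_X, train_y, [(0, 0.5), (5.5, 5)], k=3))
```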

Explanatory Heatmaps

  • Details about the visualization techniques used.
  • Link or embedded Jupyter notebook for visualization.

Contact for Issues

  • Please open a new issue thread, or for urgent blockers contact [email protected] directly. Immediate responses to minor issues may not be available.

Acknowledgements, License, and Usage

  • Credits and acknowledgements.
  • License information.
  • Usage guidelines and restrictions.
