This repository contains code for a masking approach built on the Segment Anything Model (SAM) with a ViT backbone, tailored for segmentation tasks.
The code provided in this repository includes:
Functions to load and pad images and masks from specified folders, split large images into smaller patches for efficient processing, and build a custom dataset (`SAMDataset`) with the MONAI medical-imaging library.
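A minimal sketch of these preparation steps, assuming square non-overlapping patches, uint8 inputs, and a bounding-box prompt derived from each mask; the patch size, helper names, and `patchify` dependency below are illustrative rather than the repository's actual code.

```python
import numpy as np
import torch
from PIL import Image
from patchify import patchify              # extra dependency: pip install patchify
from torch.utils.data import Dataset
from transformers import SamProcessor

PATCH_SIZE = 256  # assumed patch size

def load_and_pad(path, patch_size=PATCH_SIZE):
    """Load an image or mask and zero-pad it so each side is a multiple of patch_size."""
    arr = np.array(Image.open(path))
    pad_h = (patch_size - arr.shape[0] % patch_size) % patch_size
    pad_w = (patch_size - arr.shape[1] % patch_size) % patch_size
    pad = ((0, pad_h), (0, pad_w)) + ((0, 0),) * (arr.ndim - 2)
    return np.pad(arr, pad, mode="constant")

def to_patches(arr, patch_size=PATCH_SIZE):
    """Split a padded array into non-overlapping square patches."""
    if arr.ndim == 2:  # grayscale image or mask
        return patchify(arr, (patch_size, patch_size), step=patch_size).reshape(
            -1, patch_size, patch_size)
    return patchify(arr, (patch_size, patch_size, arr.shape[2]), step=patch_size).reshape(
        -1, patch_size, patch_size, arr.shape[2])

class SAMDataset(Dataset):
    """Pairs image patches with mask patches and a bounding-box prompt for SAM."""

    def __init__(self, image_patches, mask_patches, processor):
        self.image_patches = image_patches
        self.mask_patches = mask_patches
        self.processor = processor

    def __len__(self):
        return len(self.image_patches)

    def __getitem__(self, idx):
        image = self.image_patches[idx]
        mask = self.mask_patches[idx]
        # Use a loose bounding box around the mask foreground as the SAM prompt.
        ys, xs = np.nonzero(mask)
        if xs.size:
            box = [float(xs.min()), float(ys.min()), float(xs.max()), float(ys.max())]
        else:  # empty mask: fall back to the full patch
            box = [0.0, 0.0, float(mask.shape[1] - 1), float(mask.shape[0] - 1)]
        inputs = self.processor(Image.fromarray(image).convert("RGB"),
                                input_boxes=[[box]], return_tensors="pt")
        inputs = {k: v.squeeze(0) for k, v in inputs.items()}  # drop the batch dimension
        inputs["ground_truth_mask"] = torch.from_numpy((mask > 0).astype(np.float32))
        return inputs
```

Here the `processor` would come from `SamProcessor.from_pretrained(...)`, and the patch lists from mapping `load_and_pad` and `to_patches` over the files in the image and mask folders.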
Initialization and configuration of the SAM model (`SamModel`) from the Hugging Face Transformers library, with the parameters of the vision and prompt encoders frozen so that only the mask decoder is trained.
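A sketch of this setup; the `facebook/sam-vit-base` checkpoint is an assumption, and the repository may use a different SAM variant.

```python
from transformers import SamModel, SamProcessor

# Checkpoint name is an assumption; substitute the variant the repository uses.
processor = SamProcessor.from_pretrained("facebook/sam-vit-base")
model = SamModel.from_pretrained("facebook/sam-vit-base")

# Freeze the vision and prompt encoders so only the mask decoder receives gradients.
for name, param in model.named_parameters():
    if name.startswith("vision_encoder") or name.startswith("prompt_encoder"):
        param.requires_grad_(False)
```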
Configuration of training parameters such as the number of epochs, the learning rate, and the optimizer (Adam); a training loop that iterates over epochs, computes losses, and updates the model parameters; and saving of the model weights to a specified directory after each epoch.
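A condensed sketch of such a loop, reusing `model`, `processor`, and a `SAMDataset` instance (`train_dataset`) from the snippets above; the hyperparameter values and `checkpoints/` output directory are placeholders.

```python
import os
import torch
from torch.optim import Adam
from torch.utils.data import DataLoader
from monai.losses import DiceCELoss

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Placeholder hyperparameters; adjust to match the repository's configuration.
num_epochs = 10
optimizer = Adam(model.mask_decoder.parameters(), lr=1e-5, weight_decay=0)
seg_loss = DiceCELoss(sigmoid=True, squared_pred=True, reduction="mean")

train_dataloader = DataLoader(train_dataset, batch_size=2, shuffle=True)  # train_dataset: a SAMDataset
os.makedirs("checkpoints", exist_ok=True)

for epoch in range(num_epochs):
    model.train()
    epoch_losses = []
    for batch in train_dataloader:
        outputs = model(pixel_values=batch["pixel_values"].to(device),
                        input_boxes=batch["input_boxes"].to(device),
                        multimask_output=False)
        # pred_masks: (B, 1, 1, 256, 256) -> (B, 1, 256, 256) to match the ground truth
        predicted = outputs.pred_masks.squeeze(1)
        target = batch["ground_truth_mask"].unsqueeze(1).to(device)
        loss = seg_loss(predicted, target)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_losses.append(loss.item())

    print(f"epoch {epoch}: mean loss = {sum(epoch_losses) / len(epoch_losses):.4f}")
    torch.save(model.state_dict(), f"checkpoints/sam_epoch_{epoch}.pth")  # save after every epoch
```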
Use of MONAI loss functions (e.g. `DiceCELoss`) to compute the segmentation loss during training.
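In isolation, configuring the loss might look like the following sketch; `monai.losses` also provides alternatives such as `DiceLoss` and `DiceFocalLoss`.

```python
import torch
from monai.losses import DiceCELoss

# Combined Dice + cross-entropy loss applied to raw (pre-sigmoid) logits.
seg_loss = DiceCELoss(sigmoid=True, squared_pred=True, reduction="mean")

# Toy sanity check with a single-channel prediction/target pair.
logits = torch.randn(2, 1, 256, 256)                    # model logits
target = torch.randint(0, 2, (2, 1, 256, 256)).float()  # binary ground truth
print(seg_loss(logits, target))                         # scalar loss tensor
```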
- Install required packages:
```
pip install torch torchvision numpy matplotlib monai transformers
```
To train the model on a different class, change the `image_folder` and `mask_folder` paths.
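For example (placeholder paths; the repository's folder layout may differ):

```python
# Placeholder paths; point these at the images and masks of the class to train on.
image_folder = "data/<class_name>/images"
mask_folder = "data/<class_name>/masks"
```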