This repository contains the code and configuration files for training a transformer-based foundation model for fragmentomics.
`config/`
: Contains YAML configuration files for the various components of the project, including data modules, models, trainers, and machine settings.

`fragformer/`
: The core package containing the implementation of the foundation model, including:

    `datamodule/`
    : Data loading and processing modules

    `lit_modules/`
    : PyTorch Lightning modules

    `models/`
    : Model architectures

    `transforms/`
    : Data transformation and augmentation

    `utils/`
    : Utility functions

`tools/`
: Contains scripts for training, testing, and other utilities.
To start training the model:

```bash
python tools/pretrain.py
```
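Because the entry point is configured with Hydra, individual settings can usually be overridden directly on the command line instead of editing YAML files. The key names below (`trainer.max_epochs`, `datamodule.batch_size`, `trainer.precision`) are illustrative placeholders, not necessarily the keys defined under `config/`:

```bash
# Override existing config values at launch time (key names are placeholders;
# check the files under config/ for the keys this project actually defines).
python tools/pretrain.py trainer.max_epochs=100 datamodule.batch_size=32

# Add a key that is not present in the base config with a leading '+'.
python tools/pretrain.py +trainer.precision=16
```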
For distributed training on a SLURM cluster, use:

```bash
sbatch tools/pretrain_fragformer.sh
```
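The submission script in this repository defines the actual resource requests. For orientation only, a minimal script of this kind typically looks like the sketch below; the job name, node and GPU counts, and time limit are placeholders and do not reflect the contents of `tools/pretrain_fragformer.sh`:

```bash
#!/bin/bash
# Generic sketch of a SLURM submission script for multi-node training.
# All resource values are placeholders; adapt them to your cluster.
#SBATCH --job-name=fragformer-pretrain
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4      # one task per GPU
#SBATCH --gres=gpu:4
#SBATCH --time=24:00:00

# srun starts one process per task; PyTorch Lightning detects the SLURM
# environment and coordinates distributed training across the processes.
srun python tools/pretrain.py
```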
The project uses Hydra for configuration management. You can modify the configurations in the `config/` directory to adjust model parameters, data processing, and training settings.
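Beyond editing the YAML files, Hydra also lets you compose configurations at launch time. The config and group names below (`pretrain_small`, `model=fragformer_large`) are hypothetical and only illustrate the standard Hydra flags, not the groups actually defined under `config/`:

```bash
# Select a different top-level config file (name is hypothetical).
python tools/pretrain.py --config-name pretrain_small

# Swap an entire config group, e.g. a different model variant (hypothetical name).
python tools/pretrain.py model=fragformer_large

# Launch a multirun sweep over several values of one parameter.
python tools/pretrain.py -m trainer.max_epochs=10,50,100
```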
This project is licensed under the terms included in the LICENCE.txt
file.
For questions or feedback, please open an issue in this repository.