Skip to content

Classes for loading transcriptome and proteome data for deep learning model training.

License

Notifications You must be signed in to change notification settings

xomicsdatascience/RnaToProteinDataModule

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Manubot Project

This repository contains the analysis code for the Manubot project Multi-Dataset Integration and Residual Connections Improve Proteome Prediction from Transcriptomics Using Deep Learning.

Scripts

Files used for running our code are found under the scripts directory. The files under scripts/NAS are based on the Multi-Objective NAS with Ax tutorial. The files under scripts/5x2 were for generating the boxplot inside figure 1. scripts/SHAP contains preliminary code for running SHAP on the model. Note that the branch input_residual_connection contains a more complete setup for the SHAP runs used in the paper.

src

The two main files to consider are src/RnaToProteinDataModule/NasModel.py, which contains the deep learning model used, and src/RnaToProteinDataModule/RnaToProteinDataModule.py, which performs the bulk of the data processing steps.

M1 Mac Issues

pytorch often gives Mac users trouble. Below is a script that bypasses most errors when installing the necessary packages for a Mac.

conda create -n <env name>
conda activate <env name>
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 -c pytorch
pip3 install ax-platform
pip3 install pytorch-lightning
pip3 install -U 'tensorboardX'
pip3 install tensorboard
pip3 install torchx
pip install -e <path to RnaToProteinDataModule>

In a nutshell: Running a GPU on M1 Macs doesn't always work perfectly well. Interestingly, the most recent pytorch versions sometimes run with the mps GPU without a hitch, but other times will not. I don't know why the inconsistency exists. However, installing packages with the above script has it working every time. I do not recommend trying this in a pre-made environment, as things like the python version might skew the install into not working again.

About

Classes for loading transcriptome and proteome data for deep learning model training.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages