This use case describes a computational workflow for building a mechanistic model that captures molecular differences between two cancer subtypes, focusing on Chronic Lymphocytic Leukaemia (CLL). The study uses RNA-Seq data and a specific clinical variable (IGHV status), drawing on data from the ICGC consortium, which makes the approach potentially applicable to various cancer types. The analysis aims to understand cellular signalling differences between IGHV groups by employing tools that assess transcription factor activity and infer a signalling network, offering a mechanistic explanation for the observed molecular changes. The creation of patient-specific Boolean models allows individual patient trajectories to be studied, emphasizing the importance of personalized medicine and of tailoring approaches to account for genomic heterogeneity in cancer. Overall, this use case showcases the application of mathematical modelling tools in personalized medicine to understand and adapt approaches based on individual patient characteristics.
- The `BuildingBlocks` folder contains the script to install the Building Blocks used in the Cancer Diagnosis Workflow.
- The `Workflow` folder contains the workflow implementations. It currently contains the implementation using PyCOMPSs.
- The `Resources` folder contains dataset files.
- The `Tests` folder contains the scripts that run each Building Block used in the workflow for the given small dataset. They can be executed individually for testing purposes.
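Based on the folder descriptions above, the repository layout looks roughly like this (a sketch, not an exhaustive listing):

```
cancer-diagnosis-workflow/
├── BuildingBlocks/   # install script for the Building Blocks
├── Workflow/         # workflow implementations (currently PyCOMPSs)
├── Resources/        # dataset files
└── Tests/            # per-Building-Block test scripts for the small dataset
```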
This section explains the requirements and usage of the Cancer Diagnosis Workflow on a laptop or desktop computer.
- `permedcoe` package
- PyCOMPSs
- Singularity
- Clone this repository:

  ```
  git clone https://github.com/PerMedCoE/cancer-diagnosis-workflow
  ```
- Install the Building Blocks required for the Cancer Diagnosis Workflow:

  ```
  cd cancer-diagnosis-workflow/BuildingBlocks/
  ./install_BBs.sh
  ```
- Get the required Building Block images from the project B2DROP:
- Required images:
- cll_combine_models
- cll_network_inference
- cll_tf_activities
- cll_personalize_boolean_models
- cll_prepare_data
- cll_run_boolean_model
The path where these files are stored MUST be exported in the `PERMEDCOE_IMAGES` environment variable.
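For example, if the downloaded `.sif` images were stored in a hypothetical `~/permedcoe_images` folder, the variable could be exported as:

```shell
# Hypothetical path; point it at wherever the .sif images were downloaded.
export PERMEDCOE_IMAGES="${HOME}/permedcoe_images/"
```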
⚠️ TIP: These containers can be built manually as follows (be patient since some of them may take some time):
- Clone the `BuildingBlocks` repository:

  ```
  git clone https://github.com/PerMedCoE/BuildingBlocks.git
  ```
- Build the required Building Block images:

  ```
  cd BuildingBlocks/Resources/images
  sudo singularity build cll_combine_models.sif cll_combine_models.def
  sudo singularity build cll_network_inference.sif cll_network_inference.def
  sudo singularity build cll_tf_activities.sif cll_tf_activities.def
  sudo singularity build cll_personalize_boolean_models.sif cll_personalize_boolean_models.def
  sudo singularity build cll_prepare_data.sif cll_prepare_data.def
  sudo singularity build cll_run_boolean_model.sif cll_run_boolean_model.def
  cd ../../..
  ```
If using PyCOMPSs on a local PC (make sure that PyCOMPSs is installed):

- Go to the `Workflow/PyCOMPSs` folder:

  ```
  cd Workflow/PyCOMPSs
  ```

- Execute:

  ```
  ./run.sh
  ```
This section explains the requirements and usage of the Cancer Diagnosis Workflow on the MareNostrum 4 supercomputer.
- Access to MN4
All Building Blocks are already installed in MN4, and the Cancer Diagnosis Workflow is available.
- Load the `COMPSs`, `Singularity` and `permedcoe` modules:

  ```
  export COMPSS_PYTHON_VERSION=3
  module load COMPSs/3.3
  module load singularity/3.5.2
  module use /apps/modules/modulefiles/tools/COMPSs/libraries
  module load permedcoe
  ```
  ⚠️ TIP: Include the loading into your `${HOME}/.bashrc` file to load it automatically on session start.

  These commands will load COMPSs and the `permedcoe` package, which provides all necessary dependencies, as well as the path to the singularity container images (`PERMEDCOE_IMAGES` environment variable) and the testing dataset (`CANCERDiagnosisWORKFLOW_DATASET` environment variable).
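The tip above can be applied by appending the module loads to `${HOME}/.bashrc`, for example (a sketch; the module versions are taken from the commands above and may change over time):

```shell
# Append the MN4 module loads to ~/.bashrc so they run on every login.
cat >> "${HOME}/.bashrc" <<'EOF'
export COMPSS_PYTHON_VERSION=3
module load COMPSs/3.3
module load singularity/3.5.2
module use /apps/modules/modulefiles/tools/COMPSs/libraries
module load permedcoe
EOF
```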
- Get a copy of the pilot workflow into your desired folder:

  ```
  mkdir desired_folder
  cd desired_folder
  get_cancerDiagnosisworkflow
  ```
- Go to the `Workflow/PyCOMPSs` folder:

  ```
  cd Workflow/PyCOMPSs
  ```
- Execute:

  ```
  ./launch.sh
  ```
This command will launch a job into the job queuing system (SLURM) requesting 2 nodes (one node acting half as master and half as worker, and the other as a full worker node) for 20 minutes, and it is prepared to use the singularity images that are already deployed in MN4 (located at the path given by the `PERMEDCOE_IMAGES` environment variable). It uses the dataset located in the `../../Resources/data` folder.
⚠️ TIP: If you want to run the workflow with a different dataset, please edit the `launch.sh` script and define the appropriate dataset path.
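As a minimal sketch, such an edit could define a dataset path variable like the following (the variable name here is hypothetical; check `launch.sh` itself for the one it actually uses):

```shell
# Hypothetical variable name; adapt to the one actually used in launch.sh.
dataset_path="${HOME}/my_cll_dataset"
echo "Using dataset: ${dataset_path}"
```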
After the execution, a `results` folder will be available with the Cancer Diagnosis Workflow results.
This software has been developed for the PerMedCoE project, funded by the European Commission (EU H2020 951773).