This README outlines the steps required to reproduce the approach presented in the manuscript: Rappez et al., "DeepCycle reconstructs a cyclic cell cycle trajectory from unsegmented cell images using convolutional neural networks", Molecular Systems Biology, 2020. These instructions show how to run the model on the data used in the paper and on new, unseen data.
keras, UMAP, cv2, albumentations, classification_models
Install customized version of SOMPY
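Before running the scripts, it can help to confirm that the dependencies listed above are importable. The following is a minimal sketch, not part of the repository; the import names (`umap`, `sompy`, and so on) are assumptions based on how these packages are usually imported, and the corresponding pip package names may differ.

```python
import importlib.util

# Import names assumed for the dependencies listed above; the pip package
# names (for example umap-learn or opencv-python) may differ from these.
REQUIRED_MODULES = [
    "keras",
    "umap",
    "cv2",
    "albumentations",
    "classification_models",
    "sompy",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

`importlib.util.find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.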
- Download the data from the EBI BioStudies repository and the deep learning models from our EMBL hosting
- Unzip to the `data/Timelapse_2019` folder preserving directory structure. You will have:

      root
      |
      |-data
      |   |
      |   |-Timelapse_2019
      |       |
      |       |-BF/
      |       |-Cy3/
      |       |-DAPI/
      |       |-GFP/
      |       |-curated_tracks.csv
      |       |- ...
      |-src/
      |...
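After unzipping, the layout can be verified with a short script. This is a sketch only: the folder and file names come from the tree above, while the helper name `missing_entries` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Channel folders and file name taken from the directory layout above.
EXPECTED_DIRS = ("BF", "Cy3", "DAPI", "GFP")
EXPECTED_FILES = ("curated_tracks.csv",)


def missing_entries(data_dir):
    """List the expected folders and files that are absent from `data_dir`."""
    data_dir = Path(data_dir)
    missing = [d + "/" for d in EXPECTED_DIRS if not (data_dir / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (data_dir / f).is_file()]
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    print(missing_entries("data/Timelapse_2019"))
```

An empty list means every entry named in the tree was found; the check does not look at the elided (`...`) entries.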
Alternatively, DeepCycle can be run on newly generated data. For that, prepare the data as follows:
- Organize the live-imaging microscopy data. Each channel of every timepoint is stored independently, as follows:

      root
      |
      |-data
      |   |
      |   |-New_timelapse_data
      |       |
      |       |-BF/
      |       |-Cy3/
      |       |-DAPI/
      |       |-GFP/
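Since every timepoint is stored once per channel, a quick consistency check is to compare the number of files in each channel folder. The sketch below assumes one image file per timepoint per channel, which the text implies but does not spell out; file naming and image format are not specified here, so the code only counts files. The function names are hypothetical.

```python
from pathlib import Path

CHANNELS = ("BF", "Cy3", "DAPI", "GFP")


def frames_per_channel(data_dir, channels=CHANNELS):
    """Count regular files in each channel folder (0 if the folder is missing)."""
    data_dir = Path(data_dir)
    counts = {}
    for channel in channels:
        folder = data_dir / channel
        if folder.is_dir():
            counts[channel] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[channel] = 0
    return counts


def channels_consistent(counts):
    """True when every channel holds the same, non-zero number of files."""
    values = set(counts.values())
    return len(values) == 1 and 0 not in values


if __name__ == "__main__":
    counts = frames_per_channel("data/New_timelapse_data")
    print(counts, "consistent" if channels_consistent(counts) else "inconsistent")
```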
- Run TrackMate FIJI plugin on the nuclear staining channel (DAPI in our case). The output should be called `Spots in tracks statistics.csv`.
- Run `src/TrackMate_filter.py` with the correct paths (see comments in file) to manually curate the tracks with one or two divisions.
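Before manual curation, it may be useful to get an overview of the exported tracks. The sketch below groups spots by track and lists tracks above a minimum length. It is not a substitute for `TrackMate_filter.py`, and the column names `TRACK_ID` and `FRAME` are assumptions based on typical TrackMate exports; adjust them if your file differs.

```python
import csv
from collections import defaultdict


def frames_by_track(csv_path, track_col="TRACK_ID", frame_col="FRAME"):
    """Map each track id to the sorted list of frames in which it has a spot."""
    frames = defaultdict(list)
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            track_id = row[track_col].strip()
            if not track_id or track_id == "None":
                continue  # spot not assigned to any track
            frames[track_id].append(int(float(row[frame_col])))
    return {track: sorted(values) for track, values in frames.items()}


def long_tracks(frames, min_spots):
    """Return the ids of tracks that contain at least `min_spots` spots."""
    return sorted(t for t, f in frames.items() if len(f) >= min_spots)


if __name__ == "__main__":
    tracks = frames_by_track("Spots in tracks statistics.csv")
    print(len(tracks), "tracks;", len(long_tracks(tracks, 50)), "with 50+ spots")
```

The threshold of 50 spots in the usage lines is an arbitrary illustration, not a value taken from the paper.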
cd src
- Prepare the data:
python data_prepare.py
- Cleans and removes unnecessary columns. Stores as `statistics_clean.csv` in `data/Timelapse_2019` dir
- Aligns the curated tracks based on division events, calculates mean intensities track/frame wise. Stores as `intensities.csv`
- Calculates intensity statistics and adds virtual class `1-4` to each tracked cell. Resulting data to be stored in `statistics_mean_std.csv`
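To illustrate the kind of per-track intensity statistics mentioned above, here is a minimal sketch that computes the mean and standard deviation of intensity for each track. The input format (pairs of track id and intensity) and the function name are hypothetical, and the rule that `data_prepare.py` uses to assign the virtual classes `1-4` is not described here, so it is deliberately not reproduced.

```python
from collections import defaultdict
from statistics import mean, pstdev


def track_intensity_stats(records):
    """Compute (mean, population std) of intensity for each track.

    `records` is an iterable of (track_id, intensity) pairs; this input
    format is an assumption made for the example.
    """
    by_track = defaultdict(list)
    for track_id, intensity in records:
        by_track[track_id].append(float(intensity))
    return {t: (mean(v), pstdev(v)) for t, v in by_track.items()}


if __name__ == "__main__":
    demo = [("track_a", 1.0), ("track_a", 3.0), ("track_b", 5.0)]
    for track, (mu, sigma) in track_intensity_stats(demo).items():
        print(track, mu, sigma)
```

The population standard deviation (`pstdev`) is used so that a track with a single measurement yields 0 rather than raising an error; whether the original script uses the population or sample estimate is not stated.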
- Train the model:
python model_train.py
Trains the model on the curated tracks (excluding double-division tracks), using the double-division tracks as the validation set. Saves the best models in the `checkpoints` dir
- Generate cell descriptors with `checkpoint.r34.sz48.03-0.73.hdf5` as default model:
- from validation set (double division tracks) only:
python encode.py --mode encode_val
- from all available tracks:
python encode.py --mode encode_all
Descriptors are saved in `descriptors.r34.sz48.pkl` and `descriptors_all.r34.sz48.pkl` in `data/Timelapse_2019` dir.
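Both the training and the encoding steps distinguish double-division tracks (the validation set) from the remaining curated tracks. That split can be sketched as follows, assuming a hypothetical mapping from track id to the number of divisions observed in the track; how the repository scripts store this information is not shown here.

```python
def split_tracks(divisions_per_track):
    """Split track ids into (training, validation) lists.

    Tracks with exactly two divisions go to the validation list; all other
    tracks go to the training list. `divisions_per_track` maps a track id to
    its number of divisions (an input format assumed for this example).
    """
    train, val = [], []
    for track_id, n_divisions in sorted(divisions_per_track.items()):
        (val if n_divisions == 2 else train).append(track_id)
    return train, val


if __name__ == "__main__":
    train_ids, val_ids = split_tracks({"t1": 1, "t2": 2, "t3": 1})
    print("train:", train_ids, "validation:", val_ids)
```

Splitting by track rather than by individual image keeps all frames of one cell on the same side of the split.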
- Generate embeddings for the whole dataset. This step is compute-intensive; consider using the supplied `embeddings_preds_all_batch<i>.npz` files instead:
python all_cells_prediction.py
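When working with the supplied batch files, it helps to process them in numeric order, because a plain alphabetical sort places batch 10 before batch 2. The sketch below assumes that `<i>` is an integer batch index and that all files sit in a single folder; neither detail is stated explicitly above, and the function name is hypothetical.

```python
import re
from pathlib import Path

# Assumes <i> in the file name is an integer batch index.
BATCH_PATTERN = re.compile(r"embeddings_preds_all_batch(\d+)\.npz$")


def sorted_batch_files(folder):
    """Return the batch files in `folder` ordered by numeric batch index."""
    indexed = []
    for path in Path(folder).iterdir():
        match = BATCH_PATTERN.match(path.name)
        if match:
            indexed.append((int(match.group(1)), path))
    return [path for _, path in sorted(indexed)]


if __name__ == "__main__":
    # The folder is a guess; point this at wherever the files were unpacked.
    for batch_file in sorted_batch_files("data/Timelapse_2019"):
        print(batch_file.name)
```

Each file can then be opened with `numpy.load`; the names of the arrays stored inside are not documented here, so inspect the `.files` attribute of the loaded object before relying on specific keys.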
cd ..
- Start `jupyter notebook` and open `timelapse_projection2019.ipynb`