Merge pull request #11 from TeamNCMC/generalize-preprocess
Pre-processing scripts
GuillaumeLeGoc authored Jan 14, 2025
2 parents 16629f3 + c76079e commit 71344df
Showing 6 changed files with 253 additions and 126 deletions.
3 changes: 3 additions & 0 deletions docs/guide-create-pyramids.md
@@ -10,6 +10,9 @@ This script is standalone, eg. it does not rely on the `cuisto` package. But ins

`pyramid-creator` moved to a standalone package that you can find [here](https://github.com/TeamNCMC/pyramid-creator#pyramid_creator) with [installation](https://github.com/TeamNCMC/pyramid-creator#install) and [usage](https://github.com/TeamNCMC/pyramid-creator#usage) instructions.

!!! info
You might also have to pre-process your images if there is debris or other artifacts in them. Check the [pre-processing guide](tips-preprocessing.md).

## Installation
You will find instructions on the dedicated project page over at [Github](https://github.com/TeamNCMC/pyramid-creator#pyramid_creator).

96 changes: 96 additions & 0 deletions docs/tips-preprocessing.md
@@ -0,0 +1,96 @@
# Image pre-processing

Preparing slides before image acquisition can be a tedious task: some slices may end up flipped (either upside-down or left/right), placed too close to each other (so that part of a different slice is visible in an image), or too close to the slide edge.
In such cases, one might need to clean the image so that only the actual slice is visible.

## Pre-processing scripts
Two scripts are provided in `scripts/preprocessing` to this end. They first require exporting the images from the microscope software to standard image files with metadata (e.g. [OME-TIFF](tips-formats.md#metadata) files).

The process is then :

1. Split each channel in single-channel images,
1. Detect automatically the brain contour in the specified target channel,
1. Save the resulting brain mask as an image,
1. Apply the mask to all channels and save resulting cleaned images,
1. Review the masks manually; if a mask is not satisfactory, manually edit the corresponding single-channel image in ImageJ,
1. Rerun the brain contour detection and re-apply the masks to all channels,
1. Merge cleaned channels in a multi-channel, pyramidal OME-TIFF image ready to be used in QuPath.

The first script, `preprocess_split_channels.py`, handles steps 1-6; `preprocess_merge_channels.py` takes care of the last step.

!!! info
The reason we need to split channels is to get images that can easily be opened in third-party software such as ImageJ for convenient editing.
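
As an illustration of step 1, here is a minimal sketch of channel splitting. It assumes the image is already loaded as a NumPy array with channels on the first axis; the function name `split_channels` and the `chXX` keys are hypothetical and only mirror the folder naming used in this guide. The actual script may read and write files differently.

```python
import numpy as np


def split_channels(image: np.ndarray) -> dict[str, np.ndarray]:
    """Split a (channels, rows, cols) array into single-channel 2D arrays."""
    if image.ndim != 3:
        raise ValueError("expected a 3D array ordered as (channels, rows, cols)")
    # Keys mimic the ch01, ch02, ... folder names (an assumption of this sketch).
    return {f"ch{index + 1:02d}": channel for index, channel in enumerate(image)}


# Small synthetic two-channel image, for demonstration only.
stack = np.arange(2 * 3 * 4).reshape(2, 3, 4)
single = split_channels(stack)
print(list(single), single["ch01"].shape)
```

Each value is a plain 2D array that could be saved as its own TIFF file and opened directly in ImageJ.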

## Usage
First and foremost, export the images from the microscope software to OME-TIFF. For Zeiss ZEN, have a look at [this guide](guide-create-pyramids.md#export-czi-to-ome-tiff). Say the images were exported to a directory called `~/input_directory/`.

### Split channels and find brain mask
Copy the script `preprocess_split_channels.py` located in `scripts/preprocessing` on your computer. Read the options at the top of the script and edit according to your need.

In particular, the `TASKS` dictionary defines which actions are to be performed.

This script will :

1. (if `move=True`) Move images from `~/input_directory` to `~/images/merged_original/`. The files will be renamed depending on the options set in the script header. The `IN_PREFIX` parameter allows the slice number to be parsed. The `OUT_PREFIX` is the prefix of the renamed image and all subsequent use.

??? Example
ZEN exported images named : `A1A4_s1.ome.tiff`, `A1A4_s2.ome.tiff`, ...
Setting `IN_PREFIX` to `"_s"` and `OUT_PREFIX` to `"animalid_"` will result in images being moved from `~/input_directory/A1A4_s1.ome.tiff` to `~/images/animalid_001.ome.tiff`, and so on. The `images` folder name is customizable but will always be in the parent directory of `input_directory`.

2. (if `split=True`) While moving and renaming the image, it will also read the actual image data and split the channels into separate single-channel images. The image files will have the same name and are stored in the `~/images/ch01`, `~/images/ch02`... folders.
3. (if `clean=True`) The parameter `DETECTION_CHANNEL` sets which channel will be used to find the brain contour. The corresponding single-channel file is read, [brain detection](#brain-contour-detection) is performed, and the resulting mask is saved in `~/images/masks`. Since the image is already loaded, the mask is also applied directly to it, and the cleaned, masked image is saved in `~/images/chXX_cleaned`, where `XX` corresponds to `DETECTION_CHANNEL`.

??? Info
If the mask image file already exists, the image is skipped. Likewise, if `overwrite_cleaned` is turned off (i.e. set to `False`) and an image with the same name already exists in the `chXX_cleaned` folders, it will be skipped.

4. The mask is subsequently applied to all other channels in the same manner : cleaned images have the same name as the renamed original file, and stored in their respective `chXX_cleaned` folders.
5. Visually assess the quality of the masks stored in `~/images/masks/`. Previews are generated in the `previews` folder. If they are satisfactory, skip to the [next section](#merge-channels).
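
The renaming described in step 1 can be sketched as follows. This is only an illustration: the function name `build_new_name`, the three-digit padding and the fixed `.ome.tiff` extension are assumptions, not necessarily what `preprocess_split_channels.py` does.

```python
import re


def build_new_name(filename, in_prefix="_s", out_prefix="animalid_",
                   ndigits=3, ext=".ome.tiff"):
    """Build the renamed file name from the slice number found after `in_prefix`."""
    # Strip the extension so the slice number sits at the end of the stem.
    stem = filename[: -len(ext)] if filename.endswith(ext) else filename
    match = re.search(re.escape(in_prefix) + r"(\d+)$", stem)
    if match is None:
        raise ValueError(f"no slice number found after {in_prefix!r} in {filename!r}")
    number = int(match.group(1))
    # Zero-pad the slice number so that files sort in acquisition order.
    return f"{out_prefix}{number:0{ndigits}d}{ext}"


print(build_new_name("A1A4_s1.ome.tiff"))
```

Zero-padding matters because later steps rely on file names sorting in the same order in every channel folder.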

If for some images the mask is not satisfactory, note down their names and :

1. Delete the mask file (not the preview !).
2. Delete the corresponding cleaned images in each channel.
3. Open ImageJ, drag & drop the corresponding single-channel original image from the channel used for detection.
4. Manually edit it so that the brain slice is easily detected. This means deleting the bits not part of the slice, usually when those bits are close to the slice itself. One could for instance use the `Freehand selections` tool, select the parts to remove and hit ++del++.
5. Save the image (++ctrl+s++), overwriting the original.
6. Repeat for each unsatisfactory mask.
7. Back to the script, turn off `reformat` and `split` in `TASKS`, since that's already done. Only the missing masks will be computed, and only the missing images from the `chXX_cleaned` folders will be written (unless `overwrite_cleaned` is set to `True`).

??? Example
Automatic brain contour detection failed for `animalid_012.tiff`.
I delete `~/images/masks/animalid_012.tiff`. I also delete `~/images/ch01_cleaned/animalid_012.tiff`, `~/images/ch02_cleaned/animalid_012.tiff` and `~/images/ch03_cleaned/animalid_012.tiff`.
I drag & drop `~/images/ch01/animalid_012.tiff` in ImageJ, draw the brain contour manually with the Freehand selections tool, invert the selection, hit ++del++, and save the image, overwriting it.
Finally, I edit the script, setting `reformat=False` and `split=False` in `TASKS`, and re-run the script. Only one mask will be computed and applied.
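
The reason such a re-run only touches the deleted files can be sketched as follows. The function name `files_to_write` and its arguments are hypothetical; only the `masks` and `chXX_cleaned` folder names come from this guide, and the actual script may organize these checks differently.

```python
from pathlib import Path


def files_to_write(images_dir, name, channels, overwrite_cleaned=False):
    """Return the mask and cleaned files a re-run would (re)compute for one image."""
    images_dir = Path(images_dir)
    todo = []
    # An existing mask is never recomputed.
    mask = images_dir / "masks" / name
    if not mask.exists():
        todo.append(mask)
    # Cleaned images are rewritten only if missing, unless overwriting is requested.
    for channel in channels:
        cleaned = images_dir / f"{channel}_cleaned" / name
        if overwrite_cleaned or not cleaned.exists():
            todo.append(cleaned)
    return todo
```

With this logic, deleting one mask and its cleaned images is enough to have only that image reprocessed.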

Now, we only have to merge all the channels back to single pyramidal OME-TIFF images ready to be used in QuPath.

### Merge channels
Copy the `preprocess_merge_channels.py` script on your computer.

This one is more straightforward:

1. Fill the input directory. This is where the script can find the `chXX_cleaned` folders, `~/images/` in the example above.
2. Fill the output directory. This could be for instance `~/images/merged_cleaned/`.
3. Fill the `CHANNELS` parameter. This is a dictionary setting the name and color of each channel. The order is important: it must match the order of the `chXX_cleaned` folders.

??? Example
The first channel (`ch01_cleaned`) corresponds to the NISSL staining imaged in the CFP channel, the second channel (`ch02_cleaned`) corresponds to the EGFP channel. `CHANNELS` would then look like : `{"CFP": (0, 0, 255), "EGFP": (0, 255, 0)}`.

4. Fill the pyramids and tiles options. The default value should work fine for most use cases.
5. Run the script. Images in `OUTPUT_DIRECTORY` are ready to be added to a QuPath project !

!!! danger Important
The pixel size is read from the OME-TIFF files and propagated along the pre-processing steps until the final images, so make sure it is correct when exporting the files from the microscope software.
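
The colors in `CHANNELS` are RGB tuples, which the merge script converts to integers for the OME metadata through a helper whose body is not shown here. One way such a conversion could work is sketched below. It assumes the OME convention of a signed 32-bit integer packing the R, G, B and A bytes, with alpha set to 0; the name `rgb_to_int_sketch` is hypothetical.

```python
def rgb_to_int_sketch(rgb):
    """Pack an (R, G, B) tuple into a signed 32-bit RGBA integer, alpha = 0."""
    red, green, blue = rgb
    # Big-endian byte order: red is the most significant byte, alpha the least.
    return int.from_bytes(bytes([red, green, blue, 0]), byteorder="big", signed=True)


CHANNELS = {"CFP": (0, 0, 255), "EGFP": (0, 255, 0)}
colors = {name: rgb_to_int_sketch(rgb) for name, rgb in CHANNELS.items()}
print(colors)
```

Because the integer is signed, any color with a red component of 128 or more comes out negative, which is expected.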

### Brain contour detection
The algorithm to detect the brain contour is defined in the function `find_brain_mask()` in the `preprocess_split_channels.py` script. All the parameters are customizable in the `DETECTION_PARAMETERS` variable.
In a nutshell :

1. Zeroes are replaced with a fixed background value (`bkg`). This accounts for parts removed manually in ImageJ: the deleted pixels are set to 0 while the actual image background is much higher, which would otherwise make edge detection sub-optimal.
2. The image is downsampled (`downscale`) for performance -- the full resolution is not needed.
3. Edge filter with the Canny algorithm (using `cannysigma` and `cannythresh`), implemented in [scikit-image](https://scikit-image.org/docs/stable/api/skimage.feature.html#skimage.feature.canny).
4. Morphological closing (dilation followed by erosion) to keep only "big" objects, using `closeradius`.
5. Fill the holes.
6. Keep only the biggest remaining object.
7. Resize the mask to the original image resolution.
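
The steps above can be sketched as follows. This is not the actual `find_brain_mask()` implementation: the default values, the intensity normalization before the Canny filter, the interpretation of `cannythresh` as a (low, high) pair and the use of `scipy.ndimage` for the morphological steps are all assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import canny
from skimage.morphology import disk
from skimage.transform import rescale, resize


def find_brain_mask_sketch(img, bkg=100, downscale=4, cannysigma=2.0,
                           cannythresh=(0.05, 0.2), closeradius=5):
    """Return a boolean mask of the biggest object found in a 2D image."""
    work = img.astype(float)
    # 1. Replace zeroes (e.g. manually deleted regions) with a background value.
    work[work == 0] = bkg
    # 2. Downsample for performance.
    small = rescale(work, 1 / downscale, anti_aliasing=True, preserve_range=True)
    # Normalize to [0, 1] so the thresholds do not depend on the bit depth
    # (an assumption of this sketch).
    span = small.max() - small.min()
    if span > 0:
        small = (small - small.min()) / span
    # 3. Canny edge detection.
    edges = canny(small, sigma=cannysigma,
                  low_threshold=cannythresh[0], high_threshold=cannythresh[1])
    # 4. Morphological closing to connect nearby edge fragments.
    closed = ndi.binary_closing(edges, structure=disk(closeradius))
    # 5. Fill the holes enclosed by the contours.
    filled = ndi.binary_fill_holes(closed)
    # 6. Keep only the biggest connected object.
    labels, count = ndi.label(filled)
    if count == 0:
        return np.zeros(img.shape, dtype=bool)
    sizes = np.bincount(labels.ravel())[1:]  # skip the background label 0
    biggest = labels == (np.argmax(sizes) + 1)
    # 7. Resize the mask back to the original resolution (nearest neighbour).
    full = resize(biggest.astype(float), img.shape, order=0, anti_aliasing=False)
    return full > 0.5
```

A cleaned channel could then be obtained with `np.where(mask, channel, 0)`, which zeroes everything outside the detected slice.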

1 change: 1 addition & 0 deletions mkdocs.yml
@@ -40,6 +40,7 @@ nav:
- tips-formats.md
- tips-qupath.md
- tips-brain-contours.md
- tips-preprocessing.md
- main-configuration-files.md
- Examples:
- main-using-notebooks.md
15 changes: 11 additions & 4 deletions scripts/preprocessing/preprocess_invert_orientation.py
@@ -1,7 +1,9 @@
"""Simple script to change the order of files.
"""
Simple script to change the order of files.
Used to transform file names (xxx_001.tiff) to reverse their order, to go from
caudo-rostral to rostro-caudal.
Used to transform file names (xxx_001.tiff) to reverse their order, eg.
for a 30 image stack, xxx_001.tiff becomes xxx_030.tiff, xxx_030.tiff becomes
xxx_001.tiff, and so on.
"""

@@ -12,22 +14,27 @@
input_directory = "/path/to/directory" # path to tiff files
output_directory = "/path/to/directory/new" # output directory, must be different
file_extension = ".ome.tiff" # file extension with dots
file_prefix = "mouse0_" # full prefix before the numbering digits
file_prefix = "animal0_" # full prefix before the numbering digits
ndigits = 3 # number of digits for numbering (both inputs and outputs)
dry_run = True # if True, do not actually rename the files

# list available files
list_files = [
filename
for filename in os.listdir(input_directory)
if filename.startswith(file_prefix) & filename.endswith(file_extension)
]

# count files
nfiles = len(list_files)
# reverse indices
new_numbers = np.arange(nfiles, 0, -1)

# create output directory if necessary
if not os.path.isdir(output_directory):
os.mkdir(output_directory)

# loop over images, build new name and rename
for oldi, newi in enumerate(new_numbers):
old_name = f"{file_prefix}{str(oldi + 1).zfill(ndigits)}{file_extension}"
new_name = f"{file_prefix}{str(newi).zfill(ndigits)}{file_extension}"
58 changes: 30 additions & 28 deletions scripts/preprocessing/preprocess_merge_channels.py
@@ -1,16 +1,17 @@
"""
Script for preprocessing.
Merge channels found in wdir/Stack_RIP/ch*_cleaned, and create pyramidal OME-TIFF.
Specify options at the top of file.
`CHANNELS` must be ordered as the channels in the Stack_RIP directory.
To be used after preprocess_split_channels.py and manual review of brain masks.
This script merges channels found in input_dir/ch*_cleaned, and create pyramidal
OME-TIFF ready to be used in QuPath.
Double check channel names and colors.
Specify options at the top of file.
`CHANNELS` must be ordered as the channels in the input_dir directory.
Credits to Christoph Gohlke, see
https://forum.image.sc/t/creating-a-multi-channel-pyramid-ome-tiff-with-tiffwriter-in-python/76424/4
Double check channel names and colors, and run the script.
author : Guillaume Le Goc ([email protected])
version : 2024.11.19
version : 2025.1.14
"""

@@ -24,13 +25,18 @@
from tqdm import tqdm

# --- Parameters
EXPID = "animal0"
# where to find chXX_cleaned folders
INPUT_DIRECTORY = r"E:\projects\histo\data\GN121\images"
# where to save merged images
OUTPUT_DIRECTORY = os.path.join(INPUT_DIRECTORY, "merged_cleaned")

# channels settings : dict mapping channel name to an RGB color. The order must be the
# same as the channels order in the Stack_RIP directory.
# same as the channels order in the input directory.
CHANNELS = {
"CFP": (0, 0, 255),
"EGFP": (0, 255, 0),
"DsRed": (255, 0, 0),
"Cy5": (255, 0, 255),
}

# pyramidal ome-tiff settings
@@ -41,10 +47,8 @@

IN_EXT = "tiff"

# working directory
WDIR = "path/to/data"


# --- Functions
def rgb_to_int(rgb):
"""Convert RGB color tuple to integer for OME-TIFF specs.
Alpha channel is set to 0.
@@ -188,9 +192,10 @@ def im_downscale(img, downfactor, **kwargs):


def process_directory(
expid: str,
levels: tuple,
input_directory: str,
output_directory: str,
channels: dict,
levels: tuple,
):
"""
Merge TIFF stacks representing different channels and create pyramidal OME-TIFF.
@@ -208,22 +213,16 @@ def process_directory(
"""
# --- Preparation
wdir = os.path.abspath(WDIR)

# build directories names
inpdir = os.path.join(wdir, expid, "images")
outdir = os.path.join(wdir, expid, "images", "merged_cleaned_pyramid")

# create directory if it does not exist
if not os.path.isdir(outdir):
os.makedirs(outdir)
if not os.path.isdir(output_directory):
os.makedirs(output_directory)

# list channel directories
chandirslist = [
os.path.join(inpdir, directory)
for directory in os.listdir(inpdir)
os.path.join(input_directory, directory)
for directory in os.listdir(input_directory)
if (
os.path.isdir(os.path.join(wdir, expid, "images", directory))
os.path.isdir(os.path.join(input_directory, directory))
and directory.startswith("ch")
and directory.endswith("cleaned")
)
@@ -256,7 +255,9 @@ def process_directory(
pbar = tqdm(imgslist)
for imgfile in pbar:
# build output image name
imgout = os.path.join(outdir, os.path.splitext(imgfile)[0] + ".ome.tiff")
imgout = os.path.join(
output_directory, os.path.splitext(imgfile)[0] + ".ome.tiff"
)

if os.path.isfile(imgout):
continue
@@ -308,7 +309,8 @@ def process_directory(
# --- Call
if __name__ == "__main__":
process_directory(
EXPID,
LEVELS,
INPUT_DIRECTORY,
OUTPUT_DIRECTORY,
CHANNELS,
LEVELS,
)