Interpretable-loss-functions-for-deep-learing-based-biomedical-image-segmentation-models

Members : PyeongEun Kim

Supervisors : Utku Ozbulak, Prof. Arnout Van Messem, Prof. Wesley De Neve

Technical Report

Detailed information about this repository can be found in the technical report (not yet complete), which can be downloaded from Dropbox. Click here to download the file.

Description

After the initial breakthrough performance of encoder-decoder style deep learning models, multiple loss functions, such as the cross-entropy loss, the focal loss, and the Dice loss, were proposed to improve the effectiveness of these models in the segmentation of biomedical images. However, despite their critical role, research on the interpretability of these loss functions is lacking. As a result, there is no clear answer as to which loss function is most suitable for training biomedical image segmentation models. To improve the understanding of loss functions, we aim to investigate the nature of different loss functions. We will also propose a visual tool that illustrates the loss surfaces of different loss functions throughout the training of a deep segmentation model.


Dataset

We obtained the dataset from the paper “Estimation of the Relative Amount of Hemoglobin in the Cup and Neuroretinal Rim Using Stereoscopic Color Fundus Images”. An example eye image and its corresponding ground truth mask are shown in the figure below. Each eye image is an RGB image of 428x569 pixels (width x height). The dataset contains 159 eye images, each with a ground truth mask that separates the optic disk from the rest of the image. White pixels in the mask mark the optic disk area of the eye image, and black pixels mark the non-optic-disk area. The dataset is split into a training set of 150 images and a test set of 9 images. Note that the data is class-imbalanced: optic disk pixels make up only 10% of the image.
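To make the class imbalance concrete, the sketch below measures the foreground fraction of a binary mask and derives one possible pair of class weights from it. The function names and the inverse-frequency weighting scheme are illustrative assumptions, not necessarily the weights used in this project.

```python
def class_fractions(mask):
    """Return (background_fraction, foreground_fraction) of a binary mask.

    `mask` is a list of rows; each pixel is 1 (white, optic disk) or 0 (black).
    """
    total = sum(len(row) for row in mask)
    foreground = sum(sum(row) for row in mask)
    fg_fraction = foreground / total
    return 1.0 - fg_fraction, fg_fraction


def inverse_frequency_weights(bg_fraction, fg_fraction):
    """Assumed scheme: weight each class by 1 / frequency, normalised to sum to 1."""
    inv_bg, inv_fg = 1.0 / bg_fraction, 1.0 / fg_fraction
    norm = inv_bg + inv_fg
    return inv_bg / norm, inv_fg / norm


# Toy 2x5 mask with 1 foreground pixel out of 10, mimicking a 10% foreground share.
toy_mask = [[0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0]]
bg, fg = class_fractions(toy_mask)
w_bg, w_fg = inverse_frequency_weights(bg, fg)
```

With a 10% foreground share, this scheme gives the optic disk class roughly nine times the weight of the background class.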

Model

Architecture

The model we used is U-Net (original paper [1]), one of the first groundbreaking encoder-decoder style deep learning models for image segmentation. The architecture of the U-Net model is illustrated in the figure below.

Loss functions

In this project, we used two of the most popular loss functions: the cross-entropy loss and the focal loss.

Cross-entropy loss function
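
As a rough illustration, a class-weighted binary cross-entropy can be sketched in plain Python as follows. The function name, the per-class weights `w_bg` and `w_fg`, and the choice to average over the number of pixels are assumptions made for this example. The project itself uses PyTorch, whose built-in losses may normalise differently.

```python
import math


def weighted_cross_entropy(probs, targets, w_bg=1.0, w_fg=1.0, eps=1e-12):
    """Mean class-weighted cross-entropy over a flat list of pixels.

    probs   : predicted probability that each pixel is foreground (white)
    targets : 1 for foreground, 0 for background
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -w_fg * math.log(p)
        else:
            total += -w_bg * math.log(1.0 - p)
    return total / len(probs)


# One foreground pixel predicted at 0.5 with unit weights: the loss is ln 2.
loss = weighted_cross_entropy([0.5], [1])
```

Raising `w_fg` increases the penalty for misclassified optic disk pixels, which is one common way to counter class imbalance.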

Focal loss function
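
The focal loss scales the cross-entropy term of each pixel by (1 - p_t)^γ, where p_t is the probability the model assigns to the true class, so that well-classified pixels contribute less. The sketch below is a minimal plain-Python version for the binary case. The function name and the default `gamma=2.0` are assumptions for illustration, not the settings used in this project.

```python
import math


def focal_loss(probs, targets, gamma=2.0, eps=1e-12):
    """Mean binary focal loss over a flat list of pixels.

    probs   : predicted probability that each pixel is foreground (white)
    targets : 1 for foreground, 0 for background
    gamma   : focusing parameter; gamma = 0 reduces to plain cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p    # probability of the true class
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


# A confidently correct pixel (p_t = 0.9) is down-weighted by (0.1)^2 when gamma = 2.
easy_ce = focal_loss([0.9], [1], gamma=0.0)
easy_fl = focal_loss([0.9], [1], gamma=2.0)
```

Larger values of γ shift more of the total loss toward pixels that the model currently gets wrong.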

Results

We evaluated the difference between the loss functions by observing the performance of the model. The performance of the model is measured by three criteria: pixel-wise accuracy, intersection-over-union (IoU), and prediction confidence. We focused on comparing these criteria between the cross-entropy loss with different class weights and the focal loss with different values of γ. The results are illustrated in the graphs below.

Accuracy of prediction
Intersection-over-union (IoU)
Prediction confidence of black pixels
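
The three criteria can be sketched for a single flattened image as shown below. The 0.5 decision threshold and the definition of prediction confidence (the mean probability assigned to the true class, computed separately for white and black pixels) are assumptions made for this example; the technical report may define them differently.

```python
def evaluate(probs, targets, threshold=0.5):
    """Return pixel accuracy, IoU, and per-class prediction confidence.

    probs   : predicted probability that each pixel is foreground (white)
    targets : 1 for foreground (white), 0 for background (black)
    """
    preds = [1 if p >= threshold else 0 for p in probs]
    pairs = list(zip(preds, targets))

    accuracy = sum(1 for yhat, y in pairs if yhat == y) / len(pairs)

    intersection = sum(1 for yhat, y in pairs if yhat == 1 and y == 1)
    union = sum(1 for yhat, y in pairs if yhat == 1 or y == 1)
    iou = intersection / union if union else 1.0  # both empty: perfect match

    # Assumed definition: mean probability given to the true class.
    white = [p for p, y in zip(probs, targets) if y == 1]
    black = [1.0 - p for p, y in zip(probs, targets) if y == 0]

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "accuracy": accuracy,
        "iou": iou,
        "confidence_white": mean(white),
        "confidence_black": mean(black),
    }


# Four pixels: one true foreground pixel and one false positive.
metrics = evaluate([0.9, 0.6, 0.2, 0.1], [1, 0, 0, 0])
```

In this toy case the accuracy is 0.75 while the IoU is only 0.5, which shows why IoU is the more informative criterion when foreground pixels are rare.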

Visualization of loss surfaces

Refer to the loss surface visualization section of the technical report (the report can be downloaded from the link above).

Dependency

The following modules are used in the project:

* python >= 3.6
* numpy >= 1.14.5
* torch >= 0.4.0
* Pillow (PIL) >= 5.2.0
* matplotlib >= 2.2.2

References

[1] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation, http://arxiv.org/pdf/1505.04597.pdf

[2] P. Y. Simard, D. Steinkraus, and J. C. Platt. Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis, http://cognitivemedium.com/assets/rmnist/Simard.pdf