lung-ct-classifier

Introduction

The proposed model classifies lung computed tomography (CT) scans into 3 anonymised categories using an ensemble of 3 models: MobileNetV2, Xception and ResNet152V2. This approach achieved 94.52% accuracy on an unseen test set.

Prerequisites

  • python >= 3.7.9
  • virtualenv

Quick Start

The instructions below train and evaluate the proposed model on the lung CT dataset.

  1. Set up the virtual environment.

    virtualenv env
    source env/bin/activate
    pip3 install -r requirements.txt
    
  2. Download the dataset and unzip the files to your desired location.

    unzip -a data.zip
    

    NOTE: Verify that the unzipped files follow this directory structure.

    ├── train_image/
    │    └── train_image/
    │        └── .png
    ├── test_image/
    │    └── test_image/
    │        └── .png
    ├── train_label.csv
    └── ...
    
  3. Assign the path of the dataset unzipped in step 2 to SRC_PATH in the prepare_dataset.sh script.

    vim prepare_dataset.sh
    
  4. Execute the script to train the models and make predictions on the test dataset.

    bash run.sh
    

NOTE: The Lung CT scan dataset is a private dataset.

Logging

This code comes with TensorBoard support for tracking the loss curves and evaluation metrics of the proposed model. Launch TensorBoard with the command below.

tensorboard --logdir=logs
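
The exact callback configuration lives in the training code; as a minimal sketch, assuming the models are trained with tf.keras, per-epoch summaries under logs/ can be written with the standard TensorBoard callback (the run subdirectory name here is an illustrative assumption):

    import tensorflow as tf

    # Sketch only: write per-epoch loss/metric summaries under logs/ so they
    # appear in `tensorboard --logdir=logs`. The run name is hypothetical.
    tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs/mobilenetv2")

    # model.fit(train_ds, validation_data=val_ds, epochs=10,
    #           callbacks=[tensorboard_cb])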

Visualisation and Tuning

This code comes with a companion Jupyter notebook, tune.ipynb, to visualise the output of the data preprocessing and to select the best hyperparameters for training each model.

  1. Complete steps 1-3 from Quick Start above.

  2. Launch Jupyter.

    jupyter lab
    

Approach Summary

  1. The images are first enhanced with Contrast Limited Adaptive Histogram Equalization (CLAHE) to improve the contrast of the CT scans.
  2. Data augmentation (random flips and rotations) is applied to improve the diversity of the small dataset (a minimal sketch of steps 1-2 appears after this list).
  3. Each of the 3 models (MobileNetV2, Xception and ResNet152V2) is pretrained on the ImageNet dataset and used as a feature extractor. The models are then fine-tuned by unfreezing some layers.
  4. K-fold cross validation is used to select the best hyperparameters for each model, which are then used to train an integrated stacked ensemble model (see the ensemble sketch after this list).
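
The preprocessing code itself is not reproduced here. As a minimal sketch of steps 1-2, assuming OpenCV for CLAHE and recent tf.keras preprocessing layers for augmentation (the clip limit, tile size and rotation factor are illustrative assumptions, not the repository's settings):

    import cv2
    import numpy as np
    import tensorflow as tf

    def apply_clahe(image_gray, clip_limit=2.0, tile_grid_size=(8, 8)):
        """Enhance a single-channel CT slice with CLAHE (parameter values are assumptions)."""
        clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
        return clahe.apply(image_gray.astype(np.uint8))

    # Random flips and rotations applied on the fly during training.
    augment = tf.keras.Sequential([
        tf.keras.layers.RandomFlip("horizontal_and_vertical"),
        tf.keras.layers.RandomRotation(0.1),  # rotation factor is an assumption
    ])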
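
Steps 3-4 follow the usual Keras transfer-learning and integrated-stacking pattern. The sketch below shows a single backbone and a hypothetical stacking head; the layer sizes and input resolution are assumptions, and the actual architecture and fine-tuning schedule are defined in the repository's training code:

    import tensorflow as tf

    NUM_CLASSES = 3  # the 3 anonymised categories

    def build_feature_extractor(input_shape=(224, 224, 3)):
        """ImageNet-pretrained backbone used as a frozen feature extractor (step 3)."""
        base = tf.keras.applications.MobileNetV2(
            input_shape=input_shape, include_top=False, weights="imagenet")
        base.trainable = False  # later, some layers are unfrozen for fine-tuning
        inputs = tf.keras.Input(shape=input_shape)
        x = base(inputs, training=False)
        x = tf.keras.layers.GlobalAveragePooling2D()(x)
        outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
        return tf.keras.Model(inputs, outputs)

    def build_stacked_ensemble(members):
        """Integrated stacking (step 4): concatenate member outputs, learn a meta head."""
        inputs = tf.keras.Input(shape=members[0].input_shape[1:])
        merged = tf.keras.layers.Concatenate()([m(inputs) for m in members])
        x = tf.keras.layers.Dense(16, activation="relu")(merged)  # head size is an assumption
        outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
        return tf.keras.Model(inputs, outputs)

When the stacking head is trained, the member models are typically kept frozen so that only the meta-learner's weights are updated.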