An implementation of an image processing pipeline and machine learning algorithms capable of identifying acanthocytes in blood. The first simple implementation follows the method described in the following paper. Currently, the code is being improved toward a more complete approach to detecting and classifying these abnormal cells, in order to produce more precise results.
The pre-processing pipeline applies the following steps:
- Convert image to gray scale
- Apply 9x9 median filter to remove noise
- Convert to binary using Otsu thresholding method
- Apply a filling operation to remove holes
- Apply morphological reconstruction (elliptic-shaped 9x9 kernel) to remove the medium-sized noise
- Finally, apply the Canny edge detector to extract region contours
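To make two of the steps above more concrete, here is a minimal, dependency-free sketch of a median filter and Otsu thresholding. It is an illustration only and is not taken from the project's source: the `GrayImage` type and the functions `medianFilter`, `otsuThreshold`, and `binarize` are hypothetical names, and border handling by coordinate clamping is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major 8-bit grayscale image (hypothetical type, for illustration only).
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // size == width * height
};

// Median filter with a ksize x ksize window (ksize must be odd).
// Assumption: borders are handled by clamping coordinates to the image.
GrayImage medianFilter(const GrayImage& src, int ksize) {
    assert(ksize > 0 && ksize % 2 == 1);
    const int r = ksize / 2;
    GrayImage dst = src;
    std::vector<std::uint8_t> window;
    window.reserve(static_cast<std::size_t>(ksize) * ksize);
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            window.clear();
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int yy = std::clamp(y + dy, 0, src.height - 1);
                    const int xx = std::clamp(x + dx, 0, src.width - 1);
                    window.push_back(
                        src.pixels[static_cast<std::size_t>(yy) * src.width + xx]);
                }
            }
            // nth_element places the median at the middle position.
            auto mid = window.begin() + window.size() / 2;
            std::nth_element(window.begin(), mid, window.end());
            dst.pixels[static_cast<std::size_t>(y) * src.width + x] = *mid;
        }
    }
    return dst;
}

// Otsu's method: pick the threshold t that maximizes the between-class
// variance of the two classes {v <= t} and {v > t}.
int otsuThreshold(const GrayImage& img) {
    std::array<double, 256> hist{};
    for (std::uint8_t v : img.pixels) hist[v] += 1.0;
    const double total = static_cast<double>(img.pixels.size());
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += i * hist[i];

    double sumBg = 0.0, weightBg = 0.0, bestVar = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightBg += hist[t];
        sumBg += t * hist[t];
        if (weightBg == 0.0) continue;   // no pixels in the lower class yet
        const double weightFg = total - weightBg;
        if (weightFg == 0.0) break;      // upper class is empty from here on
        const double diff = sumBg / weightBg - (sumAll - sumBg) / weightFg;
        const double var = weightBg * weightFg * diff * diff;
        if (var > bestVar) { bestVar = var; best = t; }
    }
    return best;
}

// Map pixels above the threshold to 255 and all others to 0.
GrayImage binarize(const GrayImage& img, int threshold) {
    GrayImage out = img;
    for (auto& v : out.pixels) v = (v > threshold) ? 255 : 0;
    return out;
}
```

A typical call sequence under these assumptions would be `binarize(filtered, otsuThreshold(filtered))` where `filtered = medianFilter(gray, 9)`. An image library would normally provide optimized versions of both operations.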
Features extracted:
- Histogram from the chain code
- Circularity
- Roundness
- Aspect-ratio
- Solidity
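The features listed above can be sketched as follows. This assumes the commonly used shape-descriptor definitions (circularity = 4π·area/perimeter², roundness = 4·area/(π·major axis²), aspect ratio = major axis/minor axis, solidity = area/convex hull area) and an 8-direction Freeman chain code; the project may define or normalize them differently. The `ShapeMeasurements` struct and all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

// Raw measurements of one region (hypothetical container, for illustration).
struct ShapeMeasurements {
    double area;        // region area
    double perimeter;   // contour length
    double majorAxis;   // major axis of the fitted ellipse
    double minorAxis;   // minor axis of the fitted ellipse
    double convexArea;  // area of the convex hull
};

double circularity(const ShapeMeasurements& s) {
    assert(s.perimeter > 0.0);
    return 4.0 * kPi * s.area / (s.perimeter * s.perimeter);
}

double roundness(const ShapeMeasurements& s) {
    assert(s.majorAxis > 0.0);
    return 4.0 * s.area / (kPi * s.majorAxis * s.majorAxis);
}

double aspectRatio(const ShapeMeasurements& s) {
    assert(s.minorAxis > 0.0);
    return s.majorAxis / s.minorAxis;
}

double solidity(const ShapeMeasurements& s) {
    assert(s.convexArea > 0.0);
    return s.area / s.convexArea;
}

// Normalized histogram of the 8-direction Freeman chain code of a closed
// contour given as (x, y) pixel coordinates, with y growing downward.
// Codes: 0 = east, then counter-clockwise on screen up to 7 = south-east.
// Steps that are not between 8-connected neighbors are skipped.
std::array<double, 8> chainCodeHistogram(
    const std::vector<std::pair<int, int>>& contour) {
    static constexpr int kCode[3][3] = {
        {3, 2, 1},   // dy = -1, dx = -1, 0, +1
        {4, -1, 0},  // dy =  0
        {5, 6, 7}};  // dy = +1
    std::array<double, 8> hist{};
    double steps = 0.0;
    const std::size_t n = contour.size();
    for (std::size_t i = 0; i < n; ++i) {
        const auto& [x0, y0] = contour[i];
        const auto& [x1, y1] = contour[(i + 1) % n];  // wrap to close the contour
        const int dx = x1 - x0;
        const int dy = y1 - y0;
        if (std::abs(dx) > 1 || std::abs(dy) > 1 || (dx == 0 && dy == 0)) continue;
        hist[kCode[dy + 1][dx + 1]] += 1.0;
        steps += 1.0;
    }
    if (steps > 0.0) {
        for (double& h : hist) h /= steps;
    }
    return hist;
}
```

Under these definitions a perfect circle has circularity and roundness equal to 1, while spiky cells such as acanthocytes would be expected to score lower on circularity and solidity.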
Algorithms implemented:
- kNN
- Logistic Regression
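As an illustration of the kNN classifier, the sketch below classifies a feature vector by majority vote among its k nearest training samples under a Minkowski distance of order p, which corresponds to the `-k` and `-d` options of the `train` program. The `Sample` struct, the function names, the label encoding, and the tie-breaking rule (smallest label wins) are assumptions, not the project's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One labeled training instance (hypothetical type).
struct Sample {
    std::vector<double> features;
    int label;  // assumed encoding, e.g. 0 = healthy, 1 = anomalous
};

// Minkowski distance of order p (p = 1: Manhattan, p = 2: Euclidean).
double minkowski(const std::vector<double>& a, const std::vector<double>& b,
                 double p) {
    assert(a.size() == b.size() && p >= 1.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::pow(std::abs(a[i] - b[i]), p);
    }
    return std::pow(sum, 1.0 / p);
}

// Majority vote among the k nearest training samples.
// Ties are broken in favor of the smallest label (an arbitrary choice).
int knnClassify(const std::vector<Sample>& train,
                const std::vector<double>& query, std::size_t k, double p) {
    assert(!train.empty() && k > 0);
    k = std::min(k, train.size());

    std::vector<std::pair<double, int>> dist;  // (distance, label)
    dist.reserve(train.size());
    for (const Sample& s : train) {
        dist.emplace_back(minkowski(s.features, query, p), s.label);
    }
    // Only the k smallest distances need to be ordered.
    std::partial_sort(dist.begin(), dist.begin() + k, dist.end());

    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[dist[i].second];

    int bestLabel = votes.begin()->first;
    int bestCount = 0;
    for (const auto& [label, count] : votes) {
        if (count > bestCount) {
            bestCount = count;
            bestLabel = label;
        }
    }
    return bestLabel;
}
```

With the documented defaults (k = 1, p = 2) this reduces to assigning the label of the single closest training sample under Euclidean distance.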
Other algorithms planned for comparing results (future implementation):
- Naive Bayes
- Decision Tree
- Random Forest
- Support Vector Machine
- Neural Network
The published paper is available on this link.
The code requires the following libraries:
The code also uses two other libraries; however, they are distributed as single-header drop-ins:
- nlohmann/json for json manipulation
- adishavit/argh for argument manipulation
Finally, the code was written with C++17 features, which give access to filesystem functionality independent of the operating system.
Special care was taken to improve the portability of the code.
The project provides a Makefile for compiling the code. It should work on most Linux distributions.
The code comprises two main programs:
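As a sketch of the kind of portable directory handling that C++17 `std::filesystem` enables, the snippet below collects the image files in a folder. The function names and the list of accepted extensions are assumptions made for illustration and are not taken from the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Case-insensitive check against a few common image extensions
// (the exact list is an assumption).
bool hasImageExtension(const fs::path& file) {
    std::string ext = file.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return ext == ".png" || ext == ".jpg" || ext == ".jpeg" ||
           ext == ".bmp" || ext == ".tif" || ext == ".tiff";
}

// Return the regular image files directly inside dir, sorted by path.
// A missing or non-directory path yields an empty list.
std::vector<fs::path> listImages(const fs::path& dir) {
    assert(!dir.empty());  // an empty path is treated as a caller error
    std::vector<fs::path> images;
    if (!fs::is_directory(dir)) return images;
    for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file() && hasImageExtension(entry.path())) {
            images.push_back(entry.path());
        }
    }
    std::sort(images.begin(), images.end());
    return images;
}
```

For example, `listImages("./resources/test/")` would return the test images on any platform with a C++17 standard library, without platform-specific directory APIs.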
- train: used to create a kNN model
- main: uses the previously learned model to classify several medical images.
To facilitate the execution of the code, the project already provides a file structure:
.
+-- resources
|   +-- model -> where the kNN models are stored
|   +-- test -> where the images used for testing are stored
|   +-- train
|       +-- bad -> where the anomalous instances are stored
|       +-- good -> where the healthy instances are stored
Finally, each main program has several parameters. The help message of each one is printed below:
$ ./train -h
Program used to train a kNN model to identify anomalous blood cells.
usage: train [-p] [-k] [-i] [-o] [-h]
Parameters:
-p, the preprocessig method [default = 0]
-m, ML model (0 - ARFF; 1 - KNN; 2 - LR) [default = 0]
-k, the number of nearest neighbors [default = 1]
-d, Minkowski distance of order p [default = 2]
-i, the input folder with images to train [default = './resources/train/']
-o, the output model [default = './resources/model/model.json']
-v, verbose
-h, this help message
$ ./main -h
Program used to identify anomalous blood cells.
usage: main [-p] [-k] [-i] [-o] [-h]
Parameters:
-p, the preprocessig method [default = 0]
-m, the classification model [default = './resources/model/model.json']
-i, the folder with images to classify [default = './resources/test/']
-v, verbose
-h, this help message
- Catarina Silva - catarinaacsilva
This project is licensed under the MIT License; see the LICENSE file for details.