This project aims to build a neural network that classifies superpixels generated by the SLIC algorithm. The predicted class labels can then be used to generate a semantic segmentation map of the image.
The dataset used for this project is the MSRCv1 dataset.
The steps involved are as follows:
For each image:
- 100 superpixels are generated using the SLIC algorithm.
- For each superpixel region, the best-fitting rectangle is found and the region is dilated by 3 pixels.
- This region is extracted from the original image and saved as an .npy file along with the class ID for that region.
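The patch extraction step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `extract_patches` and the `pad` parameter are hypothetical, and it assumes that "dilated by 3 pixels" means growing the bounding rectangle by 3 pixels on each side (clipped to the image border).

```python
import numpy as np


def extract_patches(image, segments, pad=3):
    """Return a list of (segment_id, patch) pairs, one per superpixel.

    image    : H x W x C array (the original image)
    segments : H x W integer label map, one label per superpixel
    pad      : pixels added around each bounding rectangle (assumed meaning
               of "dilated by 3 pixels")
    """
    height, width = segments.shape
    patches = []
    for seg_id in np.unique(segments):
        rows, cols = np.nonzero(segments == seg_id)
        # Tight bounding rectangle, grown by `pad` and clipped to the image.
        top = max(rows.min() - pad, 0)
        bottom = min(rows.max() + pad + 1, height)
        left = max(cols.min() - pad, 0)
        right = min(cols.max() + pad + 1, width)
        patches.append((int(seg_id), image[top:bottom, left:right].copy()))
    return patches
```

One way to obtain `segments` is `skimage.segmentation.slic(image, n_segments=100)`, which returns an integer label map; note that `n_segments` is an approximate target, so the actual count may differ slightly. Each patch could then be written out with `np.save`.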
A neural network was trained in PyTorch to classify each of the superpixels.
Network training definition:
- A standard VGG16 network with pretrained ImageNet weights was loaded.
- The last layer was removed and a few linear layers were added to enable transfer learning.
- Cross-entropy loss was used as the loss function. Stochastic gradient descent with momentum was used as the optimizer.
The trained neural network then takes each superpixel patch as an input and outputs a class ID for it. This can be used to create a semantic segmentation of the input image.
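As a rough illustration of that last step, the sketch below paints each superpixel with its predicted class ID. The function name `labels_to_segmentation`, the `class_by_segment` dictionary, and the `unlabeled` fill value are hypothetical names chosen for this example.

```python
import numpy as np


def labels_to_segmentation(segments, class_by_segment, unlabeled=-1):
    """Build a per-pixel class map from per-superpixel predictions.

    segments         : H x W integer label map from the superpixel step
    class_by_segment : dict mapping superpixel label -> predicted class ID
    unlabeled        : value used for superpixels with no prediction
    """
    seg_map = np.full(segments.shape, unlabeled, dtype=np.int64)
    for seg_id, class_id in class_by_segment.items():
        # Every pixel in this superpixel receives the same class ID.
        seg_map[segments == seg_id] = class_id
    return seg_map
```

The resulting array has the same height and width as the input image and can be colour-mapped for display.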
Further details and the code can be found here.