Multi-Label Image Classification

This project builds a multi-label classifier that assigns up to 10 location labels to each image in a Hanoi tourist dataset, covering landmarks such as Ho Guom (Sword Lake) and Ho Tay (West Lake). It fine-tunes four pre-trained models (VGG16, GoogleNet, ResNet50, and ViT) and, despite the small dataset, reaches 94% to 97% binary accuracy by combining data augmentation with a weighted binary cross-entropy loss.

Loss function

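A weighted binary cross-entropy loss is used to address class imbalance and to optimize the multi-label predictions: each label is scored as a separate binary decision, and the per-label terms are weighted so that rare labels are not swamped by frequent ones.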
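One common form of weighted binary cross-entropy is sketched below in plain Python. The README does not state the exact weighting scheme, so the choice here is an assumption: each positive term is scaled by a per-label weight equal to the ratio of negative to positive examples for that label. This mirrors the `pos_weight` convention of `torch.nn.BCEWithLogitsLoss`, though the project's notebooks may do something different. The function names are hypothetical.

```python
import math

def pos_weights_from_labels(y_true):
    """Per-label weight = (#negatives / #positives). Assumed scheme, not the project's."""
    n_samples = len(y_true)
    n_labels = len(y_true[0])
    weights = []
    for c in range(n_labels):
        n_pos = sum(row[c] for row in y_true)
        n_neg = n_samples - n_pos
        # Fall back to 1.0 when a label never occurs, to avoid division by zero.
        weights.append(n_neg / n_pos if n_pos > 0 else 1.0)
    return weights

def weighted_bce(y_true, y_prob, pos_weight, eps=1e-7):
    """Mean over all (sample, label) pairs of
    -( w_c * y * log(p) + (1 - y) * log(1 - p) )."""
    total = 0.0
    count = 0
    for labels, probs in zip(y_true, y_prob):
        for y, p, w in zip(labels, probs, pos_weight):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return total / count

# Toy example: 4 images, 2 labels; the first label is rarer than the second.
labels = [[1, 0], [0, 0], [0, 1], [0, 1]]
probs = [[0.6, 0.2], [0.1, 0.3], [0.2, 0.7], [0.3, 0.9]]
weights = pos_weights_from_labels(labels)
loss = weighted_bce(labels, probs, weights)
```

With all weights set to 1.0 the function reduces to ordinary binary cross-entropy, which makes it easy to compare the weighted and unweighted losses on the same predictions.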

Methodology

Dataset

Size: 200 images, each with up to 10 labels.
Data Augmentation:
- Horizontal flipping.
- Random adjustments to brightness, saturation, and contrast.
Purpose: Enhance the small dataset to improve model generalization.
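The augmentations listed above can be illustrated with a small dependency-free sketch. It represents an image as rows of `(r, g, b)` tuples with values in `[0, 1]`; the jitter range, the flip probability, and all function names are assumptions made for illustration. In practice a library transform such as torchvision's `RandomHorizontalFlip` and `ColorJitter` would normally do this work, and the README does not say which library the notebooks use.

```python
import random

def clamp01(x):
    return min(1.0, max(0.0, x))

def luminance(px):
    r, g, b = px
    return 0.299 * r + 0.587 * g + 0.114 * b

def hflip(img):
    # Mirror each row left to right; the image's labels stay unchanged.
    return [row[::-1] for row in img]

def adjust_brightness(img, factor):
    return [[tuple(clamp01(c * factor) for c in px) for px in row] for row in img]

def adjust_saturation(img, factor):
    # Blend each pixel with its own gray value (factor 0 gives grayscale).
    out = []
    for row in img:
        new_row = []
        for px in row:
            gray = luminance(px)
            new_row.append(tuple(clamp01(gray + factor * (c - gray)) for c in px))
        out.append(new_row)
    return out

def adjust_contrast(img, factor):
    # Blend each pixel with the mean gray level of the whole image.
    pixels = [px for row in img for px in row]
    mean = sum(luminance(px) for px in pixels) / len(pixels)
    return [[tuple(clamp01(mean + factor * (c - mean)) for c in px) for px in row]
            for row in img]

def augment(img, rng, flip_prob=0.5, jitter=0.2):
    """Random flip plus random brightness, saturation, and contrast changes."""
    if rng.random() < flip_prob:
        img = hflip(img)
    img = adjust_brightness(img, rng.uniform(1 - jitter, 1 + jitter))
    img = adjust_saturation(img, rng.uniform(1 - jitter, 1 + jitter))
    img = adjust_contrast(img, rng.uniform(1 - jitter, 1 + jitter))
    return img

# A 1x2 toy image; a seeded generator keeps the example reproducible.
tiny = [[(0.1, 0.2, 0.3), (0.9, 0.8, 0.7)]]
augmented = augment(tiny, random.Random(0))
```

Because a fresh random variant is drawn each time an image is loaded, the model rarely sees exactly the same pixels twice, which is the point of augmenting a 200-image dataset.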

Model Architecture

The project fine-tunes pre-trained models for multi-label classification:

Convolutional Base: The pre-trained ConvNet layers are reused, either frozen or fine-tuned with a low learning rate, so that they adapt to the dataset.
Classifier: The original classifier is replaced with new fully connected layers that produce the 10-label output.
Models:
- VGG16: Deep 16-layer convolutional network.
- GoogleNet: Inception-based network designed for efficient feature extraction.
- ResNet50: 50-layer residual network with skip connections.
- ViT: Vision Transformer, which applies the transformer architecture to image patches.
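The classifier replacement can be checked with simple arithmetic for VGG16. The sketch below counts the parameters of the standard VGG16 layout (thirteen 3x3 convolutions followed by three fully connected layers, 224x224 input) and varies only the size of the final layer. With 10 outputs the total comes to 134,301,514, the figure reported for VGG16 in the results table, which is consistent with only the last fully connected layer having been swapped. That reading is an inference from the numbers, not something the notebooks are confirmed to do, and the function name is hypothetical.

```python
# Standard VGG16 convolutional configuration; "M" marks a 2x2 max-pool.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

def vgg16_param_count(num_outputs):
    """Count weights and biases of VGG16 with a final layer of `num_outputs` units."""
    total = 0
    in_channels = 3
    for item in VGG16_CFG:
        if item == "M":
            continue  # pooling layers have no parameters
        # 3x3 kernel weights plus one bias per output channel
        total += 3 * 3 * in_channels * item + item
        in_channels = item

    # After five poolings a 224x224 input becomes a 7x7x512 feature map.
    flat = 7 * 7 * 512
    for fan_in, fan_out in [(flat, 4096), (4096, 4096), (4096, num_outputs)]:
        total += fan_in * fan_out + fan_out
    return total

imagenet_params = vgg16_param_count(1000)  # original 1000-class head
project_params = vgg16_param_count(10)     # 10-label head
saved = imagenet_params - project_params   # parameters removed by the swap
```

The same check does not carry over directly to the other three models, whose reported counts imply larger or differently shaped heads that the README does not describe.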

Result

Figure: training and test loss curves for the models.

The models were evaluated on binary accuracy, overall accuracy, training time, inference time, and parameter count. Results are summarized below:

| Model | Binary Accuracy | Accuracy | Training Time (s) | Inference Time (s) | Number of Parameters |
| --- | --- | --- | --- | --- | --- |
| VGG16 | 94% | 70% | 485.7 | 0.017 | 134,301,514 |
| GoogleNet | 95% | 74% | 219.7 | 0.012 | 26,620,586 |
| ResNet50 | 97% | 82% | 413.8 | 0.014 | 48,723,018 |
| ViT | 95% | 70% | 400.0 | 0.0153 | 90,072,586 |
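The gap between the two accuracy columns is easier to interpret with the metrics written out. The sketch below assumes that "Binary Accuracy" is the fraction of individual label decisions that are correct and that "Accuracy" is the stricter exact-match rate, where an image counts only if all of its labels are right. The README does not define either metric, so both definitions, the 0.5 threshold, and the function names are assumptions.

```python
def binarize(probs, threshold=0.5):
    """Turn per-label probabilities into 0/1 predictions."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]

def binary_accuracy(y_true, y_pred):
    """Fraction of individual (image, label) decisions that are correct."""
    correct = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            correct += int(t == p)
            total += 1
    return correct / total

def exact_match_accuracy(y_true, y_pred):
    """Fraction of images whose entire label vector is predicted correctly."""
    hits = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Toy example: 2 images, 3 labels. One label of the first image is missed.
truth = [[1, 0, 1], [0, 1, 0]]
scores = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.3]]
preds = binarize(scores)

per_label = binary_accuracy(truth, preds)        # 5 of 6 decisions correct
per_image = exact_match_accuracy(truth, preds)   # 1 of 2 images fully correct
```

Under these definitions a single wrong label lowers binary accuracy only slightly but costs the whole image in exact-match accuracy, which would explain why the second column sits well below the first for every model.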


quachthetruong/image_multilabel
