This document explains how the code in this repository can be used to produce the results reported in the following paper:
Deep Learning on Small Datasets without Pre-Training using Cosine Loss.
Björn Barz and Joachim Denzler.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2020.
According to Table 2 in the paper:
Loss Function | CUB | NAB | Cars | Flowers | MIT 67 Scenes | CIFAR-100 |
---|---|---|---|---|---|---|
cross entropy | 51.9% | 59.4% | 78.2% | 67.3% | 44.3% | 77.0% |
cross entropy + label smoothing | 55.9% | 68.3% | 78.1% | 66.8% | 38.7% | 77.5% |
cosine loss | 67.6% | 71.7% | 84.3% | 71.1% | 51.5% | 75.3% |
cosine loss + cross entropy | 68.0% | 71.9% | 85.0% | 70.6% | 52.7% | 76.4% |
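The cosine loss referred to in the table maximizes the cosine similarity between the L2-normalized output of the network and the target vector (a one-hot vector or a semantic class embedding), instead of applying softmax and cross-entropy. The following is a minimal NumPy sketch of that loss for illustration only; the actual implementation used for training lives in the scripts of this repository:

```python
import numpy as np

def cosine_loss(pred, target, eps=1e-9):
    """Cosine loss: 1 - cosine similarity between the L2-normalized
    network output and the target embedding (one-hot or semantic).

    pred, target: arrays of shape (batch_size, embedding_dim).
    """
    pred = pred / (np.linalg.norm(pred, axis=-1, keepdims=True) + eps)
    target = target / (np.linalg.norm(target, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(pred * target, axis=-1)))

# Example with a one-hot target for a 3-class problem:
# cosine_loss(np.array([[2.0, 0.5, 0.1]]), np.array([[1.0, 0.0, 0.0]]))
```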
The code has the following dependencies:

- Python >= 3.5
- numpy
- numexpr
- keras >= 2.2.0
- tensorflow (we used v1.8)
- sklearn
- scipy
- pillow
The following datasets have been used in the paper:
- Caltech UCSD Birds-200-2011 (CUB)
- North American Birds (NAB-large)
- Stanford Cars (Cars)
- Oxford Flowers-102 (Flowers)
- MIT 67 Indoor Scenes (MIT67Scenes)
- CIFAR-100 (CIFAR-100)
The names in parentheses specify the dataset names that can be passed to the scripts mentioned below.
In the following example script calls, replace `$DS` with the name of the dataset (see above), `$DSROOT` with the path to that dataset, and `$LR` with the maximum learning rate for SGDR. To save the model after training has completed, add `--model_dump` followed by the filename to which the model definition and weights should be written.
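`--sgdr_max_lr` is the upper bound of the learning-rate schedule of SGDR (stochastic gradient descent with warm restarts), which anneals the learning rate from that maximum towards a minimum along a cosine curve and restarts periodically. A rough sketch of such a schedule is shown below; the minimum learning rate, cycle length, and cycle multiplier are illustrative assumptions and not necessarily the values used by the training scripts:

```python
import math

def sgdr_learning_rate(epoch, max_lr, min_lr=1e-6, cycle_len=12, cycle_mult=2):
    """Cosine-annealed learning rate with warm restarts (SGDR).

    max_lr corresponds to --sgdr_max_lr; min_lr, cycle_len, and
    cycle_mult are illustrative defaults, not the scripts' settings.
    """
    # Locate the current epoch within its restart cycle.
    t, length = epoch, cycle_len
    while t >= length:
        t -= length
        length *= cycle_mult
    # Anneal from max_lr down to min_lr over the course of the cycle.
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * t / length))
```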
To train a classifier with the standard softmax + cross-entropy loss, use `learn_classifier.py`:

```bash
python learn_classifier.py \
    --dataset $DS --data_root $DSROOT --sgdr_max_lr $LR \
    --architecture resnet-50 --batch_size 96 \
    --gpus 4 --read_workers 16 --queue_size 32 --gpu_merge
```
For label smoothing, add `--label_smoothing 0.1`.
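Label smoothing with a factor of 0.1 blends the hard one-hot targets with a uniform distribution over all classes before the cross-entropy is computed. A small illustrative sketch (not the script's actual code):

```python
import numpy as np

def smooth_labels(onehot, smoothing=0.1):
    """Blend one-hot targets with a uniform distribution over classes."""
    num_classes = onehot.shape[-1]
    return onehot * (1.0 - smoothing) + smoothing / num_classes

# With 4 classes and smoothing=0.1, [0, 1, 0, 0] becomes
# [0.025, 0.925, 0.025, 0.025].
```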
To train with the cosine loss, use `learn_image_embeddings.py`:

```bash
python learn_image_embeddings.py \
    --dataset $DS --data_root $DSROOT --sgdr_max_lr $LR \
    --embedding onehot --architecture resnet-50 --batch_size 96 \
    --gpus 4 --read_workers 16 --queue_size 32 --gpu_merge
```
For the combined cosine + cross-entropy loss, add `--cls_weight 0.1`.
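Conceptually, `--cls_weight` adds a weighted categorical cross-entropy term, computed on an additional classification layer, on top of the cosine loss. A rough NumPy sketch of such a combined objective (an illustration of the idea only, not the repository's network or loss code):

```python
import numpy as np

def cosine_plus_xent(embedding, cls_logits, target_embedding, target_onehot,
                     cls_weight=0.1, eps=1e-9):
    """Cosine loss on the embedding plus cls_weight times the softmax
    cross-entropy on separate classification logits."""
    # Cosine loss between the L2-normalized prediction and the target embedding.
    emb = embedding / (np.linalg.norm(embedding, axis=-1, keepdims=True) + eps)
    tgt = target_embedding / (np.linalg.norm(target_embedding, axis=-1, keepdims=True) + eps)
    cos_loss = 1.0 - np.sum(emb * tgt, axis=-1)
    # Numerically stable softmax cross-entropy on the classification logits.
    logits = cls_logits - cls_logits.max(axis=-1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    xent = -np.sum(target_onehot * log_probs, axis=-1)
    return float(np.mean(cos_loss + cls_weight * xent))
```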
To use semantic embeddings instead of one-hot vectors, pass the path to one of the embedding files in the `embeddings` directory to `--embedding` instead of `onehot`.
For the CIFAR-100 dataset, use the following parameters:
Cross-entropy loss:

```bash
python learn_classifier.py \
    --dataset CIFAR-100 --data_root $DSROOT --sgdr_max_lr $LR \
    --architecture resnet-110-wfc --batch_size 100
```

Cosine loss:

```bash
python learn_image_embeddings.py \
    --dataset CIFAR-100 --data_root $DSROOT --sgdr_max_lr $LR \
    --embedding onehot --architecture resnet-110-wfc --batch_size 100
```
For each dataset and loss function, we fine-tuned the learning rate individually by wrapping the training script calls into a bash loop like the following (here shown for training with the cosine loss on CIFAR-100 as an example):
```bash
for LR in 2.5 1.0 0.5 0.1 0.05 0.01 0.005 0.001; do
    echo $LR
    python learn_image_embeddings.py \
        --dataset CIFAR-100 --data_root $DSROOT --sgdr_max_lr $LR \
        --embedding onehot --architecture resnet-110-wfc --batch_size 100 \
        2>/dev/null | grep -oP "val_(prob_)?acc: \K([0-9.]+)" | sort -n | tail -n 1
done
```
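The `grep | sort | tail` pipeline simply extracts the highest validation accuracy printed by Keras during training (the metric is logged as `val_acc` or `val_prob_acc`, hence the pattern). If you adapt the training code and have direct access to the Keras `History` object, the equivalent would be:

```python
def best_val_accuracy(history):
    """Return the best validation accuracy from a Keras History object,
    checking both metric names matched by the grep pattern above."""
    for key in ('val_acc', 'val_prob_acc'):
        if key in history.history:
            return max(history.history[key])
    raise KeyError('no validation accuracy metric found in ' + str(list(history.history)))
```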
The following table lists the values for `--sgdr_max_lr` that led to the best results.
Loss | CUB | NAB | Cars | Flowers | MIT 67 Scenes | CIFAR-100 |
---|---|---|---|---|---|---|
cross entropy | 0.05 | 0.05 | 1.0 | 1.0 | 0.05 | 0.1 |
cross entropy + label smoothing | 0.05 | 0.1 | 1.0 | 0.1 | 1.0 | 0.1 |
cosine loss (one-hot) | 0.5 | 0.5 | 1.0 | 0.5 | 2.5 | 0.05 |
cosine loss + cross entropy (one-hot) | 0.5 | 0.5 | 0.5 | 0.5 | 2.5 | 0.1 |
To experiment with differently sized variants of the CUB dataset, download the modified image list files and unzip the obtained archive into the root directory of your CUB dataset. For training, specify the dataset name as `CUB-subX`, where `X` is the number of samples per class.
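If you want to create additional subset sizes yourself instead of using the provided archive, a subset list can in principle be generated by randomly keeping X training images per class. The sketch below assumes a plain text list with one whitespace-separated `label image` pair per line, which may not match the exact list format expected by the data loaders in this repository:

```python
import random
from collections import defaultdict

def subsample_image_list(src_file, dst_file, samples_per_class, seed=0):
    """Randomly keep at most `samples_per_class` lines per class label.

    Assumes one whitespace-separated 'label image' pair per line; the
    actual list format used by this repository may differ.
    """
    per_class = defaultdict(list)
    with open(src_file) as f:
        for line in f:
            if line.strip():
                per_class[line.split()[0]].append(line)
    rng = random.Random(seed)
    with open(dst_file, 'w') as f:
        for label in sorted(per_class):
            lines = per_class[label]
            for line in rng.sample(lines, min(samples_per_class, len(lines))):
                f.write(line)
```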