GluonCV 0.7.0 Release
Highlights
GluonCV 0.7 adds our latest backbone network, ResNeSt, along with the derived models for semantic segmentation and object detection. We achieve significant performance improvements on all three tasks.
Image Classification
GluonCV now provides state-of-the-art image classification backbones that can be used by various downstream tasks. Our ResNeSt outperforms EfficientNet in the accuracy-speed trade-off, as shown in the following figures. You can now swap our new ResNeSt into your research or product to get an immediate performance improvement. Check out the details in our paper: ResNeSt: Split-Attention Networks
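As a rough illustration of swapping in the new backbone, the sketch below loads a ResNeSt classifier through the GluonCV model zoo. It assumes GluonCV 0.7 and MXNet are installed and that the model-zoo names follow the `resnest50` / `resnest101` / `resnest200` / `resnest269` pattern. The names `RESNEST_INPUT_SIZE`, `load_resnest`, and `forward_dummy_batch` are hypothetical helpers invented for this example, and the input resolutions are the ones listed in the comparison table in this section.

```python
# Input resolution per variant, copied from the comparison table in these notes.
# The keys are the assumed GluonCV model-zoo names.
RESNEST_INPUT_SIZE = {
    "resnest50": 224,
    "resnest101": 256,
    "resnest200": 320,
    "resnest269": 416,
}


def load_resnest(name="resnest50", pretrained=True):
    """Return a ResNeSt network from the GluonCV model zoo (hypothetical helper)."""
    if name not in RESNEST_INPUT_SIZE:
        raise ValueError(f"unknown ResNeSt variant: {name}")
    # Imported lazily so the lookup table above is usable without MXNet installed.
    from gluoncv import model_zoo
    return model_zoo.get_model(name, pretrained=pretrained)


def forward_dummy_batch(name="resnest50"):
    """Run one random NCHW image through the network and return the output shape."""
    import mxnet as mx
    net = load_resnest(name)
    size = RESNEST_INPUT_SIZE[name]
    x = mx.nd.random.uniform(shape=(1, 3, size, size))
    return net(x).shape
```

Passing `pretrained=True` downloads weights on first use, so `forward_dummy_batch` is meant as a quick smoke test rather than a benchmark.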
The table below compares the new ResNeSt models with SENet_154 from the previous release. The average latency is measured on a single V100 GPU of a p3dn.24xlarge machine with a batch size of 16.
Model | Input size | Top-1 acc (%) | Avg latency (ms) | Release |
---|---|---|---|---|
SENet_154 | 224x224 | 81.26 | 5.07 | previous |
ResNeSt50 | 224x224 | 81.13 | 1.78 | v0.7 |
ResNeSt101 | 256x256 | 82.83 | 3.43 | v0.7 |
ResNeSt200 | 320x320 | 83.90 | 9.49 | v0.7 |
ResNeSt269 | 416x416 | 84.54 | 19.50 | v0.7 |
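To make the accuracy-speed trade-off in the table easier to read, here is a small sketch that expresses each ResNeSt row relative to SENet_154: the top-1 difference in percentage points and the latency ratio (values above 1 mean faster than the baseline). The numbers are copied from the table; the names `ROWS` and `compare_to_baseline` are made up for this example.

```python
# (model, input size, top-1 accuracy %, average latency ms), copied from the table above.
ROWS = [
    ("SENet_154", 224, 81.26, 5.07),
    ("ResNeSt50", 224, 81.13, 1.78),
    ("ResNeSt101", 256, 82.83, 3.43),
    ("ResNeSt200", 320, 83.90, 9.49),
    ("ResNeSt269", 416, 84.54, 19.50),
]


def compare_to_baseline(rows, baseline="SENet_154"):
    """Return {model: (top-1 delta in points, baseline latency / model latency)}."""
    _, _, base_acc, base_ms = next(r for r in rows if r[0] == baseline)
    result = {}
    for name, _, acc, ms in rows:
        if name == baseline:
            continue
        result[name] = (round(acc - base_acc, 2), round(base_ms / ms, 2))
    return result


for model, (delta, speedup) in compare_to_baseline(ROWS).items():
    print(f"{model}: {delta:+.2f} top-1 points, {speedup:.2f}x baseline speed")
```

Read this way, ResNeSt50 gives up 0.13 points of top-1 accuracy for roughly 2.85x lower latency, while ResNeSt101 is both more accurate and faster than the baseline. Note that the rows use different input sizes, so this is a comparison of the configurations as listed, not of architectures at equal resolution.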
Object Detection
We add two new ResNeSt-based Faster R-CNN models. Note that our models are trained using a 2x learning rate schedule instead of the 1x schedule used in our paper. Our two new models are 2-4 points higher on COCO mAP than our previous best model, “faster_rcnn_fpn_resnet101_v1d_coco”. Notably, even our ResNeSt-50 based model has a 1.9 point higher mAP than our previous ResNet-101 based model, and the ResNeSt-101 based model is 4.1 points higher.
Model | Backbone | COCO mAP (%) | Release |
---|---|---|---|
Faster R-CNN | ResNet-101 | 40.8 | previous |
Faster R-CNN | ResNeSt-50 | 42.7 | v0.7 |
Faster R-CNN | ResNeSt-101 | 44.9 | v0.7 |
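The gains quoted for detection are differences in absolute mAP (percentage points), not relative improvements. The sketch below recomputes both from the table so the two readings are not confused; `MAP_BY_BACKBONE` and `map_gain` are names invented for this example.

```python
# COCO mAP per backbone, copied from the table above.
MAP_BY_BACKBONE = {
    "ResNet-101": 40.8,   # previous best
    "ResNeSt-50": 42.7,
    "ResNeSt-101": 44.9,
}


def map_gain(backbone, baseline="ResNet-101"):
    """Return (absolute gain in mAP points, relative gain in percent)."""
    base = MAP_BY_BACKBONE[baseline]
    new = MAP_BY_BACKBONE[backbone]
    absolute = round(new - base, 1)
    relative = round(100.0 * (new - base) / base, 1)
    return absolute, relative


for name in ("ResNeSt-50", "ResNeSt-101"):
    absolute, relative = map_gain(name)
    print(f"{name}: +{absolute} mAP points ({relative}% relative)")
```

With the table values, ResNeSt-50 gains 1.9 points and ResNeSt-101 gains 4.1 points over the ResNet-101 baseline.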
Semantic Segmentation
We add ResNeSt-50 and ResNeSt-101 based DeepLabV3 models for the semantic segmentation task on the ADE20K dataset. Our new models are 1-2.8 points higher in mIoU than our previous best. As with our detection results, the ResNeSt-50 based model performs better than the ResNet-101 based model. DeepLabV3 with a ResNeSt-101 backbone achieves a new state-of-the-art 46.9 mIoU on the ADE20K validation set, which outperforms the previous best by more than 1%.
Model | Backbone | Pixel accuracy (%) | mIoU (%) | Release |
---|---|---|---|---|
DeepLabV3 | ResNet-101 | 81.1 | 44.1 | previous |
DeepLabV3 | ResNeSt-50 | 81.2 | 45.1 | v0.7 |
DeepLabV3 | ResNeSt-101 | 82.1 | 46.9 | v0.7 |
Bug fixes and Improvements
- Instructions for achieving 25.7 min Mask R-CNN training.
- Fix R-CNNs export