awesome-computer-vision-models

A list of popular deep learning models for image classification, segmentation, and detection, together with the papers that introduced them and their reported benchmark results.

Papers

Classification models

AlexNet ('One weird trick for parallelizing convolutional neural networks') [2014]
VGG/BN-VGG ('Very Deep Convolutional Networks for Large-Scale Image Recognition') [2014]
ResNet ('Deep Residual Learning for Image Recognition') [2015]
InceptionV3 ('Rethinking the Inception Architecture for Computer Vision') [2015]
PreResNet ('Identity Mappings in Deep Residual Networks') [2016]
DenseNet ('Densely Connected Convolutional Networks') [2016]
PyramidNet ('Deep Pyramidal Residual Networks') [2016]
ResNeXt ('Aggregated Residual Transformations for Deep Neural Networks') [2016]
WRN ('Wide Residual Networks') [2016]
Xception ('Xception: Deep Learning with Depthwise Separable Convolutions') [2016]
InceptionV4/InceptionResNetV2 ('Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning') [2016]
PolyNet ('PolyNet: A Pursuit of Structural Diversity in Very Deep Networks') [2016]
DarkNet ('Darknet: Open source neural networks in C') [2016?]
ResAttNet ('Residual Attention Network for Image Classification') [2017]
CondenseNet ('CondenseNet: An Efficient DenseNet using Learned Group Convolutions') [2017]
DRN-C/DRN-D ('Dilated Residual Networks') [2017]
DPN ('Dual Path Networks') [2017]
ShuffleNet ('ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices') [2017]
DiracNetV2 ('DiracNets: Training Very Deep Neural Networks Without Skip-Connections') [2017]
SENet/SE-ResNet/SE-PreResNet/SE-ResNeXt ('Squeeze-and-Excitation Networks') [2017]
MobileNet ('MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications') [2017]
NASNet ('Learning Transferable Architectures for Scalable Image Recognition') [2017]
AirNet/AirNeXt ('Attention Inspiring Receptive-Fields Network for Learning Invariant Representations') [2018]
BAM-ResNet ('BAM: Bottleneck Attention Module') [2018]
CBAM-ResNet ('CBAM: Convolutional Block Attention Module') [2018]
SqueezeNet/SqueezeResNet ('SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size') [2016]
SqueezeNext ('SqueezeNext: Hardware-Aware Neural Network Design') [2018]
ShuffleNetV2 ('ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design') [2018]
MENet ('Merging and Evolution: Improving Convolutional Neural Networks for Mobile Applications') [2018]
FD-MobileNet ('FD-MobileNet: Improved MobileNet with A Fast Downsampling Strategy') [2018]
MobileNetV2 ('MobileNetV2: Inverted Residuals and Linear Bottlenecks') [2018]
IGCV3 ('IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks') [2018]
DARTS ('DARTS: Differentiable Architecture Search') [2018]
PNASNet ('Progressive Neural Architecture Search') [2018]
Amoeba ('Regularized Evolution for Image Classifier Architecture Search') [2018]
MnasNet ('MnasNet: Platform-Aware Neural Architecture Search for Mobile') [2018]
IBN-Net ('Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net') [2018]
MarginNet ('Large Margin Deep Networks for Classification') [2018]
A^2 Nets ('A^2-Nets: Double Attention Networks') [2018]
FishNet ('FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction') [2018]

Model	Number of parameters	Top-1 Error (%)	Top-5 Error (%)
AlexNet	61.1M	44.12	21.26
VGG-16	138.3M	26.78	8.69
ResNet-50	25.5M	23.50	6.87
Inception v3	23.8M	21.2	5.6
PreResNet-50	25.5M	23.39	6.68
DenseNet-121	7.9M	25.0	7.71
PyramidNet-200(a=300)	62.1M	19.5	4.8
PyramidNet-200(a=450)	116.4M	19.2	4.7
ResNeXt-101	83.5M	20.4	5.3
WRN-50-2-bottleneck	68.9M	21.9	6.03
Xception	?	21.0	5.5
Inception-ResNet-v2	55.9M	19.9	4.9
Inception-v4	42.6M	20.0	5.0
Very Deep PolyNet	?	18.71	4.25
DarkNet Ref	7.3M	38.09	16.71
Attention-92	51.3M	19.5	4.8
CondenseNet (G=C=8)	4.8M	26.2	8.3
DRN-A-50	25.6M	22.94	6.57
DPN-131	79.3M	18.55	4.16
ShuffleNet 2×(g=3)	?	26.3	?
DiracNet-34	21.8M	27.79	9.34
SENet-154	115.2M	18.84	4.65
MobileNet	4.2M	29.4	10.5
NASNet-A	5.3M	26.0	8.7
AirNet50-1x64d (r=2)	27.43M	22.48	6.21
BAM-ResNet-50	25.92M	23.68	6.96
CBAM-ResNet-50	28.1M	23.02	6.38
SqueezeResNet	1.23M	39.83	17.84
2.0-SqNxt-23v5	3.2M	32.56	11.8
ShuffleNet v2 2x SE	7.6M	24.6	?
456-MENet-24×1(g=3)	5.3M	28.4	9.8
FD-MobileNet 1x	2.9M	34.7	?
MobileNetV2	3.4M	28.0	?
IGCV3	3.5M	28.22	9.54
DARTS	4.9M	26.9	9.0
PNASNet-5	5.1M	25.8	8.1
AmoebaNet-C	5.1M	24.3	7.6
MnasNet-92 (+SE)	5.1M	23.87	7.15
IBN-Net50-a	?	22.54	6.32
MarginNet	?	22.0	?
A^2 Net	?	23.0	6.5
FishNeXt-150	26.2M	21.5	?
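
For readers unfamiliar with the Top-1 and Top-5 columns, the following is a minimal sketch of how a top-k error can be computed from per-class scores. The function name `top_k_error` and the sample scores are hypothetical and are not taken from any of the papers above; each paper's own evaluation protocol (crop size, single or multi-crop testing) may differ.

```python
def top_k_error(scores, labels, k=1):
    """Percentage of samples whose true label is NOT among the k highest scores."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and of equal length")
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return 100.0 * misses / len(labels)


# Hypothetical scores for 4 samples over 3 classes (illustration only).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 2, 2, 1]

top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
print(top1, top2)
```

With k=5 the same function gives a Top-5 error; k=2 is used here only because the toy example has three classes.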

Segmentation models

Semantic segmentation

U-Net ('U-Net: Convolutional Networks for Biomedical Image Segmentation') [2015]
DeconvNet ('Learning Deconvolution Network for Semantic Segmentation') [2015]
ParseNet ('ParseNet: Looking Wider to See Better') [2015]
Piecewise ('Efficient piecewise training of deep structured models for semantic segmentation') [2015]
SegNet ('SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation') [2016]
FCN ('Fully Convolutional Networks for Semantic Segmentation') [2016]
ENet ('ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation') [2016]
DilatedNet ('Multi-Scale Context Aggregation by Dilated Convolutions') [2016]
PixelNet ('PixelNet: Towards a General Pixel-Level Architecture') [2016]
RefineNet ('RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation') [2016]
LRR ('Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation') [2016]
FRRN ('Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes') [2016]
Semantic Segmentation using Adversarial Networks ('Semantic Segmentation using Adversarial Networks') [2016]
MultiNet ('MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving') [2016]
DeepLab ('DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs') [2017]
LinkNet ('LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation') [2017]
Tiramisu ('The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation') [2017]
ICNet ('ICNet for Real-Time Semantic Segmentation on High-Resolution Images') [2017]
ERFNet ('Efficient ConvNet for Real-time Semantic Segmentation') [2017]
PSPNet ('Pyramid Scene Parsing Network') [2017]
GCN ('Large Kernel Matters — Improve Semantic Segmentation by Global Convolutional Network') [2017]
Segaware ('Segmentation-Aware Convolutional Networks Using Local Attention Masks') [2017]
PixelDCN ('Pixel Deconvolutional Networks') [2017]
DeepLabv3 ('Rethinking Atrous Convolution for Semantic Image Segmentation') [2017]
DUC, HDC ('Understanding Convolution for Semantic Segmentation') [2018]
ShuffleSeg ('ShuffleSeg: Real-time Semantic Segmentation Network') [2018]
AdaptSegNet ('Learning to Adapt Structured Output Space for Semantic Segmentation') [2018]
TuSimple-DUC ('Understanding Convolution for Semantic Segmentation') [2018]
R2U-Net ('Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation') [2018]
Attention U-Net ('Attention U-Net: Learning Where to Look for the Pancreas') [2018]
DANet ('Dual Attention Network for Scene Segmentation') [2018]
ENCNet ('Context Encoding for Semantic Segmentation') [2018]
ShelfNet ('ShelfNet for Real-time Semantic Segmentation') [2018]
LadderNet ('LadderNet: Multi-Path Networks Based on U-Net for Medical Image Segmentation') [2018]
CCC ('Concentrated-Comprehensive Convolutions for lightweight semantic segmentation') [2018]
DifNet ('DifNet: Semantic Segmentation by Diffusion Networks') [2018]
BiSeNet ('BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation') [2018]
ESPNet ('ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation') [2018]

Model	PASCAL-Context (mIOU)	Cityscapes (mIOU)	PASCAL VOC 2012 (mIOU)	COCO Stuff (mIOU)	ADE20K VAL (mIOU)
U-Net	?	?	?	?	?
DeconvNet	?	?	72.5	?	?
ParseNet	40.4	?	69.8	?	?
Piecewise	43.3	71.6	78.0	?	?
SegNet	?	56.1	?	?	?
FCN	37.8	65.3	62.2	22.7	29.39
ENet	?	58.3	?	?	?
DilatedNet	?	?	67.6	?	32.31
PixelNet	?	?	69.8	?	?
RefineNet	47.3	73.6	83.4	33.6	40.70
LRR	?	71.8	79.3	?	?
FRRN	?	71.8	?	?	?
MultiNet	?	?	?	?	?
DeepLab	45.7	64.8	79.7	?	?
LinkNet	?	?	?	?	?
Tiramisu	?	?	?	?	?
ICNet	?	70.6	?	?	?
ERFNet	?	68.0	?	?	?
PSPNet	47.8	80.2	85.4	?	44.94
GCN	?	76.9	82.2	?	?
Segaware	?	?	69.0	?	?
PixelDCN	?	?	73.0	?	?
DeepLabv3	?	?	85.7	?	?
DUC, HDC	?	77.1	?	?	?
ShuffleSeg	?	59.3	?	?	?
AdaptSegNet	?	46.7	?	?	?
TuSimple-DUC	80.1	?	83.1	?	?
R2U-Net	?	?	?	?	?
Attention U-Net	?	?	?	?	?
DANet	52.6	81.5	?	39.7	?
ENCNet	51.7	75.8	85.9	?	44.65
ShelfNet	48.4	75.8	84.2	?	?
LadderNet	?	?	?	?	?
CCC-ERFnet	?	69.01	?	?	?
DifNet-101	45.1	?	73.2	?	?
BiSeNet(Res18)	?	?	74.7	28.1	?
ESPNet	?	?	63.01	?	?
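
The mIOU columns report mean intersection-over-union. As a rough illustration, here is a minimal sketch that computes it for a single pair of flattened label maps. The function name `mean_iou` and the toy label maps are hypothetical, and the sketch assumes classes absent from both prediction and ground truth are skipped; benchmark toolkits usually accumulate the counts over the whole dataset and may treat ignored labels differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU in percent over the classes that occur in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels where both maps say class c (intersection) or either does (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # assumption: skip classes absent from both maps
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class from range(num_classes) occurs in the inputs")
    return 100.0 * sum(ious) / len(ious)


# Hypothetical flattened label maps with 3 classes (illustration only).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]

miou = mean_iou(pred, target, num_classes=3)
print(round(miou, 2))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 50%.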

Detection models

[R-CNN] Rich feature hierarchies for accurate object detection and semantic segmentation | Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik | [CVPR' 14] |[pdf] [official code - caffe] [2014]
[OverFeat] OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks | Pierre Sermanet, et al. | [ICLR' 14] |[pdf] [official code - torch] [2014]
[MultiBox] Scalable Object Detection using Deep Neural Networks | Dumitru Erhan, et al. | [CVPR' 14] |[pdf] [2014]
[SPP-Net] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition | Kaiming He, et al. | [ECCV' 14] |[pdf] [official code - caffe] [unofficial code - keras] [unofficial code - tensorflow] [2014]
[MR-CNN] Object detection via a multi-region & semantic segmentation-aware CNN model | Spyros Gidaris, Nikos Komodakis | [ICCV' 15] |[pdf] [official code - caffe] [2015]
[DeepBox] DeepBox: Learning Objectness with Convolutional Networks | Weicheng Kuo, Bharath Hariharan, Jitendra Malik | [ICCV' 15] |[pdf] [official code - caffe] [2015]
[AttentionNet] AttentionNet: Aggregating Weak Directions for Accurate Object Detection | Donggeun Yoo, et al. | [ICCV' 15] |[pdf] [2015]
[Fast R-CNN] Fast R-CNN | Ross Girshick | [ICCV' 15] |[pdf] [official code - caffe] [2015]
[DeepProposal] DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers | Amir Ghodrati, et al. | [ICCV' 15] |[pdf] [official code - matconvnet] [2015]
[Faster R-CNN, RPN] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | Shaoqing Ren, et al. | [NIPS' 15] |[pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] [2015]
[YOLO v1] You Only Look Once: Unified, Real-Time Object Detection | Joseph Redmon, et al. | [CVPR' 16] |[pdf] [official code - c] [2016]
[G-CNN] G-CNN: an Iterative Grid Based Object Detector | Mahyar Najibi, et al. | [CVPR' 16] |[pdf] [2016]
[AZNet] Adaptive Object Detection Using Adjacency and Zoom Prediction | Yongxi Lu, Tara Javidi. | [CVPR' 16] |[pdf] [2016]
[ION] Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks | Sean Bell, et al. | [CVPR' 16] |[pdf] [2016]
[HyperNet] HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection | Tao Kong, et al. | [CVPR' 16] |[pdf] [2016]
[OHEM] Training Region-based Object Detectors with Online Hard Example Mining | Abhinav Shrivastava, et al. | [CVPR' 16] |[pdf] [official code - caffe] [2016]
[CRAFT] CRAFT Objects from Images | Bin Yang, et al. | [CVPR' 16] |[pdf] [official code - caffe] [2016]
[MPN] A MultiPath Network for Object Detection | Sergey Zagoruyko, et al. | [BMVC' 16] |[pdf] [official code - torch] [2016]
[SSD] SSD: Single Shot MultiBox Detector | Wei Liu, et al. | [ECCV' 16] |[pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] [2016]
[GBDNet] Crafting GBD-Net for Object Detection | Xingyu Zeng, et al. | [ECCV' 16] |[pdf] [official code - caffe] [2016]
[CPF] Contextual Priming and Feedback for Faster R-CNN | Abhinav Shrivastava and Abhinav Gupta | [ECCV' 16] |[pdf] [2016]
[MS-CNN] A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection | Zhaowei Cai, et al. | [ECCV' 16] |[pdf] [official code - caffe] [2016]
[R-FCN] R-FCN: Object Detection via Region-based Fully Convolutional Networks | Jifeng Dai, et al. | [NIPS' 16] |[pdf] [official code - caffe] [unofficial code - caffe] [2016]
[PVANET] PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection | Kye-Hyeon Kim, et al. | [NIPSW' 16] |[pdf] [official code - caffe] [2016]
[DeepID-Net] DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection | Wanli Ouyang, et al. | [PAMI' 16] |[pdf] [2016]
[NoC] Object Detection Networks on Convolutional Feature Maps | Shaoqing Ren, et al. | [TPAMI' 16] |[pdf] [2016]
[DSSD] DSSD: Deconvolutional Single Shot Detector | Cheng-Yang Fu, et al. | [arXiv' 17] |[pdf] [official code - caffe] [2017]
[TDM] Beyond Skip Connections: Top-Down Modulation for Object Detection | Abhinav Shrivastava, et al. | [CVPR' 17] |[pdf] [2017]
[FPN] Feature Pyramid Networks for Object Detection | Tsung-Yi Lin, et al. | [CVPR' 17] |[pdf] [unofficial code - caffe] [2017]
[YOLO v2] YOLO9000: Better, Faster, Stronger | Joseph Redmon, Ali Farhadi | [CVPR' 17] |[pdf] [official code - c] [unofficial code - caffe] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch] [2017]
[RON] RON: Reverse Connection with Objectness Prior Networks for Object Detection | Tao Kong, et al. | [CVPR' 17] |[pdf] [official code - caffe] [unofficial code - tensorflow] [2017]
[DCN] Deformable Convolutional Networks | Jifeng Dai, et al. | [ICCV' 17] |[pdf] [official code - mxnet] [unofficial code - tensorflow] [unofficial code - pytorch] [2017]
[DeNet] DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling | Lachlan Tychsen-Smith, Lars Petersson | [ICCV' 17] |[pdf] [official code - theano] [2017]
[CoupleNet] CoupleNet: Coupling Global Structure with Local Parts for Object Detection | Yousong Zhu, et al. | [ICCV' 17] |[pdf] [official code - caffe] [2017]
[RetinaNet] Focal Loss for Dense Object Detection | Tsung-Yi Lin, et al. | [ICCV' 17] |[pdf] [official code - keras] [unofficial code - pytorch] [unofficial code - mxnet] [unofficial code - tensorflow] [2017]
[Mask R-CNN] Mask R-CNN | Kaiming He, et al. | [ICCV' 17] |[pdf] [official code - caffe2] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch] [2017]
[DSOD] DSOD: Learning Deeply Supervised Object Detectors from Scratch | Zhiqiang Shen, et al. | [ICCV' 17] |[pdf] [official code - caffe] [unofficial code - pytorch] [2017]
[SMN] Spatial Memory for Context Reasoning in Object Detection | Xinlei Chen, Abhinav Gupta | [ICCV' 17] |[pdf] [2017]
[YOLO v3] YOLOv3: An Incremental Improvement | Joseph Redmon, Ali Farhadi | [arXiv' 18] |[pdf] [official code - c] [unofficial code - pytorch] [unofficial code - pytorch] [unofficial code - keras] [unofficial code - tensorflow] [2018]
[ZIP] Zoom Out-and-In Network with Recursive Training for Object Proposal | Hongyang Li, et al. | [IJCV' 18] |[pdf] [official code - caffe] [2018]
[SIN] Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships | Yong Liu, et al. | [CVPR' 18] |[pdf] [official code - tensorflow] [2018]
[STDN] Scale-Transferrable Object Detection | Peng Zhou, et al. | [CVPR' 18] |[pdf] [2018]
[RefineDet] Single-Shot Refinement Neural Network for Object Detection | Shifeng Zhang, et al. | [CVPR' 18] |[pdf] [official code - caffe] [unofficial code - chainer] [unofficial code - pytorch] [2018]
[MegDet] MegDet: A Large Mini-Batch Object Detector | Chao Peng, et al. | [CVPR' 18] |[pdf] [2018]
[DA Faster R-CNN] Domain Adaptive Faster R-CNN for Object Detection in the Wild | Yuhua Chen, et al. | [CVPR' 18] |[pdf] [official code - caffe] [2018]
[SNIP] An Analysis of Scale Invariance in Object Detection – SNIP | Bharat Singh, Larry S. Davis | [CVPR' 18] |[pdf] [2018]
[Relation-Network] Relation Networks for Object Detection | Han Hu, et al. | [CVPR' 18] |[pdf] [official code - mxnet] [2018]
[Cascade R-CNN] Cascade R-CNN: Delving into High Quality Object Detection | Zhaowei Cai, et al. | [CVPR' 18] |[pdf] [official code - caffe] [2018]
Finding Tiny Faces in the Wild with Generative Adversarial Network | Yancheng Bai, et al. | [CVPR' 18] |[pdf] [2018]
[STDnet] STDnet: A ConvNet for Small Target Detection | Brais Bosquet, et al. | [BMVC' 18] |[pdf] [2018]
[RFBNet] Receptive Field Block Net for Accurate and Fast Object Detection | Songtao Liu, et al. | [ECCV' 18] |[pdf] [official code - pytorch] [2018]
Zero-Annotation Object Detection with Web Knowledge Transfer | Qingyi Tao, et al. | [ECCV' 18] |[pdf] [2018]
[CornerNet] CornerNet: Detecting Objects as Paired Keypoints | Hei Law, et al. | [ECCV' 18] |[pdf] [official code - pytorch] [2018]
[Pelee] Pelee: A Real-Time Object Detection System on Mobile Devices | Jun Wang, et al. | [NIPS' 18] |[pdf] [official code - caffe] [2018]
[HKRM] Hybrid Knowledge Routed Modules for Large-scale Object Detection | ChenHan Jiang, et al. | [NIPS' 18] |[pdf] [2018]
[MetaAnchor] MetaAnchor: Learning to Detect Objects with Customized Anchors | Tong Yang, et al. | [NIPS' 18] |[pdf] [2018]
[M2Det] M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network | Qijie Zhao, et al. | [AAAI' 19] |[pdf] [2018]

Detector	VOC07 (mAP@IoU=0.5)	VOC12 (mAP@IoU=0.5)	COCO (mAP)
R-CNN	58.5	-	-
OverFeat	-	-	-
MultiBox	29.0	-	-
SPP-Net	59.2	-	-
MR-CNN	78.2	73.9	-
AttentionNet	-	-	-
Fast R-CNN	70.0	68.4	-
Faster R-CNN	73.2	70.4	36.8
YOLO v1	66.4	57.9	-
G-CNN	66.8	66.4	-
AZNet	70.4	-	22.3
ION	80.1	77.9	33.1
HyperNet	76.3	71.4	-
OHEM	78.9	76.3	22.4
MPN	-	-	33.2
SSD	76.8	74.9	31.2
GBDNet	77.2	-	27.0
CPF	76.4	72.6	-
MS-CNN	-	-	-
R-FCN	79.5	77.6	29.9
PVANET	-	-	-
DeepID-Net	69.0	-	-
NoC	71.6	68.8	27.2
DSSD	81.5	80.0	-
TDM	-	-	37.3
FPN	-	-	36.2
YOLO v2	78.6	73.4	21.6
RON	77.6	75.4	-
DCN	-	-	-
DeNet	77.1	73.9	33.8
CoupleNet	82.7	80.4	34.4
RetinaNet	-	-	39.1
Mask R-CNN	-	-	39.8
DSOD	77.7	76.3	-
SMN	70.0	-	-
YOLO v3	-	-	33.0
SIN	76.0	73.1	23.2
STDN	80.9	-	-
RefineDet	83.8	83.5	41.8
MegDet	-	-	-
RFBNet	82.2	-	-
CornerNet	-	-	42.1
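
The VOC columns use an IoU threshold of 0.5 to decide whether a detection matches a ground-truth box. The following is a minimal sketch of that overlap test only. The function name `box_iou` and the boxes are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates, and a full mAP computation (ranking detections by confidence and averaging precision over recall) is not shown. The COCO metric additionally averages over several IoU thresholds.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth box and two detections (illustration only).
ground_truth = (0, 0, 10, 10)
close_detection = (0, 0, 10, 8)    # covers most of the ground truth
loose_detection = (5, 5, 15, 15)   # overlaps only one corner

IOU_THRESHOLD = 0.5  # threshold named in the table header
for det in (close_detection, loose_detection):
    iou = box_iou(ground_truth, det)
    # Assumption: ">=" is used here; toolkits differ on the boundary case.
    print(det, round(iou, 3), iou >= IOU_THRESHOLD)
```

Here the first detection has an IoU of 0.8 and would count as a match, while the second has an IoU of about 0.143 and would not.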
