Benchmark
Lianmin Zheng edited this page Oct 24, 2018 · 24 revisions
This page contains the benchmark results for several popular image classification models. We auto-tune all listed models on target platforms and benchmark the inference performance (time cost per image).
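
To make "time cost per image" concrete, here is a minimal timing sketch in plain Python. It is not the script used to produce the numbers on this page; `benchmark_ms`, its parameters, and `fake_inference` are hypothetical names chosen for illustration, and a real run would call a compiled model instead of the placeholder workload.

```python
import time
from statistics import mean


def benchmark_ms(run_once, warmup=3, repeat=3, number=10):
    """Return `repeat` latency samples, each the mean time per call in ms.

    run_once: zero-argument callable that performs one batch_size = 1 inference.
    """
    # Warm-up calls are not timed, so one-off setup costs do not skew results.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            run_once()
        elapsed_s = time.perf_counter() - start
        samples.append(elapsed_s / number * 1000.0)  # seconds -> ms per call
    return samples


def fake_inference():
    # Placeholder workload standing in for a real model (assumption).
    sum(i * i for i in range(10_000))


samples = benchmark_ms(fake_inference)
print(f"mean latency: {mean(samples):.3f} ms over {len(samples)} samples")
```

Averaging over several calls per sample and discarding warm-up runs is a common way to reduce timer noise; the specific counts above are arbitrary defaults.
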
- Results
- Links
Note: If a board has a big.LITTLE architecture, we use only the big cores. Otherwise, we use all cores. The device specifications below list only the cores that are used.
- Firefly-RK3399: 2 x Cortex-A72 1.8GHz
- Raspberry Pi 3B: 4 x Cortex-A53 1.2GHz
- Huawei P20 Pro / Mate10 Pro (SoC: HiSilicon Kirin 970): 4 x Cortex-A73 2.36GHz
- Google Pixel 2 (SoC: Qualcomm Snapdragon 835): 4 x Kryo 2.35GHz
- PYNQ: 2 x Cortex-A9 650MHz
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|---|---|
Raspberry Pi 3B | 610.2 | 2074.2 | 121.8 | 104.8 | 320.0 | 726.0 | 185.1 | 94.0 | 1772.0 | 2119.8 |
Firefly RK3399 | 336.8 | 1304.4 | 77.9 | 64.8 | 158.6 | 403.2 | 94.3 | 48.2 | 903.5 | 1086.0 |
Huawei P20 Pro | 179.7 | 444.7 | 41.3 | 33.4 | 77.4 | 232.5 | 51.4 | 26.0 | 486.3 | 729.4 |
Google Pixel 2 | 161.0 | 434.8 | 39.6 | 29.3 | 66.0 | 181.1 | 47.3 | 23.0 | 397.1 | 485.0 |
Xilinx PYNQ | 2887.0 | 9691.7 | 721.4 | 513.3 | 1231.7 | 3585.5 | 913.0 | 478.3 | -1.0 | -1.0 |
- Mali-T860 MP4: on Firefly-RK3399, with its frequency locked to 800MHz.
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|---|---|
Mali-T860 | 410.6 | 784.7 | 79.5 | 77.7 | 127.3 | 354.7 | 111.0 | 62.5 | 673.2 | 792.1 |
- dtype = float16, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|---|---|
Mali-T860 | 295.4 | 464.9 | 52.9 | 60.7 | 84.3 | 221.0 | 77.3 | 46.7 | 405.6 | 472.8 |
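
The two Mali-T860 rows above can be compared directly. The short sketch below copies the float32 and float16 latencies from the tables and computes the per-model speedup (float32 time divided by float16 time); the function and variable names are illustrative only.

```python
MODELS = [
    "densenet-121", "inception-v3", "mobilenet", "mobilenet-v2", "resnet-18",
    "resnet-50", "squeezenet-v1.0", "squeezenet-v1.1", "vgg-16", "vgg-19",
]
# Latencies in ms, copied from the Mali-T860 tables above (batch_size = 1).
FLOAT32_MS = [410.6, 784.7, 79.5, 77.7, 127.3, 354.7, 111.0, 62.5, 673.2, 792.1]
FLOAT16_MS = [295.4, 464.9, 52.9, 60.7, 84.3, 221.0, 77.3, 46.7, 405.6, 472.8]


def speedups(baseline_ms, improved_ms):
    """Return baseline / improved for each pair of latencies."""
    return [b / i for b, i in zip(baseline_ms, improved_ms)]


for model, ratio in zip(MODELS, speedups(FLOAT32_MS, FLOAT16_MS)):
    print(f"{model:16s} {ratio:.2f}x")
```

On these numbers, float16 is faster than float32 for every listed model, by a factor between roughly 1.3x and 1.7x.
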
- Jetson TX2: in Max-N mode, 1.3GHz
- GTX 1080 Ti, GTX TITAN X
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|
GTX 1080 Ti | 3.6 | 5.8 | 0.7 | 1.0 | 1.1 | 2.8 | 4.2 | 4.8 |
GTX TITAN X | 5.8 | 9.9 | 1.0 | 1.6 | 1.6 | 4.3 | 6.3 | 7.4 |
Jetson TX2 | 26.8 | 45.7 | 5.2 | 8.8 | 9.6 | 26.2 | 58.2 | 68.8 |
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | resnet-18 | resnet-50 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|
Vega FE | 5.8 | 8.9 | 1.0 | 1.6 | 4.5 | 6.3 | 7.2 |
See the README at https://github.com/dmlc/tvm/tree/master/apps/benchmark for how to reproduce these numbers.