Lianmin Zheng edited this page Aug 22, 2018 · 24 revisions

ARM CPU

Note: If a board has a big.LITTLE architecture, we use all big cores; otherwise, we use all cores. The device specifications below list only the cores being used. All benchmarks use dtype=float32 and batch_size=1.
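The tables below report a mean inference time with its standard deviation. TVM's benchmark scripts compute these via the runtime's time evaluator, but the statistic itself is simple: run the model repeatedly, average batches of runs, and take the spread across batches. The following is a minimal, self-contained Python sketch of that measurement scheme (the `benchmark` helper and the dummy workload are illustrative, not part of TVM):

```python
import time
import statistics

def benchmark(run, number=100, repeat=3, warmup=5):
    """Time `run` and report (mean, std dev) in milliseconds,
    mirroring the two columns in the tables below.

    number: runs averaged into one sample; repeat: number of samples.
    """
    for _ in range(warmup):
        run()  # discard warm-up runs (caches, frequency scaling, etc.)
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            run()
        # average time per run for this batch, converted to ms
        samples.append((time.perf_counter() - start) / number * 1e3)
    return statistics.mean(samples), statistics.stdev(samples)

# Dummy workload standing in for a model's inference call:
mean_ms, std_ms = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"{mean_ms:.2f} ms ({std_ms:.2f} ms)")
```

Averaging over `number` inner runs keeps timer overhead negligible for fast models, while `repeat` outer samples give the standard deviation.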

  • Firefly-RK3399 : 2 x Cortex-A72 1.8 GHz
--------------------------------------------------
Network Name         Mean Inference Time (std dev)
--------------------------------------------------
squeezenet v1.1      48.87 ms            (1.07 ms)
mobilenet            82.16 ms            (0.09 ms)
resnet-18            162.55 ms           (0.14 ms)
vgg-16               912.44 ms           (0.32 ms)
  • Raspberry Pi 3B : 4 x Cortex-A53 1.2 GHz
--------------------------------------------------
Network Name         Mean Inference Time (std dev)
--------------------------------------------------
squeezenet v1.1      92.34 ms            (0.07 ms)
mobilenet            145.22 ms           (0.11 ms)
resnet-18            325.06 ms           (0.23 ms)
vgg-16               crashed (out of memory)
  • Huawei P20 Pro / Mate 10 Pro (SoC: HiSilicon Kirin 970) : 4 x Cortex-A73 2.36 GHz
--------------------------------------------------
Network Name         Mean Inference Time (std dev)
--------------------------------------------------
squeezenet v1.1      27.53 ms            (1.14 ms)
mobilenet            46.53 ms            (0.31 ms)
resnet-18            76.74 ms            (0.18 ms)
vgg-16               479.84 ms           (0.92 ms)
  • Google Pixel 2 (SoC: Qualcomm Snapdragon 835) : 4 x Kryo 2.35 GHz
--------------------------------------------------
Network Name         Mean Inference Time (std dev)
--------------------------------------------------
squeezenet v1.1      23.57 ms            (0.42 ms)
mobilenet            40.73 ms            (0.11 ms)
resnet-18            63.95 ms            (0.03 ms)
vgg-16               407.75 ms           (9.57 ms)
  • PYNQ : 2 x Cortex-A9 650 MHz
--------------------------------------------------
Network Name         Mean Inference Time (std dev)
--------------------------------------------------
squeezenet v1.1      452.40 ms           (0.09 ms)
mobilenet            772.16 ms           (0.25 ms)
resnet-18            1243.49 ms          (0.67 ms)
vgg-16               crashed (out of memory)

NVIDIA GPU

  • GTX 1080 Ti : dtype=float32 and batch_size=1

--------------------------------------------------
Network Name         Mean Inference Time (std dev)
--------------------------------------------------
resnet-50            2.95 ms             (0.01 ms)
mobilenet            0.63 ms             (0.04 ms)
vgg-19               4.83 ms             (0.04 ms)
inception_v3         6.17 ms             (0.01 ms)

Reproduce

See the README at https://github.com/dmlc/tvm/tree/master/apps/benchmark for instructions on reproducing these numbers.
