We added support for Amlogic NPU (Neural Processing Unit) acceleration in version v3.9.0. You'll be amazed to see UltimateALPR running at up to 64 fps at High Definition (HD/720p) resolution on a $99 ARM device (Khadas VIM3). The engine can run at up to 90 fps on low-resolution images.
This guide focuses on how to use UltimateALPR on the Khadas VIM3, but any SBC (Single Board Computer) with an Amlogic NPU will work fine (e.g. Banana Pi).
Your Khadas VIM3 will likely come with Android 9 installed on the eMMC. Unfortunately that's a 32-bit Android OS, which is not suitable for high-performance applications. You'll need to install a Linux AArch64 OS from the Khadas website: https://docs.khadas.com/linux/firmware/Vim3UbuntuFirmware.html. We're using version 4.9 (https://doubango.org/khadas_images/VIM3_Ubuntu-server-focal_Linux-4.9_arm64_SD-USB_V1.0.9-211217.img.xz) but any version should work. Please note that the Mainline Kernel images do not support the NPU, so make sure to install the right Linux version (see above).
You don't need to overwrite the Android OS on the eMMC: install the Linux OS on an external SD card instead. Your Khadas will choose the OS on the SD card at boot time. This is the safest way to test NPU acceleration on Linux without overwriting the OS on the eMMC. Once you're happy with the result, you can install the Linux OS on the eMMC, which is faster than the SD card (memory read/write). To make the boot loader choose Android (on the eMMC) again, just remove the SD card.
When we run `uname -a` on our device, we see:
Linux Khadas 4.9.241 #22 SMP PREEMPT Fri Dec 17 17:34:50 CST 2021 aarch64 aarch64 aarch64 GNU/Linux
We do not recommend upgrading your OS. More at https://groups.google.com/g/doubango-ai/c/Q8C6cZnObtU
To enable NPU acceleration:
- You'll need to set the JSON configuration entry `npu_enabled` to `true` (it's already set to `true` by default). This can be done with the command-line parameter `--npu_enabled true` when using the recognizer or the benchmark application.
- Your hardware name must be listed in supported_hardware.txt (case insensitive). If that's not the case, edit the file to add it. To find your hardware name, run:
cat /proc/cpuinfo | grep Hardware
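The hardware check above can be scripted. The sketch below is only an illustration: `hardware_name` and `is_supported` are hypothetical helper names, and it assumes that supported_hardware.txt holds one hardware name per line (the guide does not describe the file layout). Pass the path to wherever the file lives in your SDK copy.

```shell
# Hypothetical helpers, not part of the SDK.

# Print the "Hardware" value from a cpuinfo-style file (default: /proc/cpuinfo).
hardware_name() {
  sed -n 's/^Hardware[[:space:]]*:[[:space:]]*//p' "${1:-/proc/cpuinfo}" | head -n 1
}

# Succeed if name $1 appears as a whole line, ignoring case, in list file $2.
# Assumption: the list file contains one hardware name per line.
is_supported() {
  [ -n "$1" ] && grep -qixF -- "$1" "$2"
}

# Example (adjust the path to your supported_hardware.txt):
#   is_supported "$(hardware_name)" supported_hardware.txt \
#     && echo "listed" || echo "not listed: add it to the file"
```

If the file turns out to use a different layout, only the `grep` options in `is_supported` should need changing.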
We'll run the benchmark sample application on Khadas VIM3 to see how fast UltimateALPR is on that device. We'll run the benchmark with and without NPU acceleration to see the boost.
- make sure your device has enough power
- make sure your CPU isn't throttling or overheating:
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
- make sure that your CPU power management (scaling governor) is `performance` and not `powersave`:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
- make sure to unplug your device and let it cool down if your performance numbers aren't as good as what we're reporting here
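The governor check in the list above can be automated so that only the offending cores are reported. This is a minimal sketch, assuming the standard cpufreq sysfs layout shown in the command above; `check_governors` is a hypothetical helper name, and the optional argument exists only so the function can be pointed at another directory.

```shell
# Hypothetical helper, not part of the SDK.
# Print every core whose scaling governor is not "performance".
# Returns 1 if at least one such core is found, 0 otherwise
# (including when no cpufreq files exist at all).
check_governors() {
  base="${1:-/sys/devices/system/cpu}"
  status=0
  for f in "$base"/cpu*/cpufreq/scaling_governor; do
    [ -r "$f" ] || continue
    gov=$(cat "$f")
    if [ "$gov" != "performance" ]; then
      echo "$f: $gov"
      status=1
    fi
  done
  return "$status"
}

# Example:
#   check_governors || echo "switch the cores above to performance before benchmarking"
```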
The benchmark application was run on a Khadas VIM3 Basic edition (Linux 4.9) using a 720p (1280x720) image. This is a large image; you can try a smaller one to see how fast the engine can be. Notice how fast the engine is when parallel mode is enabled. Please note that parallel mode isn't available in Python; you'll have to use C++, Java, C# or any other language.
To run the benchmark application with 0.2 positive rate (20% of the images will have plates) for 100 loops:
cd ultimateALPR-SDK/binaries/linux/aarch64
chmod +x benchmark
LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./benchmark \
--positive ../../../assets/images/lic_us_1280x720.jpg \
--negative ../../../assets/images/london_traffic.jpg \
--assets ../../../assets \
--npu_enabled true \
--charset latin \
--loops 100 \
--rate 0.2 \
--parallel true
- Change `--npu_enabled true` to enable/disable NPU acceleration (`--npu_enabled false` disables it).
- Change `--parallel true` to enable/disable parallel mode. Use `--parallel false` to run in sequential mode instead of parallel mode.
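To reproduce every combination of NPU setting, mode and rate, the command above can be generated in a loop. The sketch below only prints the command lines, using exactly the flags shown above; `benchmark_cmd` and `sweep` are hypothetical helper names, and the five rates are the ones used in the results table.

```shell
# Hypothetical helpers, not part of the SDK.

# Print the benchmark command line for one configuration.
# $1: npu_enabled (true/false)  $2: parallel (true/false)  $3: positive rate
benchmark_cmd() {
  printf '%s' 'LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./benchmark'
  printf ' %s' \
    --positive ../../../assets/images/lic_us_1280x720.jpg \
    --negative ../../../assets/images/london_traffic.jpg \
    --assets ../../../assets \
    --npu_enabled "$1" \
    --charset latin \
    --loops 100 \
    --rate "$3" \
    --parallel "$2"
  printf '\n'
}

# Print all 20 configurations (2 NPU settings x 2 modes x 5 rates).
sweep() {
  for npu in true false; do
    for parallel in true false; do
      for rate in 0.0 0.2 0.5 0.7 1.0; do
        benchmark_cmd "$npu" "$parallel" "$rate"
      done
    done
  done
}

# Example: review the commands first, then run them from the binaries folder.
#   sweep
#   sweep | sh
```

Printing rather than executing keeps the sketch safe to try anywhere; piping the output to `sh` from the benchmark folder runs the configurations one after another.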
| | 0.0 rate | 0.2 rate | 0.5 rate | 0.7 rate | 1.0 rate |
|---|---|---|---|---|---|
| Khadas VIM3 Basic Linux 4.9, NPU, Parallel mode | 1560 millis 64.08 fps | 1797 millis 55.63 fps | 1876 millis 53.29 fps | 2162 millis 46.25 fps | 2902 millis 34.45 fps |
| Khadas VIM3 Basic Linux 4.9, NPU, Sequential mode | 1776 millis 56.30 fps | 3443 millis 29.04 fps | 6009 millis 16.63 fps | 7705 millis 12.97 fps | 10275 millis 9.73 fps |
| Khadas VIM3 Basic Linux 4.9, CPU, Parallel mode | 4187 millis 23.88 fps | 4414 millis 22.65 fps | 4824 millis 20.72 fps | 5189 millis 19.26 fps | 5740 millis 17.42 fps |
| Khadas VIM3 Basic Linux 4.9, CPU, Sequential mode | 4184 millis 23.89 fps | 5972 millis 16.74 fps | 8513 millis 11.74 fps | 10258 millis 9.74 fps | 12867 millis 7.77 fps |
- When parallel mode is enabled, detection runs on the NPU while OCR runs on the CPU, in parallel.
- Notice how, with the NPU enabled, parallel mode is about 3.5 times faster than sequential mode when rate=1.0 (all 100 images have plates): 34.45 fps versus 9.73 fps.
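Speedups can be read off the results table by dividing one fps figure by another. The sketch below does that arithmetic with `awk`; `speedup` is a hypothetical helper name, and the inputs are copied from the table.

```shell
# Hypothetical helper: ratio of two fps figures, rounded to two decimals.
speedup() {
  awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", fast / slow }'
}

speedup 34.45 9.73    # NPU, rate 1.0: parallel vs sequential -> 3.54
speedup 17.42 7.77    # CPU, rate 1.0: parallel vs sequential -> 2.24
speedup 64.08 23.88   # parallel, rate 0.0: NPU vs CPU        -> 2.68
```

The gain from parallel mode grows with the positive rate because sequential mode pays the OCR cost on every image that contains a plate, whereas parallel mode overlaps OCR with detection.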