- Getting started
- Benchmark
- Jetson Nano A01/B01 versus Raspberry Pi 4
- Pre-processing operations
- Coming next
- Known issues and possible fixes
- Technical questions
This document is about NVIDIA TensorRT in general but will focus on NVIDIA Jetson devices (TX1, TX2, Nano, Xavier AGX/NX...).
Starting with version 3.1.0, we support full GPGPU acceleration for NVIDIA Jetson devices using NVIDIA TensorRT (TF-TRT is no longer needed).
- The SDK was tested using JetPack 4.4.1 and JetPack 5.1.0 (the latest version from NVIDIA), and we will not provide technical support if you're using any other version.
- The binaries for Jetson are under binaries/jetson
As explained above, we use NVIDIA TensorRT to run the deep learning models on GPU.
- NVIDIA TensorRT is used for:
We require JetPack 4.4.1 or JetPack 5.1.0. As of today (February 20, 2023), version 5.1.0 is the latest one.
If you run apt-cache show nvidia-jetpack | grep "Version:", you'll see:
- Version: 4.4.1-b50, Version: 4.4-b186, Version: 4.4-b144 if you're using JetPack 4.4.1
- Version: 5.1-b147 if you're using JetPack 5.1.0
Supported devices (check https://developer.nvidia.com/embedded/jetpack for up-to-date info):
- JetPack 5.1.0: Jetson AGX Orin 32 GB production module, Jetson AGX Orin Developer Kit, Jetson Orin NX 16GB production module, Jetson AGX Xavier series, Jetson Xavier NX series modules, Jetson AGX Xavier Developer Kit and Jetson Xavier NX Developer Kit.
- JetPack 4.4.1: Jetson Nano, Jetson Xavier NX, Jetson TX1 and Jetson TX2.
Please note that this repo doesn't contain optimized TensorRT models and you won't be able to use the SDK unless you generate these models. More info about model optimization is available at https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html. Fortunately, we've made this task very easy by writing an optimizer in C++ using TensorRT.
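To give an idea of what such an optimizer looks like, here is a minimal sketch (assuming the TensorRT 8 C++ API shipped with JetPack 5.x) that parses an ONNX file and serializes the resulting engine to disk. It is not the SDK's actual optimizer; the model.onnx and model.plan file names are placeholders and error handling is omitted. This is roughly the work prepare.sh performs for each of the SDK's models.

```cpp
// Minimal sketch: build a TensorRT engine from an ONNX file and serialize it to disk.
// This is NOT the SDK's optimizer; file names and settings are illustrative only.
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <fstream>
#include <iostream>
#include <memory>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;
    auto builder = std::unique_ptr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(logger));
    // The ONNX parser requires an explicit-batch network.
    const auto flags = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(flags));
    auto parser = std::unique_ptr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, logger));
    if (!parser->parseFromFile("model.onnx", static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        return 1; // parsing failed
    }
    auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
    if (builder->platformHasFastFp16()) {
        config->setFlag(nvinfer1::BuilderFlag::kFP16); // allow FP16 kernels where beneficial
    }
    // Build and serialize the engine (a.k.a. plan). This is the slow, device-specific step.
    auto plan = std::unique_ptr<nvinfer1::IHostMemory>(builder->buildSerializedNetwork(*network, *config));
    std::ofstream out("model.plan", std::ios::binary);
    out.write(static_cast<const char*>(plan->data()), plan->size());
    return 0;
}
```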
The optimization process will write the optimized models (a.k.a. plans) to the local disk, which means we'll need write permission. We recommend running the next commands as root (#) instead of a normal user ($). To generate the optimized models:
- Navigate to the jetson binaries folder:
cd ultimateALPR-SDK/binaries/jetson/aarch64
- Generate the optimized models:
sudo chmod +x ./prepare.sh && sudo ./prepare.sh
This will build the models using the CUDA engine and serialize the optimized models into assets/models.tensorrt/optimized. Please note that this task will last several minutes; be patient. The next time you run it, it will be faster, as only the models that haven't been optimized yet will be generated. You can therefore interrupt the process and it will continue from where it stopped last time.
Models generated on a Jetson device with Compute Capability X and TensorRT version Y will only be usable on devices matching this configuration. For example, you won't be able to use models generated on a Jetson TX2 (Compute Capability 6.2) on a Jetson Nano (Compute Capability 5.3).
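If you need to check whether a given device matches the configuration used to generate a plan, you can query both values at runtime. The snippet below is a generic illustration (not part of the SDK) using the CUDA runtime API and the version macros from NvInferVersion.h.

```cpp
// Generic check (not part of the SDK): print the GPU's Compute Capability and the
// linked TensorRT version, the two values that must match for a serialized plan to load.
#include <NvInferVersion.h>   // NV_TENSORRT_MAJOR/MINOR/PATCH
#include <cuda_runtime_api.h>
#include <cstdio>

int main() {
    cudaDeviceProp prop{};
    if (cudaGetDeviceProperties(&prop, 0 /* device 0 */) != cudaSuccess) {
        std::printf("No CUDA device found\n");
        return 1;
    }
    std::printf("Compute Capability: %d.%d\n", prop.major, prop.minor); // e.g. 5.3 on Nano, 7.2 on Xavier NX
    std::printf("TensorRT version:   %d.%d.%d\n", NV_TENSORRT_MAJOR, NV_TENSORRT_MINOR, NV_TENSORRT_PATCH);
    return 0;
}
```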
Here are some benchmark numbers to compare the speed. For more information about the positive rate, please check https://www.doubango.org/SDKs/anpr/docs/Benchmark.html. The benchmark application is open source and can be found at samples/c++/benchmark.
Before running the benchmark application:
- For the Jetson Nano, make sure you're using a Barrel Jack (5V-4A) power supply instead of the microUSB port (5V-2A)
- Put the device on maximum performance mode:
sudo nvpmodel -m 2 && sudo jetson_clocks
- Make sure all CPU cores are online:
cat /sys/devices/system/cpu/online
To run the benchmark application for binaries/jetson with 0.2 positive rate for 100 loops:
cd ultimateALPR-SDK/binaries/jetson/aarch64
chmod +x benchmark
LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./benchmark \
--positive ../../../assets/images/lic_us_1280x720.jpg \
--negative ../../../assets/images/london_traffic.jpg \
--assets ../../../assets \
--charset latin \
--loops 100 \
--rate 0.2 \
--parallel true
| | 0.0 rate | 0.2 rate | 0.5 rate | 0.7 rate | 1.0 rate |
|---|---|---|---|---|---|
| binaries/jetson (Xavier NX, JetPack 5.1.0) | 657 millis, 152 fps | 744 millis, 134 fps | 837 millis, 119 fps | 961 millis, 104 fps | 1068 millis, 93 fps |
| binaries/linux/aarch64 (Xavier NX, JetPack 5.1.0) | 7498 millis, 13.33 fps | 8281 millis, 12.07 fps | 9421 millis, 10.61 fps | 10161 millis, 9.84 fps | 11006 millis, 9.08 fps |
| binaries/jetson (Nano B01, JetPack 4.4.1) | 2920 millis, 34.24 fps | 3102 millis, 32.23 fps | 3274 millis, 30.53 fps | 3415 millis, 29.27 fps | 3727 millis, 26.82 fps |
| binaries/linux/aarch64 (Nano B01, JetPack 4.4.1) | 4891 millis, 20.44 fps | 6950 millis, 14.38 fps | 9928 millis, 10.07 fps | 11892 millis, 8.40 fps | 14870 millis, 6.72 fps |
binaries/linux/aarch64 contains generic Linux binaries for AArch64 (a.k.a. ARM64) devices. All operations are done on the CPU. The performance boost between this CPU-only version and the Jetson-based ones may not seem impressive, but there is a good reason: binaries/linux/aarch64 uses INT8 inference while the Jetson-based versions use a mix of FP32 and FP16, which is more accurate. Providing INT8 models for Jetson devices is on our roadmap with no ETA.
On average the SDK is 3 times faster on the Jetson Nano compared to the Raspberry Pi 4. This may not seem impressive, but there is a good reason: binaries/raspbian/armv7l uses INT8 inference while the Jetson binaries (binaries/jetson) use a mix of FP32 and FP16, which is more accurate. Providing INT8 models for Jetson devices is on our roadmap with no ETA.
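To make the precision trade-off concrete, here is a hedged sketch of how the two modes are requested on a TensorRT builder config: FP16 is a simple opt-in flag, while INT8 additionally requires a calibrator to compute quantization scales, which is the extra work implied by the roadmap item. The configurePrecision helper is hypothetical and not the SDK's build code.

```cpp
// Illustration only: requesting reduced precision on a TensorRT builder config.
// FP16 is a one-line opt-in; INT8 also requires calibration data to compute scales.
#include <NvInfer.h>

void configurePrecision(nvinfer1::IBuilder& builder,
                        nvinfer1::IBuilderConfig& config,
                        nvinfer1::IInt8Calibrator* calibrator /* nullptr => no INT8 */) {
    if (builder.platformHasFastFp16()) {
        config.setFlag(nvinfer1::BuilderFlag::kFP16); // mixed FP32/FP16, no calibration needed
    }
    if (calibrator && builder.platformHasFastInt8()) {
        config.setFlag(nvinfer1::BuilderFlag::kINT8); // INT8 kernels where possible
        config.setInt8Calibrator(calibrator);         // scales computed from calibration batches
    }
}
```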
Please note that some pre-processing operations are performed on the CPU, which is why the CPU usage is at 1/5th. You don't need to worry about these operations: they are massively multithreaded and entirely written in assembler with SIMD NEON acceleration (a sketch of one such routine follows the list below). These functions are open source and can be found at:
- Normalization: compv_math_op_sub_arm64_neon.S
- Chroma Conversion (YUV -> RGB): compv_image_conv_to_rgbx_arm64_neon.S
- Type conversion (UINT8 -> FLOAT32): compv_math_cast_arm64_neon.S
- Packing/Unpacking: compv_mem_arm64_neon.S
- Scaling: compv_image_scale_bilinear_arm64_neon.S
- ...
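As an illustration of the kind of work these routines do, here is a sketch of a UINT8 -> FLOAT32 cast written with NEON intrinsics. The cast_u8_to_f32 function is hypothetical; the SDK's real implementation is the hand-written AArch64 assembler listed above.

```cpp
// Sketch (not the SDK's code): cast 8-bit pixels to 32-bit floats, 16 pixels per iteration,
// using NEON intrinsics instead of hand-written assembler.
#include <arm_neon.h>
#include <cstddef>
#include <cstdint>

void cast_u8_to_f32(const uint8_t* in, float* out, size_t count) {
    size_t i = 0;
    for (; i + 16 <= count; i += 16) {
        const uint8x16_t u8 = vld1q_u8(in + i);             // load 16 bytes
        const uint16x8_t lo16 = vmovl_u8(vget_low_u8(u8));  // widen bytes 0..7 to u16
        const uint16x8_t hi16 = vmovl_u8(vget_high_u8(u8)); // widen bytes 8..15 to u16
        vst1q_f32(out + i + 0,  vcvtq_f32_u32(vmovl_u16(vget_low_u16(lo16))));
        vst1q_f32(out + i + 4,  vcvtq_f32_u32(vmovl_u16(vget_high_u16(lo16))));
        vst1q_f32(out + i + 8,  vcvtq_f32_u32(vmovl_u16(vget_low_u16(hi16))));
        vst1q_f32(out + i + 12, vcvtq_f32_u32(vmovl_u16(vget_high_u16(hi16))));
    }
    for (; i < count; ++i) out[i] = static_cast<float>(in[i]); // scalar tail
}
```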
Version 3.1.0 is the first release to support NVIDIA Jetson and there is room for optimization. Adding support for full INT8 inference could improve the speed by up to 700%. We're also planning to move the NMS layer from the GPU to the CPU and rewrite the code in assembler with NEON SIMD.
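For context on that roadmap item, the snippet below is a generic CPU implementation of non-maximum suppression (NMS) over detection boxes; it only illustrates the algorithm and is not the SDK's layer.

```cpp
// Generic non-maximum suppression (NMS) on the CPU -- illustration of the algorithm only,
// not the SDK's GPU layer. Keeps the highest-scoring boxes and drops overlapping ones.
#include <algorithm>
#include <vector>

struct Box { float x0, y0, x1, y1, score; };

static float iou(const Box& a, const Box& b) {
    const float ix0 = std::max(a.x0, b.x0), iy0 = std::max(a.y0, b.y0);
    const float ix1 = std::min(a.x1, b.x1), iy1 = std::min(a.y1, b.y1);
    const float inter = std::max(0.f, ix1 - ix0) * std::max(0.f, iy1 - iy0);
    const float areaA = (a.x1 - a.x0) * (a.y1 - a.y0);
    const float areaB = (b.x1 - b.x0) * (b.y1 - b.y0);
    return inter / (areaA + areaB - inter);
}

std::vector<Box> nms(std::vector<Box> boxes, float iouThreshold) {
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.score > b.score; });
    std::vector<Box> kept;
    for (const Box& candidate : boxes) {
        bool suppressed = false;
        for (const Box& k : kept) {
            if (iou(candidate, k) > iouThreshold) { suppressed = true; break; }
        }
        if (!suppressed) kept.push_back(candidate);
    }
    return kept;
}
```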
All warnings from the NVIDIA logger will be logged as errors during the model optimization process. You can safely ignore the following messages:
Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
53 weights are affected by this issue: Detected subnormal FP16 values.
The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
You may receive a [UltAlprSdkTRT] Failed to open file error after running the ./prepare.sh script if we fail to write to the local disk. We recommend running the script as root (#) instead of a normal user ($).
Please check our discussion group or Twitter account.