Welcome to our instructional guide for the inference and realtime DNN vision library for NVIDIA Jetson Nano/TX1/TX2/Xavier.
This repo uses NVIDIA TensorRT for efficiently deploying neural networks onto the embedded Jetson platform, improving performance and power efficiency using graph optimizations, kernel fusion, and FP16/INT8 precision.
Vision primitives, such as `imageNet` for image recognition, `detectNet` for object localization, and `segNet` for semantic segmentation, inherit from the shared `tensorNet` object. Examples are provided for streaming from a live camera feed and for processing images. See the API Reference section for detailed reference documentation of the C++ and Python libraries.
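As a rough illustration of the FP16/INT8 precision modes mentioned above, the sketch below rounds values to half precision and applies a simple symmetric INT8 quantization using only the Python standard library. It is not TensorRT's actual calibration procedure, which is more involved; the function names are hypothetical and chosen for this example only.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values):
    """Symmetric per-tensor quantization: returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Clamp to [-127, 127] so the codes fit in a signed 8-bit integer.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.01, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

print("FP16 value of 0.1:", to_fp16(0.1))
print("INT8 codes:", codes)
print("restored:", restored)
```

Each restored value differs from the original by at most half a quantization step, which is the trade-off that buys smaller storage and faster arithmetic.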
There are multiple tracks of the tutorial that you can choose to follow, including Hello AI World for running inference and transfer learning onboard your Jetson, or the full Two Days to a Demo tutorial for training on a PC or server with DIGITS.
It's recommended to walk through the Hello AI World module first to familiarize yourself with machine learning and inference with TensorRT, before proceeding to training in the cloud with DIGITS.
- Hello AI World
- Two Days to a Demo
- API Reference
- Code Examples
- Pre-Trained Models
- System Requirements
- Extra Resources
> Jetson Nano Developer Kit and JetPack 4.2.2 are now supported in the repo.
> See our latest technical blog including benchmarks, Jetson Nano Brings AI Computing to Everyone.
> Hello AI World now supports Python and onboard training with PyTorch!
Hello AI World can be run completely onboard your Jetson, including inferencing with TensorRT and transfer learning with PyTorch. The inference portion of Hello AI World, which includes coding your own image classification application in C++ or Python, object detection, and live camera demos, can be run on your Jetson in roughly two hours or less, while transfer learning is best left running overnight.
- Setting up Jetson with JetPack
- Building the Project from Source
- Classifying Images with ImageNet
- Locating Objects with DetectNet
- Semantic Segmentation with SegNet
- Transfer Learning with PyTorch
The full tutorial includes training in the cloud or on a PC and inference on the Jetson with TensorRT. It can take roughly two days or more, depending on system setup, downloading the datasets, and the training speed of your GPU.
- DIGITS Workflow
- DIGITS System Setup
- Setting up Jetson with JetPack
- Building the Project from Source
- Classifying Images with ImageNet
- Using the Console Program on Jetson
- Coding Your Own Image Recognition Program
- Running the Live Camera Recognition Demo
- Re-Training the Network with DIGITS
- Downloading Image Recognition Dataset
- Customizing the Object Classes
- Importing Classification Dataset into DIGITS
- Creating Image Classification Model with DIGITS
- Testing Classification Model in DIGITS
- Downloading Model Snapshot to Jetson
- Loading Custom Models on Jetson
- Locating Objects with DetectNet
- Detection Data Formatting in DIGITS
- Downloading the Detection Dataset
- Importing the Detection Dataset into DIGITS
- Creating DetectNet Model with DIGITS
- Testing DetectNet Model Inference in DIGITS
- Downloading the Detection Model to Jetson
- DetectNet Patches for TensorRT
- Detecting Objects from the Command Line
- Multi-class Object Detection Models
- Running the Live Camera Detection Demo on Jetson
- Semantic Segmentation with SegNet
Below are links to reference documentation for the C++ and Python libraries from the repo:
Task | C++ | Python |
---|---|---|
Image Recognition | `imageNet` | `imageNet` |
Object Detection | `detectNet` | `detectNet` |
Segmentation | `segNet` | `segNet` |
These libraries can be used in external projects by linking to `libjetson-inference` and `libjetson-utils`.
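As a minimal sketch of what calling the Python library can look like, the function below classifies a single image. It assumes the `jetson.inference` and `jetson.utils` bindings as they existed around this release (`imageNet`, `loadImageRGBA`, `Classify`, `GetClassDesc`); names may differ in other versions, so treat the API Reference as authoritative. The wrapper name `classify_image` is hypothetical.

```python
def classify_image(image_path, network="googlenet"):
    """Classify one image file and return (class description, confidence)."""
    # Imported inside the function so the module still loads on machines
    # where the Jetson bindings are not installed.
    import jetson.inference
    import jetson.utils

    # Load the recognition network (built-in model name, e.g. "googlenet").
    net = jetson.inference.imageNet(network)

    # Load the image from disk into shared CPU/GPU memory.
    img, width, height = jetson.utils.loadImageRGBA(image_path)

    # Run inference and look up the human-readable class description.
    class_idx, confidence = net.Classify(img, width, height)
    return net.GetClassDesc(class_idx), confidence
```

On a Jetson with the project installed, `classify_image("my_image.jpg")` would return a pair such as a class label and a confidence between 0 and 1; the file name here is only a placeholder.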
Introductory code walkthroughs of using the library are covered during these steps of the Hello AI World tutorial:
Additional C++ and Python samples for running the networks on static images and live camera streams can be found here:
Task | Images | Camera |
---|---|---|
C++ (`examples`) | | |
Image Recognition | `imagenet-console` | `imagenet-camera` |
Object Detection | `detectnet-console` | `detectnet-camera` |
Segmentation | `segnet-console` | `segnet-camera` |
Python (`python/examples`) | | |
Image Recognition | `imagenet-console.py` | `imagenet-camera.py` |
Object Detection | `detectnet-console.py` | `detectnet-camera.py` |
Segmentation | `segnet-console.py` | `segnet-camera.py` |
note: for working with numpy arrays, see `cuda-from-numpy.py` and `cuda-to-numpy.py`
These examples are compiled automatically while Building the Project from Source, and can run the pre-trained models listed below in addition to custom models provided by the user. Launch each example with `--help` for usage info.
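The sample names above follow a regular `<task>-<source>` pattern, with a `.py` suffix for the Python versions. The sketch below builds a sample name and the argument list for its `--help` invocation. The helper names are hypothetical, and the directory the samples live in depends on where you built the project, so no path prefix is assumed.

```python
TASKS = ("imagenet", "detectnet", "segnet")
SOURCES = ("console", "camera")  # static images vs. live camera stream


def example_name(task, source, python=False):
    """Build the sample program name for a task and input source."""
    if task not in TASKS:
        raise ValueError("unknown task: " + task)
    if source not in SOURCES:
        raise ValueError("unknown source: " + source)
    name = "{}-{}".format(task, source)
    return name + ".py" if python else name


def help_command(task, source, python=False):
    """Return the argument list that asks a sample for its usage info."""
    return [example_name(task, source, python), "--help"]


print(help_command("imagenet", "console"))
print(help_command("detectnet", "camera", python=True))
```

The returned list could be passed to `subprocess.run` from the directory that contains the samples.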
The project comes with a number of pre-trained models that are available through the Model Downloader tool:
Image Recognition

Network | CLI argument | NetworkType enum |
---|---|---|
AlexNet | `alexnet` | `ALEXNET` |
GoogleNet | `googlenet` | `GOOGLENET` |
GoogleNet-12 | `googlenet-12` | `GOOGLENET_12` |
ResNet-18 | `resnet-18` | `RESNET_18` |
ResNet-50 | `resnet-50` | `RESNET_50` |
ResNet-101 | `resnet-101` | `RESNET_101` |
ResNet-152 | `resnet-152` | `RESNET_152` |
VGG-16 | `vgg-16` | `VGG_16` |
VGG-19 | `vgg-19` | `VGG_19` |
Inception-v4 | `inception-v4` | `INCEPTION_V4` |
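For the image recognition models, the enum names appear to be the CLI argument in upper case with hyphens replaced by underscores. The sketch below captures that observation in a hypothetical helper; it is a convenience for reading the table, not part of the library, and the API Reference remains the authoritative list of enum values. The pattern does not hold for every detection model (for example, `multiped` maps to `PEDNET_MULTI`).

```python
def cli_to_enum_name(cli_argument):
    """Derive an enum-style name from a CLI argument.

    Example: 'resnet-18' becomes 'RESNET_18'.
    """
    return cli_argument.strip().upper().replace("-", "_")


for arg in ("googlenet-12", "resnet-50", "inception-v4"):
    print(arg, "->", cli_to_enum_name(arg))
```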
Object Detection

Network | CLI argument | NetworkType enum | Object classes |
---|---|---|---|
SSD-Mobilenet-v1 | `ssd-mobilenet-v1` | `SSD_MOBILENET_V1` | 91 (COCO classes) |
SSD-Mobilenet-v2 | `ssd-mobilenet-v2` | `SSD_MOBILENET_V2` | 91 (COCO classes) |
SSD-Inception-v2 | `ssd-inception-v2` | `SSD_INCEPTION_V2` | 91 (COCO classes) |
DetectNet-COCO-Dog | `coco-dog` | `COCO_DOG` | dogs |
DetectNet-COCO-Bottle | `coco-bottle` | `COCO_BOTTLE` | bottles |
DetectNet-COCO-Chair | `coco-chair` | `COCO_CHAIR` | chairs |
DetectNet-COCO-Airplane | `coco-airplane` | `COCO_AIRPLANE` | airplanes |
ped-100 | `pednet` | `PEDNET` | pedestrians |
multiped-500 | `multiped` | `PEDNET_MULTI` | pedestrians, luggage |
facenet-120 | `facenet` | `FACENET` | faces |
Semantic Segmentation

Dataset | Resolution | CLI Argument | Accuracy | Jetson Nano | Jetson Xavier |
---|---|---|---|---|---|
Cityscapes | 512x256 | `fcn-resnet18-cityscapes-512x256` | 83.3% | 48 FPS | 480 FPS |
Cityscapes | 1024x512 | `fcn-resnet18-cityscapes-1024x512` | 87.3% | 12 FPS | 175 FPS |
Cityscapes | 2048x1024 | `fcn-resnet18-cityscapes-2048x1024` | 89.6% | 3 FPS | 47 FPS |
DeepScene | 576x320 | `fcn-resnet18-deepscene-576x320` | 96.4% | 26 FPS | 360 FPS |
DeepScene | 864x480 | `fcn-resnet18-deepscene-864x480` | 96.9% | 14 FPS | 190 FPS |
Multi-Human | 512x320 | `fcn-resnet18-mhp-512x320` | 86.5% | 34 FPS | 370 FPS |
Multi-Human | 640x360 | `fcn-resnet18-mhp-640x360` | 87.1% | 23 FPS | 325 FPS |
Pascal VOC | 320x320 | `fcn-resnet18-voc-320x320` | 85.9% | 45 FPS | 508 FPS |
Pascal VOC | 512x320 | `fcn-resnet18-voc-512x320` | 88.5% | 34 FPS | 375 FPS |
SUN RGB-D | 512x400 | `fcn-resnet18-sun-512x400` | 64.3% | 28 FPS | 340 FPS |
SUN RGB-D | 640x512 | `fcn-resnet18-sun-640x512` | 65.1% | 17 FPS | 224 FPS |
- If the resolution is omitted from the CLI argument, the lowest resolution model is loaded
- Accuracy indicates the pixel classification accuracy across the model's validation dataset
- Performance is measured for GPU FP16 mode with JetPack 4.2.1, `nvpmodel 0` (MAX-N)
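The first note above, that omitting the resolution selects the lowest-resolution model, can be sketched as a small lookup. This is only an illustration of the rule, not the project's own loading code: the helper name is hypothetical, "lowest resolution" is assumed to mean the smallest pixel count, and only a subset of the models from the table is listed.

```python
import re

# A subset of the segmentation model names from the table above.
SEGMENTATION_MODELS = [
    "fcn-resnet18-cityscapes-512x256",
    "fcn-resnet18-cityscapes-1024x512",
    "fcn-resnet18-cityscapes-2048x1024",
    "fcn-resnet18-voc-320x320",
    "fcn-resnet18-voc-512x320",
]

# Matches a trailing "-<width>x<height>" suffix.
_RESOLUTION = re.compile(r"-(\d+)x(\d+)$")


def _pixel_count(name):
    width, height = map(int, _RESOLUTION.search(name).groups())
    return width * height


def resolve_model(cli_argument, models=SEGMENTATION_MODELS):
    """Return the full model name, defaulting to the lowest resolution."""
    if _RESOLUTION.search(cli_argument):
        if cli_argument not in models:
            raise ValueError("unknown model: " + cli_argument)
        return cli_argument
    # No resolution given: collect every variant of this dataset.
    candidates = [m for m in models if _RESOLUTION.sub("", m) == cli_argument]
    if not candidates:
        raise ValueError("unknown model: " + cli_argument)
    return min(candidates, key=_pixel_count)


print(resolve_model("fcn-resnet18-cityscapes"))
print(resolve_model("fcn-resnet18-voc-512x320"))
```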
Legacy Segmentation Models
Network | CLI Argument | NetworkType enum | Classes |
---|---|---|---|
Cityscapes (2048x2048) | `fcn-alexnet-cityscapes-hd` | `FCN_ALEXNET_CITYSCAPES_HD` | 21 |
Cityscapes (1024x1024) | `fcn-alexnet-cityscapes-sd` | `FCN_ALEXNET_CITYSCAPES_SD` | 21 |
Pascal VOC (500x356) | `fcn-alexnet-pascal-voc` | `FCN_ALEXNET_PASCAL_VOC` | 21 |
Synthia (CVPR16) | `fcn-alexnet-synthia-cvpr` | `FCN_ALEXNET_SYNTHIA_CVPR` | 14 |
Synthia (Summer-HD) | `fcn-alexnet-synthia-summer-hd` | `FCN_ALEXNET_SYNTHIA_SUMMER_HD` | 14 |
Synthia (Summer-SD) | `fcn-alexnet-synthia-summer-sd` | `FCN_ALEXNET_SYNTHIA_SUMMER_SD` | 14 |
Aerial-FPV (1280x720) | `fcn-alexnet-aerial-fpv-720p` | `FCN_ALEXNET_AERIAL_FPV_720p` | 2 |
- Training GPU: Maxwell, Pascal, Volta, or Turing-based GPU (ideally with at least 6GB video memory)
  - optionally, AWS P2/P3 instance or Microsoft Azure N-series
  - Ubuntu 16.04/18.04 x86_64
- Deployment:
  - Jetson Nano Developer Kit with JetPack 4.2 or newer (Ubuntu 18.04 aarch64)
  - Jetson Xavier Developer Kit with JetPack 4.0 or newer (Ubuntu 18.04 aarch64)
  - Jetson TX2 Developer Kit with JetPack 3.0 or newer (Ubuntu 16.04 aarch64)
  - Jetson TX1 Developer Kit with JetPack 2.3 or newer (Ubuntu 16.04 aarch64)
Note that the TensorRT samples from the repo are intended for deployment onboard Jetson; however, when cuDNN and TensorRT have been installed on the host side, the samples can also be compiled for PC.
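The minimum JetPack versions listed for deployment can be checked with a short comparison. The sketch below encodes those minimums in a dictionary and compares version strings numerically; the helper names and the board keys (`nano`, `xavier`, `tx2`, `tx1`) are hypothetical and chosen for this example only.

```python
# Minimum JetPack version per board, taken from the deployment list above.
MIN_JETPACK = {
    "nano": "4.2",
    "xavier": "4.0",
    "tx2": "3.0",
    "tx1": "2.3",
}


def parse_version(text):
    """Turn '4.2.2' into (4, 2, 2), padding short versions with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_supported(board, jetpack_version):
    """Return True if the JetPack version meets the board's minimum."""
    minimum = MIN_JETPACK[board.lower()]
    return parse_version(jetpack_version) >= parse_version(minimum)


print(is_supported("nano", "4.2.2"))
print(is_supported("tx2", "2.3"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where `"4.10"` would sort before `"4.2"`.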
Links and resources for deep learning are listed below:
- ros_deep_learning - TensorRT inference ROS nodes
- NVIDIA AI IoT - NVIDIA Jetson GitHub repositories
- Jetson eLinux Wiki - Jetson eLinux Wiki
Since the documentation has been re-organized, below are links mapping the previous content to the new locations.
See DIGITS Workflow
See DIGITS Setup
See JetPack Setup
See Building the Repo from Source
See Classifying Images with ImageNet
See Running the Live Camera Recognition Demo
See Re-Training the Recognition Network
See Downloading Model Snapshots to Jetson
See Loading Custom Models on Jetson
See Locating Object Coordinates using DetectNet
See Downloading the Detection Model to Jetson
See Detecting Objects from the Command Line
See Running the Live Camera Detection Demo
See Semantic Segmentation with SegNet
See Generating Pretrained FCN-Alexnet
See Training FCN-Alexnet with DIGITS
© 2016-2019 NVIDIA | Table of Contents