Deploying Deep Learning

Welcome to our instructional guide to inference and the realtime DNN vision library for NVIDIA Jetson Nano/TX1/TX2/Xavier.

This repo uses NVIDIA TensorRT for efficiently deploying neural networks onto the embedded Jetson platform, improving performance and power efficiency using graph optimizations, kernel fusion, and FP16/INT8 precision.

Vision primitives, such as imageNet for image recognition, detectNet for object localization, and segNet for semantic segmentation, inherit from the shared tensorNet object. Examples are provided for streaming from a live camera feed and for processing images. See the API Reference section for detailed reference documentation of the C++ and Python libraries.
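As a rough illustration of how a vision primitive is used from Python, the sketch below classifies a single image with imageNet. It follows the calls used by the Python samples of this era (`jetson.utils.loadImageRGBA`, `jetson.inference.imageNet`, `Classify`, `GetClassDesc`); the helper names `classify_image` and `format_result` are made up for this example, and the bindings are imported lazily because they are only available on a Jetson where the project has been built and installed.

```python
def format_result(class_desc, class_idx, confidence):
    """Render one classification result as a human-readable line."""
    return "image is recognized as '{}' (class #{}) with {:.2f}% confidence".format(
        class_desc, class_idx, confidence * 100.0)


def classify_image(filename, network="googlenet"):
    """Classify one image file (hypothetical helper, needs the Jetson bindings)."""
    # Imported here so this module still loads on machines without the bindings.
    import jetson.inference
    import jetson.utils

    # Load the image from disk into memory shared between CPU and GPU.
    img, width, height = jetson.utils.loadImageRGBA(filename)

    # Load the recognition network by its CLI name (see Pre-Trained Models).
    net = jetson.inference.imageNet(network)

    # Classify returns the index of the top class and its confidence (0..1).
    class_idx, confidence = net.Classify(img, width, height)
    return format_result(net.GetClassDesc(class_idx), class_idx, confidence)
```

On a Jetson, `print(classify_image("my_image.jpg"))` would print one line describing the top class, where `my_image.jpg` is a placeholder for any image on disk.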

There are multiple tracks of the tutorial that you can choose to follow, including Hello AI World for running inference and transfer learning onboard your Jetson, or the full Two Days to a Demo tutorial for training on a PC or server with DIGITS.

It's recommended to walk through the Hello AI World module first to familiarize yourself with machine learning and inference with TensorRT, before proceeding to training in the cloud with DIGITS.

Hello AI World

Hello AI World can be run completely onboard your Jetson, including inferencing with TensorRT and transfer learning with PyTorch. The inference portion of Hello AI World, which includes coding your own image classification application in C++ or Python, object detection, and live camera demos, can be run on your Jetson in roughly two hours or less, while transfer learning is best left running overnight.

Two Days to a Demo (DIGITS)

The full tutorial includes training in the cloud or on a PC and inference on the Jetson with TensorRT. It can take roughly two days or more, depending on system setup, downloading the datasets, and the training speed of your GPU.

API Reference

Below are links to reference documentation for the C++ and Python libraries from the repo:

jetson-inference

	C++	Python
Image Recognition	`imageNet`	`imageNet`
Object Detection	`detectNet`	`detectNet`
Segmentation	`segNet`	`segNet`

jetson-utils

C++
Python

These libraries can be used in external projects by linking to libjetson-inference and libjetson-utils.

Code Examples

Introductory code walkthroughs of using the library are covered during these steps of the Hello AI World tutorial:

Additional C++ and Python samples for running the networks on static images and live camera streams can be found here:

	Images	Camera
C++ (`examples`)
Image Recognition	`imagenet-console`	`imagenet-camera`
Object Detection	`detectnet-console`	`detectnet-camera`
Segmentation	`segnet-console`	`segnet-camera`
Python (`python/examples`)
Image Recognition	`imagenet-console.py`	`imagenet-camera.py`
Object Detection	`detectnet-console.py`	`detectnet-camera.py`
Segmentation	`segnet-console.py`	`segnet-camera.py`

Note: for working with numpy arrays, see cuda-from-numpy.py and cuda-to-numpy.py.

These examples are compiled automatically while Building the Project from Source, and can run the pre-trained models listed below as well as custom models provided by the user. Launch each example with --help for usage info.
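The sample names in the table above follow a regular pattern: a network prefix, the input source, and a `.py` suffix for the Python versions. The sketch below builds the command line for one sample from that pattern. The function `sample_command` and its task keys are hypothetical names for this illustration, and the `./` prefix assumes the sample sits in the current working directory and is executable.

```python
# Network prefixes and input sources as they appear in the samples table.
TASKS = {"recognition": "imagenet", "detection": "detectnet", "segmentation": "segnet"}
SOURCES = ("console", "camera")


def sample_command(task, source, python=False, args=("--help",)):
    """Return the argv list for one sample (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError("unknown task: {}".format(task))
    if source not in SOURCES:
        raise ValueError("unknown source: {}".format(source))
    name = "{}-{}".format(TASKS[task], source)
    if python:
        name += ".py"
    # Assumption: the sample is located in the current working directory.
    return ["./" + name] + list(args)
```

The returned list can be passed to `subprocess.run` on a system where the project has been built; with the default arguments it asks the sample for its usage info.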

Pre-Trained Models

The project comes with a number of pre-trained models that are available through the Model Downloader tool:

Image Recognition

Network	CLI argument	NetworkType enum
AlexNet	`alexnet`	`ALEXNET`
GoogleNet	`googlenet`	`GOOGLENET`
GoogleNet-12	`googlenet-12`	`GOOGLENET_12`
ResNet-18	`resnet-18`	`RESNET_18`
ResNet-50	`resnet-50`	`RESNET_50`
ResNet-101	`resnet-101`	`RESNET_101`
ResNet-152	`resnet-152`	`RESNET_152`
VGG-16	`vgg-16`	`VGG_16`
VGG-19	`vgg-19`	`VGG_19`
Inception-v4	`inception-v4`	`INCEPTION_V4`
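For the image recognition networks, the enum-style name can be derived from the CLI argument by upper-casing it and replacing hyphens with underscores. The sketch below shows that relationship; `cli_to_enum` is a hypothetical helper, not part of the library. The rule is only meant for this table: other tables contain exceptions, such as `multiped` mapping to `PEDNET_MULTI`.

```python
def cli_to_enum(cli_name):
    """Derive the enum-style name from an image recognition CLI argument."""
    # Upper-case the name and swap hyphens for underscores, since an
    # identifier cannot contain a hyphen.
    return cli_name.upper().replace("-", "_")


# A few CLI arguments taken from the image recognition table.
for name in ("alexnet", "googlenet-12", "resnet-18", "inception-v4"):
    print("{:<14} -> {}".format(name, cli_to_enum(name)))
```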

Object Detection

Network	CLI argument	NetworkType enum	Object classes
SSD-Mobilenet-v1	`ssd-mobilenet-v1`	`SSD_MOBILENET_V1`	91 (COCO classes)
SSD-Mobilenet-v2	`ssd-mobilenet-v2`	`SSD_MOBILENET_V2`	91 (COCO classes)
SSD-Inception-v2	`ssd-inception-v2`	`SSD_INCEPTION_V2`	91 (COCO classes)
DetectNet-COCO-Dog	`coco-dog`	`COCO_DOG`	dogs
DetectNet-COCO-Bottle	`coco-bottle`	`COCO_BOTTLE`	bottles
DetectNet-COCO-Chair	`coco-chair`	`COCO_CHAIR`	chairs
DetectNet-COCO-Airplane	`coco-airplane`	`COCO_AIRPLANE`	airplanes
ped-100	`pednet`	`PEDNET`	pedestrians
multiped-500	`multiped`	`PEDNET_MULTI`	pedestrians, luggage
facenet-120	`facenet`	`FACENET`	faces

Semantic Segmentation

Dataset	Resolution	CLI Argument	Accuracy	Jetson Nano	Jetson Xavier
Cityscapes	512x256	`fcn-resnet18-cityscapes-512x256`	83.3%	48 FPS	480 FPS
Cityscapes	1024x512	`fcn-resnet18-cityscapes-1024x512`	87.3%	12 FPS	175 FPS
Cityscapes	2048x1024	`fcn-resnet18-cityscapes-2048x1024`	89.6%	3 FPS	47 FPS
DeepScene	576x320	`fcn-resnet18-deepscene-576x320`	96.4%	26 FPS	360 FPS
DeepScene	864x480	`fcn-resnet18-deepscene-864x480`	96.9%	14 FPS	190 FPS
Multi-Human	512x320	`fcn-resnet18-mhp-512x320`	86.5%	34 FPS	370 FPS
Multi-Human	640x360	`fcn-resnet18-mhp-640x360`	87.1%	23 FPS	325 FPS
Pascal VOC	320x320	`fcn-resnet18-voc-320x320`	85.9%	45 FPS	508 FPS
Pascal VOC	512x320	`fcn-resnet18-voc-512x320`	88.5%	34 FPS	375 FPS
SUN RGB-D	512x400	`fcn-resnet18-sun-512x400`	64.3%	28 FPS	340 FPS
SUN RGB-D	640x512	`fcn-resnet18-sun-640x512`	65.1%	17 FPS	224 FPS

If the resolution is omitted from the CLI argument, the lowest resolution model is loaded
Accuracy indicates the pixel classification accuracy across the model's validation dataset
Performance is measured for GPU FP16 mode with JetPack 4.2.1, nvpmodel 0 (MAX-N)
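The default-resolution rule in the notes above can be illustrated with a small lookup. This is only a sketch of the described behavior, not the library's actual implementation: `resolve_model` is a hypothetical helper, `MODELS` holds a subset of the names from the table, and "lowest resolution" is assumed to mean the smallest pixel count.

```python
import re

# A few CLI names copied from the table above (not the full list).
MODELS = [
    "fcn-resnet18-cityscapes-512x256",
    "fcn-resnet18-cityscapes-1024x512",
    "fcn-resnet18-cityscapes-2048x1024",
    "fcn-resnet18-deepscene-576x320",
    "fcn-resnet18-deepscene-864x480",
    "fcn-resnet18-voc-320x320",
    "fcn-resnet18-voc-512x320",
]

# Matches a trailing "-<width>x<height>" suffix.
_RESOLUTION = re.compile(r"-(\d+)x(\d+)$")


def _pixels(name):
    """Pixel count encoded in a model name's resolution suffix."""
    width, height = map(int, _RESOLUTION.search(name).groups())
    return width * height


def resolve_model(cli_name, models=MODELS):
    """Return the full model name, defaulting to the lowest resolution."""
    if _RESOLUTION.search(cli_name):
        # A resolution was given explicitly, so the name must match exactly.
        if cli_name not in models:
            raise ValueError("unknown model: {}".format(cli_name))
        return cli_name
    # No resolution given: collect every variant of this base name.
    candidates = [m for m in models if _RESOLUTION.sub("", m) == cli_name]
    if not candidates:
        raise ValueError("unknown model: {}".format(cli_name))
    return min(candidates, key=_pixels)
```

For instance, `resolve_model("fcn-resnet18-cityscapes")` selects the 512x256 variant, while a name that already carries a resolution is returned unchanged.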

Legacy Segmentation Models

Network	CLI Argument	NetworkType enum	Classes
Cityscapes (2048x2048)	`fcn-alexnet-cityscapes-hd`	`FCN_ALEXNET_CITYSCAPES_HD`	21
Cityscapes (1024x1024)	`fcn-alexnet-cityscapes-sd`	`FCN_ALEXNET_CITYSCAPES_SD`	21
Pascal VOC (500x356)	`fcn-alexnet-pascal-voc`	`FCN_ALEXNET_PASCAL_VOC`	21
Synthia (CVPR16)	`fcn-alexnet-synthia-cvpr`	`FCN_ALEXNET_SYNTHIA_CVPR`	14
Synthia (Summer-HD)	`fcn-alexnet-synthia-summer-hd`	`FCN_ALEXNET_SYNTHIA_SUMMER_HD`	14
Synthia (Summer-SD)	`fcn-alexnet-synthia-summer-sd`	`FCN_ALEXNET_SYNTHIA_SUMMER_SD`	14
Aerial-FPV (1280x720)	`fcn-alexnet-aerial-fpv-720p`	`FCN_ALEXNET_AERIAL_FPV_720p`	2

Recommended System Requirements

Training GPU: Maxwell, Pascal, Volta, or Turing-based GPU (ideally with at least 6GB video memory)
              optionally, AWS P2/P3 instance or Microsoft Azure N-series
              Ubuntu 16.04/18.04 x86_64

Deployment:   Jetson Nano Developer Kit with JetPack 4.2 or newer (Ubuntu 18.04 aarch64)
              Jetson Xavier Developer Kit with JetPack 4.0 or newer (Ubuntu 18.04 aarch64)
              Jetson TX2 Developer Kit with JetPack 3.0 or newer (Ubuntu 16.04 aarch64)
              Jetson TX1 Developer Kit with JetPack 2.3 or newer (Ubuntu 16.04 aarch64)

Note that the TensorRT samples from the repo are intended for deployment onboard Jetson; however, when cuDNN and TensorRT have been installed on the host side, the samples can also be compiled for PC.
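The minimum JetPack versions listed for deployment can be captured in a small lookup, as in the sketch below. The names `MIN_JETPACK`, `parse_version`, and `meets_minimum` are hypothetical, the board keys are shorthand chosen for this example, and the version string is assumed to be supplied by the caller rather than detected from the device.

```python
# Minimum JetPack version per board, taken from the deployment list above.
MIN_JETPACK = {"Nano": (4, 2), "Xavier": (4, 0), "TX2": (3, 0), "TX1": (2, 3)}


def parse_version(text):
    """Turn a version string such as '4.2.1' into a tuple of integers."""
    parts = [int(part) for part in text.split(".")]
    # Pad to at least major.minor so that '4' compares equal to '4.0'.
    parts += [0] * (2 - len(parts))
    return tuple(parts)


def meets_minimum(board, jetpack_version):
    """Check a JetPack version string against the listed minimum for a board."""
    if board not in MIN_JETPACK:
        raise ValueError("unknown board: {}".format(board))
    return parse_version(jetpack_version) >= MIN_JETPACK[board]
```

For example, `meets_minimum("Nano", "4.2.1")` is true, while `meets_minimum("Nano", "4.1")` is false.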

Extra Resources

Links and resources for deep learning are listed below:

ros_deep_learning - TensorRT inference ROS nodes
NVIDIA AI IoT - NVIDIA Jetson GitHub repositories
Jetson eLinux Wiki - Jetson eLinux Wiki

Legacy Links

Since the documentation has been re-organized, below are links mapping the previous content to the new locations.

DIGITS Workflow

System Setup

Running JetPack on the Host

Installing Ubuntu on the Host

Setting up host training PC with NGC container

Installing the NVIDIA driver

Installing Docker

NGC Sign-up

Setting up data and job directories

Starting DIGITS container

Natively setting up DIGITS on the Host

Installing NVIDIA Driver on the Host

Installing cuDNN on the Host

Installing NVcaffe on the Host

Installing DIGITS on the Host

Starting the DIGITS Server

Building from Source on Jetson

Cloning the Repo

Configuring with CMake

Compiling the Project

Digging Into the Code

Classifying Images with ImageNet

Using the Console Program on Jetson

Running the Live Camera Recognition Demo

Re-training the Network with DIGITS

Downloading Image Recognition Dataset

Customizing the Object Classes

Importing Classification Dataset into DIGITS

Creating Image Classification Model with DIGITS

Testing Classification Model in DIGITS

Downloading Model Snapshot to Jetson

Loading Custom Models on Jetson

Locating Object Coordinates using DetectNet

Detection Data Formatting in DIGITS

Downloading the Detection Dataset

Importing the Detection Dataset into DIGITS

Creating DetectNet Model with DIGITS

Selecting DetectNet Batch Size

Specifying the DetectNet Prototxt

Training the Model with Pretrained Googlenet

Testing DetectNet Model Inference in DIGITS

Downloading the Model Snapshot to Jetson

DetectNet Patches for TensorRT

Processing Images from the Command Line on Jetson

Launching With a Pretrained Model

Pretrained DetectNet Models Available

Running Other MS-COCO Models on Jetson

Running Pedestrian Models on Jetson

Multi-class Object Detection Models

Running the Live Camera Detection Demo on Jetson

Image Segmentation with SegNet

Downloading Aerial Drone Dataset

Importing the Aerial Dataset into DIGITS

Generating Pretrained FCN-Alexnet

Training FCN-Alexnet with DIGITS

Testing Inference Model in DIGITS

FCN-Alexnet Patches for TensorRT

Running Segmentation Models on Jetson
