Examples of TensorRT models using ONNX

Useful sample code for TensorRT models using ONNX

0. Development Environment

RTX 3060 (laptop)
WSL
Ubuntu 22.04.5 LTS
CUDA 12.8

conda deactivate
conda env remove -n trte -y

conda create -n trte python=3.11 --yes 
conda activate trte

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu129
pip install cuda-python==12.9.2
pip install tensorrt-cu12
pip install onnx
pip install opencv-python
pip install timm
pip install matplotlib

pip install -U "nvidia-modelopt[all]"

# Check installation 
python -c "import modelopt; print(modelopt.__version__)"
python -c "import modelopt.torch.quantization.extensions as ext; ext.precompile()"
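
The one-line checks above cover only `modelopt`. As a rough sketch, the same idea can be extended to the other packages installed in this section. The helper name `report_versions` is hypothetical and not part of this repository; the module names are the usual import names for the packages listed above (for example, `opencv-python` imports as `cv2`).

```python
import importlib


def report_versions(module_names):
    """Return {module name: version string, or None if the import fails}."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except (ImportError, OSError):
            # Package missing, or one of its shared libraries could not be loaded.
            versions[name] = None
            continue
        # Not every module exposes __version__, so fall back to a placeholder.
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions


if __name__ == "__main__":
    # Import names for the packages installed above.
    packages = ["torch", "torchvision", "tensorrt", "onnx", "cv2", "timm", "modelopt"]
    for name, version in report_versions(packages).items():
        print(f"{name:12s} {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` points at a `pip install` step that did not complete in the active conda environment.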

1. Basic step

1. Generating a TensorRT model from ONNX
1.1 TensorRT C++ API
1.2 TensorRT Python API
1.3 Polygraphy
2. Dynamic shapes for TensorRT
2.1 Dynamic batch
2.2 Dynamic input size
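
The Python API and dynamic batch items above can be sketched together as follows. This is a minimal illustration, not the code in this repository: it assumes TensorRT 10 (as installed by `tensorrt-cu12`), an ONNX file exported with a dynamic batch axis, and hypothetical names (`batch_profile`, `build_engine`, `model.onnx`, and the input tensor name `input`).

```python
def batch_profile(input_shape, min_batch, opt_batch, max_batch):
    """Return (min, opt, max) shapes that differ only in the leading batch dim."""
    if not (1 <= min_batch <= opt_batch <= max_batch):
        raise ValueError("expected 1 <= min_batch <= opt_batch <= max_batch")
    tail = tuple(input_shape[1:])
    return tuple((b,) + tail for b in (min_batch, opt_batch, max_batch))


def build_engine(onnx_path, engine_path, input_name, shapes, fp16=True):
    """Parse an ONNX file and serialize a TensorRT engine with one profile."""
    import tensorrt as trt  # imported lazily so batch_profile works without it

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # assumption: TensorRT 10, explicit batch
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB
    if fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    # One optimization profile tells the builder which input shapes to support.
    profile = builder.create_optimization_profile()
    profile.set_shape(input_name, *shapes)  # order: min, opt, max
    config.add_optimization_profile(profile)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed")
    with open(engine_path, "wb") as f:
        f.write(serialized)


if __name__ == "__main__":
    shapes = batch_profile((1, 3, 224, 224), min_batch=1, opt_batch=4, max_batch=8)
    print(shapes)
    # Hypothetical file and tensor names; uncomment once they match your model.
    # build_engine("model.onnx", "model.engine", "input", shapes)
```

For a dynamic input size instead of a dynamic batch, the same profile mechanism applies, with the spatial dimensions varying between the min, opt, and max shapes.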

2. Intermediate step

3. Custom Plugin
3.1 Adding a pre-processing layer with CUDA
4. Modifying an ONNX graph with ONNX GraphSurgeon
4.1 Extracting a feature map of the last Conv for Grad-CAM
4.2 Generating a TensorRT model with a custom plugin and ONNX
5. TensorRT Model Optimizer
5.0 Train Base Model (resnet18)
5.1 Base TensorRT (fp16)
5.2 Explicit Quantization (PTQ)
5.3 Explicit Quantization (QAT)
5.4 Explicit Quantization (ONNX PTQ)
5.5 Implicit Quantization (TensorRT PTQ)
5.6 Sparsity (2:4 sparsity)
5.7 Pruning
5.8 NAS (Neural Architecture Search)
5.9 Multiple Optimization Techniques
5.9.1 (Pruning + Sparsity)
5.9.2 (Pruning + Sparsity + Quantization (QAT))
5.9.3 (NAS + Sparsity)
5.9.4 (NAS + Sparsity + Quantization (QAT))
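
To make the 2:4 sparsity item (5.6) concrete, here is a small pure-Python sketch of the pattern itself: in every group of four consecutive weights, at most two stay non-zero. Choosing the survivors by magnitude is an assumption made for illustration; this is not the repository's code or the TensorRT Model Optimizer API, and `prune_2_of_4` is a hypothetical name.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each consecutive group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes; on ties the earlier index wins
        # because sorted() is stable.
        keep = sorted(range(4), key=lambda i: -abs(group[i]))[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


if __name__ == "__main__":
    row = [0.1, -0.9, 0.5, 0.05, 1.0, 1.0, 1.0, 1.0]
    print(prune_2_of_4(row))
    # → [0.0, -0.9, 0.5, 0.0, 1.0, 1.0, 0.0, 0.0]
```

Half of the weights in each group become zero, which is the structure that sparse tensor cores on supporting GPUs can exploit. A real workflow would usually fine-tune the model afterwards to recover accuracy.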

| Framework | PyTorch | TensorRT | TensorRT | TensorRT | TensorRT | TensorRT | TensorRT |
|---|---|---|---|---|---|---|---|
| Optimization technique | - | - | ONNX PTQ | TMO PTQ | TMO QAT | TMO sparsity | TMO pruning (FLOPs 80%) |
| Precision | fp16 | fp16 | int8 | int8 | int8 | fp16 | fp16 |
| Top-1 Acc [%] | 84.58 | 84.54 | 84.5 | 84.2 | 84.42 | 83.28 | 82.76 |
| Top-5 Acc [%] | 97.2 | 97.2 | 97 | 97.06 | 97.1 | 96.72 | 96.42 |
| FPS [frames/sec] | 406.27 | 1463.45 | 1897.46 | 1542.34 | 1572.81 | 1483.85 | 1573.2 |
| Avg Latency [ms] | 2.46 | 0.68 | 0.53 | 0.65 | 0.64 | 0.67 | 0.64 |
| GPU Mem [MB] | 286 | 138 | 124 | 124 | 138 | 138 | 130 |

TMO: TensorRT Model Optimizer.
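
The FPS row is easier to compare as a speedup over the PyTorch baseline. The sketch below only does that arithmetic on the numbers from the table above; the dictionary labels and the function name `speedups` are made up for this example.

```python
# FPS values copied from the table above; the labels are illustrative.
FPS = {
    "PyTorch fp16": 406.27,
    "TensorRT fp16": 1463.45,
    "TensorRT int8 (ONNX PTQ)": 1897.46,
    "TensorRT int8 (TMO PTQ)": 1542.34,
    "TensorRT int8 (TMO QAT)": 1572.81,
    "TensorRT fp16 (TMO sparsity)": 1483.85,
    "TensorRT fp16 (TMO pruning)": 1573.2,
}


def speedups(fps_by_config, baseline):
    """Return each configuration's FPS divided by the baseline's FPS."""
    base = fps_by_config[baseline]
    return {name: value / base for name, value in fps_by_config.items()}


if __name__ == "__main__":
    for name, ratio in speedups(FPS, "PyTorch fp16").items():
        print(f"{name:30s} {ratio:.2f}x")
```

With these numbers, plain TensorRT fp16 comes out at roughly 3.6x the PyTorch throughput and ONNX PTQ int8 at roughly 4.7x.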

3. Advanced step

6. Super Resolution
6.1 Real-ESRGAN
7. Object Detection
7.1 yolo11
8. Instance Segmentation
9. Semantic Segmentation
10. Depth Estimation
10.1 Depth Pro

yester31/TensorRT_Examples

Folders and files

Latest commit

History

Repository files navigation

Examples of TensorRT models using ONNX

0. Development Environment

1. Basic step

2. Intermediate step

3. Advanced step

4. reference

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages