
Commit 33d54ea

[Refactor] 2 basic demos and all related documents
1 parent 3a40bbc commit 33d54ea

32 files changed: +1229 −980 lines

CMakeLists.txt

Lines changed: 20 additions & 0 deletions
````diff
@@ -0,0 +1,20 @@
+cmake_minimum_required(VERSION 3.14)
+
+project(
+  tensorrtx
+  VERSION 0.1
+  LANGUAGES C CXX CUDA)
+
+set(TensorRT_7_8_10_TARGETS mlp lenet)
+
+set(TensorRT_8_TARGETS)
+
+set(TensorRT_10_TARGETS)
+
+set(ALL_TARGETS ${TensorRT_7_8_10_TARGETS} ${TensorRT_8_TARGETS}
+                ${TensorRT_10_TARGETS})
+
+foreach(sub_dir ${ALL_TARGETS})
+  message(STATUS "Add subdirectory: ${sub_dir}")
+  add_subdirectory(${sub_dir})
+endforeach()
````
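With the demos grouped by supported TensorRT version, excluding a demo from the build is a one-line edit to the lists above. A hypothetical sketch (the actual lists live in this top-level `CMakeLists.txt`; the values here are illustrative only):

```cmake
# Build only mlp; lenet is dropped from the 7.x/8.x/10.x group.
set(TensorRT_7_8_10_TARGETS mlp)

# Leave version-specific groups empty when your TensorRT cannot build them.
set(TensorRT_8_TARGETS)
set(TensorRT_10_TARGETS)
```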

README.md

Lines changed: 68 additions & 9 deletions
````diff
@@ -16,7 +16,7 @@ The basic workflow of TensorRTx is:
 ## News
 
 - `22 Oct 2024`. [lindsayshuo](https://github.com/lindsayshuo): YOLOv8-obb
-- `18 Oct 2024`. [zgjja](https://github.com/zgjja): Rafactor docker image.
+- `18 Oct 2024`. [zgjja](https://github.com/zgjja): Refactor docker image.
 - `11 Oct 2024`. [mpj1234](https://github.com/mpj1234): YOLO11
 - `9 Oct 2024`. [Phoenix8215](https://github.com/Phoenix8215): GhostNet V1 and V2.
 - `21 Aug 2024`. [Lemonononon](https://github.com/Lemonononon): real-esrgan-general-x4v3
@@ -38,7 +38,7 @@ The basic workflow of TensorRTx is:
 - [A guide for quickly getting started, taking lenet5 as a demo.](./tutorials/getting_started.md)
 - [The .wts file content format](./tutorials/getting_started.md#the-wts-content-format)
 - [Frequently Asked Questions (FAQ)](./tutorials/faq.md)
-- [Migrating from TensorRT 4 to 7](./tutorials/migrating_from_tensorrt_4_to_7.md)
+- [Migration Guide](./tutorials/migration_guide.md)
 - [How to implement multi-GPU processing, taking YOLOv4 as example](./tutorials/multi_GPU_processing.md)
 - [Check if Your GPU support FP16/INT8](./tutorials/check_fp16_int8_support.md)
 - [How to Compile and Run on Windows](./tutorials/run_on_windows.md)
@@ -47,21 +47,80 @@ The basic workflow of TensorRTx is:
 
 ## Test Environment
 
-1. TensorRT 7.x
-2. TensorRT 8.x(Some of the models support 8.x)
+1. (**NOT recommended**) TensorRT 7.x
+2. (**Recommended**) TensorRT 8.x
+3. (**NOT recommended**) TensorRT 10.x
+
+### Note
+
+1. For historical reasons, some models are limited to a specific TensorRT version; please check the README.md or code of the model you want to use.
+2. Currently, TensorRT 8.x has the best compatibility and supports the most features.
 
 ## How to run
 
-Each folder has a readme inside, which explains how to run the models inside.
+**Note**: this project supports building each network with the `CMakeLists.txt` in its subfolder, or building them all together with the top-level `CMakeLists.txt` of this project.
+
+* General procedures before building and running:
+
+```bash
+# 1. generate xxx.wts from https://github.com/wang-xinyu/pytorchx/tree/master/lenet
+# ...
+
+# 2. put xxx.wts on top of this folder
+# ...
+```
+
+* (*Option 1*) To build a single subproject in this project, do:
+
+```bash
+## enter the subfolder
+cd tensorrtx/xxx
+
+## configure & build
+cmake -S . -B build
+make -C build
+```
+
+* (*Option 2*) To build multiple subprojects, first **comment out**, in the top `CMakeLists.txt`, the projects you don't want to build or that are not supported by your TensorRT version, e.g., you cannot build the subprojects in `${TensorRT_8_TARGETS}` if your TensorRT is `7.x`. Then:
+
+```bash
+## enter the top of this project
+cd tensorrtx
+
+## configure & build
+# you may use "Ninja" rather than "make" to significantly boost the build speed
+cmake -G Ninja -S . -B build
+ninja -C build
+```
+
+**WARNING**: This part is still under development; most subprojects have not been adapted yet.
+
+* Run the generated executable, e.g.:
+
+```bash
+# serialize model to plan file, i.e. 'xxx.engine'
+build/xxx -s
+
+# deserialize plan file and run inference
+build/xxx -d
+
+# (Optional) check if the output is same as pytorchx/lenet
+# ...
+
+# (Optional) customize the project
+# ...
+```
+
+Each subfolder may also contain a `README.md` that explains more details.
 
 ## Models
 
 Following models are implemented.
 
-|Name | Description |
-|-|-|
-|[mlp](./mlp) | the very basic model for starters, properly documented |
-|[lenet](./lenet) | the simplest, as a "hello world" of this project |
+| Name | Description | Supported TensorRT Version |
+|---------------|---------------|---------------|
+|[mlp](./mlp) | the very basic model for starters, properly documented | 7.x/8.x/10.x |
+|[lenet](./lenet) | the simplest, as a "hello world" of this project | 7.x/8.x/10.x |
 |[alexnet](./alexnet)| easy to implement, all layers are supported in tensorrt |
 |[googlenet](./googlenet)| GoogLeNet (Inception v1) |
 |[inception](./inception)| Inception v3, v4 |
````

docker/README.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -49,11 +49,11 @@ Change the `TAG` on top of the `.dockerfile`. Note: all images are officially ow
 
 For more detail of the support matrix, please check [HERE](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html)
 
-### How to customize opencv?
+### How to customize the opencv in the image?
 
 If prebuilt package from apt cannot meet your requirements, please refer to the demo code in `.dockerfile` to build opencv from source.
 
-### How to solve image build fail issues?
+### How to solve failures when building the image?
 
 For *443 timeout* or any similar network issues, a proxy may be required. To make your host proxy visible to the docker build environment, please change the `build` node inside the docker-compose file like this:
 ```YAML
````
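For reference, a minimal sketch of such a `build` node with proxy arguments might look like the following (the proxy address and dockerfile name are assumptions; adapt them to your setup):

```yaml
services:
  tensorrt:
    build:
      context: .
      dockerfile: x86_64.dockerfile
      args:
        # hypothetical host proxy; replace with your own address/port
        http_proxy: "http://127.0.0.1:7890"
        https_proxy: "http://127.0.0.1:7890"
    image: tensortx:1.0.1
```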

docker/tensorrtx-docker-compose.yml

Lines changed: 1 addition & 1 deletion
````diff
@@ -1,6 +1,6 @@
 services:
   tensorrt:
-    image: tensortx:1.0.0
+    image: tensortx:1.0.1
     container_name: tensortx
     environment:
       - NVIDIA_VISIBLE_DEVICES=all
````

docker/x86_64.dockerfile

Lines changed: 5 additions & 2 deletions
````diff
@@ -7,13 +7,16 @@ ENV DEBIAN_FRONTEND noninteractive
 # basic tools
 RUN apt update && apt-get install -y --fix-missing --no-install-recommends \
     sudo wget curl git ca-certificates ninja-build tzdata pkg-config \
-    gdb libglib2.0-dev libmount-dev \
+    gdb libglib2.0-dev libmount-dev locales \
     && rm -rf /var/lib/apt/lists/*
 RUN pip install --no-cache-dir yapf isort cmake-format pre-commit
 
+## fix a potential pre-commit error
+RUN locale-gen "en_US.UTF-8"
+
 ## override older cmake
 RUN find /usr/local/share -type d -name "cmake-*" -exec rm -rf {} + \
-    && curl -fsSL "https://github.com/Kitware/CMake/releases/download/v3.29.0/cmake-3.29.0-linux-x86_64.sh" \
+    && curl -fsSL "https://github.com/Kitware/CMake/releases/download/v3.30.0/cmake-3.30.0-linux-x86_64.sh" \
     -o cmake.sh && bash cmake.sh --skip-license --exclude-subdir --prefix=/usr/local && rm cmake.sh
 
 RUN apt update && apt-get install -y \
````

lenet/CMakeLists.txt

Lines changed: 43 additions & 29 deletions
````diff
@@ -1,29 +1,43 @@
-cmake_minimum_required(VERSION 2.6)
-
-project(lenet)
-
-add_definitions(-std=c++11)
-
-set(TARGET_NAME "lenet")
-
-option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
-set(CMAKE_CXX_STANDARD 11)
-set(CMAKE_BUILD_TYPE Debug)
-
-include_directories(${PROJECT_SOURCE_DIR}/include)
-# include and link dirs of cuda and tensorrt, you need adapt them if yours are different
-# cuda
-include_directories(/usr/local/cuda/include)
-link_directories(/usr/local/cuda/lib64)
-# tensorrt
-include_directories(/usr/include/x86_64-linux-gnu/)
-link_directories(/usr/lib/x86_64-linux-gnu/)
-
-FILE(GLOB SRC_FILES ${PROJECT_SOURCE_DIR}/lenet.cpp ${PROJECT_SOURCE_DIR}/include/*.h)
-
-add_executable(${TARGET_NAME} ${SRC_FILES})
-target_link_libraries(${TARGET_NAME} nvinfer)
-target_link_libraries(${TARGET_NAME} cudart)
-
-add_definitions(-O2 -pthread)
-
+cmake_minimum_required(VERSION 3.17.0)
+
+project(
+  lenet
+  VERSION 0.1
+  LANGUAGES C CXX CUDA)
+
+if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
+  set(CMAKE_CUDA_ARCHITECTURES
+      60
+      70
+      72
+      75
+      80
+      86
+      89)
+endif()
+
+set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
+set(CMAKE_CUDA_STANDARD 17)
+set(CMAKE_CUDA_STANDARD_REQUIRED ON)
+set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
+set(CMAKE_INCLUDE_CURRENT_DIR TRUE)
+set(CMAKE_BUILD_TYPE
+    "Debug"
+    CACHE STRING "Build type for this project" FORCE)
+
+option(CUDA_USE_STATIC_CUDA_RUNTIME "Use static cudaruntime library" OFF)
+
+find_package(Threads REQUIRED)
+find_package(CUDAToolkit REQUIRED)
+
+if(NOT TARGET TensorRT::TensorRT)
+  include(FindTensorRT.cmake)
+else()
+  message("TensorRT has been found, skipping for ${PROJECT_NAME}")
+endif()
+
+add_executable(${PROJECT_NAME} lenet.cpp)
+
+target_link_libraries(${PROJECT_NAME} PUBLIC Threads::Threads CUDA::cudart
+                                             TensorRT::TensorRT)
````
lenet/FindTensorRT.cmake

Lines changed: 79 additions & 0 deletions
````diff
@@ -0,0 +1,79 @@
+cmake_minimum_required(VERSION 3.17.0)
+
+set(TRT_VERSION
+    $ENV{TRT_VERSION}
+    CACHE STRING
+          "TensorRT version, e.g. \"8.6.1.6\" or \"8.6.1.6+cuda12.0.1.011\"")
+
+# find TensorRT include folder
+if(NOT TensorRT_INCLUDE_DIR)
+  if(CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")
+    set(TensorRT_INCLUDE_DIR
+        "/usr/local/cuda/targets/aarch64-linux/include"
+        CACHE PATH "TensorRT_INCLUDE_DIR")
+  else()
+    set(TensorRT_INCLUDE_DIR
+        "/usr/include/x86_64-linux-gnu"
+        CACHE PATH "TensorRT_INCLUDE_DIR")
+  endif()
+  message(STATUS "TensorRT: ${TensorRT_INCLUDE_DIR}")
+endif()
+
+# find TensorRT library folder
+if(NOT TensorRT_LIBRARY_DIR)
+  if(CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")
+    set(TensorRT_LIBRARY_DIR
+        "/usr/lib/aarch64-linux-gnu/tegra"
+        CACHE PATH "TensorRT_LIBRARY_DIR")
+  else()
+    set(TensorRT_LIBRARY_DIR
+        "/usr/lib/x86_64-linux-gnu"
+        CACHE PATH "TensorRT_LIBRARY_DIR")
+  endif()
+  message(STATUS "TensorRT: ${TensorRT_LIBRARY_DIR}")
+endif()
+
+set(TensorRT_LIBRARIES)
+
+# process for different TensorRT version
+if(DEFINED TRT_VERSION AND NOT TRT_VERSION STREQUAL "")
+  string(REGEX MATCH "([0-9]+)" _match ${TRT_VERSION})
+  set(TRT_MAJOR_VERSION "${_match}")
+  set(_modules nvinfer nvinfer_plugin)
+  unset(_match)
+
+  if(TRT_MAJOR_VERSION GREATER_EQUAL 8)
+    list(APPEND _modules nvinfer_vc_plugin nvinfer_dispatch nvinfer_lean)
+  endif()
+else()
+  message(FATAL_ERROR "Please set an environment variable \"TRT_VERSION\"")
+endif()
+
+# find and add all modules of TensorRT into list
+foreach(lib IN LISTS _modules)
+  find_library(
+    TensorRT_${lib}_LIBRARY
+    NAMES ${lib}
+    HINTS ${TensorRT_LIBRARY_DIR})
+  list(APPEND TensorRT_LIBRARIES ${TensorRT_${lib}_LIBRARY})
+endforeach()
+
+message(STATUS "Found TensorRT libs: ${TensorRT_LIBRARIES}")
+
+# make the "TensorRT target"
+add_library(TensorRT IMPORTED INTERFACE)
+add_library(TensorRT::TensorRT ALIAS TensorRT)
+target_link_libraries(TensorRT INTERFACE ${TensorRT_LIBRARIES})
+
+set_target_properties(
+  TensorRT
+  PROPERTIES C_STANDARD 17
+             CXX_STANDARD 17
+             POSITION_INDEPENDENT_CODE ON
+             SKIP_BUILD_RPATH TRUE
+             BUILD_WITH_INSTALL_RPATH TRUE
+             INSTALL_RPATH "$\{ORIGIN\}"
+             INTERFACE_INCLUDE_DIRECTORIES "${TensorRT_INCLUDE_DIR}")
+
+unset(TRT_MAJOR_VERSION)
+unset(_modules)
````
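The version handling above hinges on one step: `string(REGEX MATCH "([0-9]+)" ...)` takes the leading integer of `TRT_VERSION` as the major version, and the module list grows for TensorRT 8 and later. The same logic can be sketched in Python (an illustration only, not part of the repo):

```python
import re

# Mirror FindTensorRT.cmake: the leading integer of TRT_VERSION is the
# major version; TensorRT >= 8 links extra plugin/dispatch/lean modules.
def trt_modules(trt_version: str) -> list[str]:
    match = re.match(r"(\d+)", trt_version)
    if not match:
        raise ValueError('Please set an environment variable "TRT_VERSION"')
    modules = ["nvinfer", "nvinfer_plugin"]
    if int(match.group(1)) >= 8:
        modules += ["nvinfer_vc_plugin", "nvinfer_dispatch", "nvinfer_lean"]
    return modules

print(trt_modules("8.6.1.6"))
```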

lenet/README.md

Lines changed: 16 additions & 28 deletions
````diff
@@ -1,36 +1,22 @@
 # lenet5
 
-lenet5 is the simplest net in this tensorrtx project. You can learn the basic procedures of building tensorrt app from API. Including `define network`, `build engine`, `set output`, `do inference`, `serialize model to file`, `deserialize model from file`, etc.
+lenet5 is one of the simplest nets in this repo. You can learn the basic procedures of building a CNN with the TensorRT API. This demo includes 2 major steps:
 
-## TensorRT C++ API
-
-```
-// 1. generate lenet5.wts from https://github.com/wang-xinyu/pytorchx/tree/master/lenet
-
-// 2. put lenet5.wts into tensorrtx/lenet
-
-// 3. build and run
-
-cd tensorrtx/lenet
-
-mkdir build
+1. Build engine
+   * define network
+   * set input/output
+   * serialize model to `.engine` file
+2. Do inference
+   * load and deserialize model from `.engine` file
+   * run inference
 
-cd build
-
-cmake ..
-
-make
-
-sudo ./lenet -s // serialize model to plan file i.e. 'lenet5.engine'
-
-sudo ./lenet -d // deserialize plan file and run inference
+## TensorRT C++ API
 
-// 4. see if the output is same as pytorchx/lenet
-```
+see [HERE](../README.md#how-to-run)
 
 ## TensorRT Python API
 
-```
+```bash
 # 1. generate lenet5.wts from https://github.com/wang-xinyu/pytorchx/tree/master/lenet
 
 # 2. put lenet5.wts into tensorrtx/lenet
@@ -39,9 +25,11 @@ sudo ./lenet -d // deserialize plan file and run inference
 
 cd tensorrtx/lenet
 
-python lenet.py -s # serialize model to plan file, i.e. 'lenet5.engine'
+# 4.1 serialize model to plan file, i.e. 'lenet5.engine'
+python lenet.py -s
 
-python lenet.py -d # deserialize plan file and run inference
+# 4.2 deserialize plan file and run inference
+python lenet.py -d
 
-# 4. see if the output is same as pytorchx/lenet
+# 5. (Optional) see if the output is same as pytorchx/lenet
 ```
````
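The `-s`/`-d` convention shared by the C++ and Python demos can be sketched with a small argument parser (hypothetical; the real `lenet.py` builds or loads the engine in each branch instead of returning a string):

```python
import argparse

def parse_mode(argv: list[str]) -> str:
    """Return which phase the demo should run: serialize or deserialize."""
    parser = argparse.ArgumentParser(description="lenet demo flags (sketch)")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-s", action="store_true",
                       help="serialize model to 'lenet5.engine'")
    group.add_argument("-d", action="store_true",
                       help="deserialize 'lenet5.engine' and run inference")
    args = parser.parse_args(argv)
    return "serialize" if args.s else "deserialize"

print(parse_mode(["-s"]))  # serialize
```

Making the two flags mutually exclusive and required captures the demos' behavior of doing exactly one of the two phases per invocation.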
