RangeNet-TensorRT

🎉 This project has been a pleasure, allowing me to repay technical debt, learn how to locate bugs during model deployment, gain experience with GitHub Actions, and explore CUDA programming. I greatly appreciate the valuable feedback from others that has contributed to improving the project. I hope that this project will be of use to you.

English | 简体中文

1. Purpose

Newer dependencies and APIs. We deploy the RangeNet repository in an environment with TensorRT 8+ and Ubuntu 20.04+, remove the Boost dependency, manage TensorRT objects and GPU memory with smart pointers, and provide a ROS demo.
Faster performance. We resolve the reduced segmentation accuracy seen with FP16 (issue#9), gaining a significant speedup without sacrificing accuracy. Data is preprocessed with CUDA, and KNN post-processing is performed with libtorch (refer to here).

2. Installation

2.1 Docker installation

A Docker-based installation is provided; see docker/README.md for details.

2.2 Source installation

Step 1: Download and Extract libtorch

Note

Using the Torch library from Conda was observed to slow down the post-processing stage from 6 ms to 30 ms.

$ wget -c https://download.pytorch.org/libtorch/cu113/libtorch-cxx11-abi-shared-with-deps-1.10.2%2Bcu113.zip -O libtorch.zip
$ unzip libtorch.zip

Step 2: Set up the deep learning environment (install NVIDIA driver, CUDA, TensorRT, cuDNN). The tested configurations are listed below. At least 3000 MB of GPU memory is required.

Ubuntu	GPU	TensorRT	CUDA	cuDNN	Tested
20.04	TITAN RTX	8.2.3	CUDA 11.4.r11.4	cuDNN 8.2.4	✔️
20.04	NVIDIA GeForce RTX 3060	8.4.1.5	CUDA 11.3.r11.3	cuDNN 8.0.5	✔️
20.04	NVIDIA GeForce RTX 3060 / NVIDIA GeForce RTX 4070	10.6.0.26	CUDA 11.1.105	cuDNN 8.0.5.39	✔️
20.04	NVIDIA GeForce RTX 3060 / NVIDIA GeForce RTX 4070	10.6.0.26	CUDA 12.4.r12.4	cuDNN 9.1.0.70-1	✔️
22.04	NVIDIA GeForce RTX 3060	8.2.5.1	CUDA 11.3.r11.3	cuDNN 8.8.0	✔️
22.04	NVIDIA GeForce RTX 3060	8.4.1.5	CUDA 11.3.r11.3	cuDNN 8.8.0	✔️
22.04	NVIDIA GeForce RTX 3060	8.4.3.1	CUDA 11.3.r11.3	cuDNN 8.8.0	✔️
22.04	NVIDIA GeForce RTX 3060	8.6.1.6	CUDA 11.3.r11.3	cuDNN 8.8.0	✔️
22.04	NVIDIA GeForce RTX 3060	10.6.0.26	CUDA 11.3.r11.3	cuDNN 8.8.0	✔️
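
The 3000 MB requirement from Step 2 can be checked before building. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and that only the first GPU matters; the function name `has_enough_gpu_memory` is hypothetical and not part of this repository.

```bash
#!/usr/bin/env bash
# Minimum GPU memory stated in Step 2. nvidia-smi reports MiB, which is
# treated here as equivalent to the README's "MB" (an approximation).
REQUIRED_MB=3000

# has_enough_gpu_memory TOTAL_MB: succeeds if TOTAL_MB >= REQUIRED_MB.
has_enough_gpu_memory() {
  local total_mb="$1"
  [ "${total_mb}" -ge "${REQUIRED_MB}" ]
}

# Query the first GPU only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  total_mb="$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)"
  if has_enough_gpu_memory "${total_mb}"; then
    echo "GPU memory OK: ${total_mb} MiB"
  else
    echo "GPU memory below ${REQUIRED_MB}: ${total_mb} MiB"
  fi
fi
```

This checks total memory, not free memory, so other processes using the GPU may still cause an out-of-memory failure.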

Note

You must choose a CUDA version that supports your GPU's Compute Capability. For example, if you want to target Compute Capability 89, you must use CUDA 11.8+.

You can look up the Compute Capability of your GPU at https://developer.nvidia.com/cuda-gpus#compute.

GPU Hardware Architecture	Compute Capability	Relevant GPUs	Minimum CUDA Version
Ampere Architecture	86	RTX 3060, RTX 3070, RTX 3080, RTX 3090	CUDA 11.1
Ada Lovelace Architecture	89	RTX 4090, RTX 4080	CUDA 11.8
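
The table above can be turned into a small lookup. This is a sketch only: it covers just the two rows listed, and the helper names `min_cuda_for_cc` and `nvcc_release` as well as the `TARGET_CC` variable are hypothetical. It assumes `nvcc --version` prints a line of the form `release X.Y,`.

```bash
#!/usr/bin/env bash
# min_cuda_for_cc CC: minimum CUDA version for a Compute Capability,
# covering only the two rows of the table above.
min_cuda_for_cc() {
  case "$1" in
    86) echo "11.1" ;;
    89) echo "11.8" ;;
    *)  echo "unknown"; return 1 ;;
  esac
}

# nvcc_release: read `nvcc --version` output on stdin and print the
# release number, e.g. "11.3".
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

TARGET_CC=89   # hypothetical: set this to your GPU's Compute Capability
echo "Compute Capability ${TARGET_CC} needs CUDA >= $(min_cuda_for_cc "${TARGET_CC}")"
if command -v nvcc >/dev/null 2>&1; then
  echo "Installed CUDA toolkit: $(nvcc --version | nvcc_release)"
fi
```

Comparing the two printed versions by eye is usually enough; extend the `case` statement if your GPU is not in the table.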

Note

You must also choose a CUDA version that your installed NVIDIA driver supports.

nvidia-driver Version	Maximum CUDA Version
545	CUDA 12.3
550	CUDA 12.4
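
To see which row of the driver table applies, the installed driver version can be read from `nvidia-smi`. A minimal sketch, assuming `nvidia-smi` is available; `max_cuda_for_driver` is a hypothetical helper that knows only the two driver branches listed above.

```bash
#!/usr/bin/env bash
# max_cuda_for_driver VERSION: maximum CUDA version for a driver version
# string such as "550.54.14", using only the two rows of the table above.
max_cuda_for_driver() {
  local major="${1%%.*}"   # keep the part before the first dot
  case "${major}" in
    545) echo "12.3" ;;
    550) echo "12.4" ;;
    *)   echo "unknown"; return 1 ;;
  esac
}

if command -v nvidia-smi >/dev/null 2>&1; then
  driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  echo "Driver ${driver}: maximum CUDA $(max_cuda_for_driver "${driver}")"
fi
```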

Add the following environment variables to ~/.bashrc:

# Example configuration:

# >>> Deep Learning Configuration >>>
# Import CUDA environment
CUDA_PATH=/usr/local/cuda/bin
CUDA_LIB_PATH=/usr/local/cuda/lib64

# Import TensorRT environment
export TENSORRT_DIR=${HOME}/Application/TensorRT-8.4.1.5/
TENSORRT_PATH=${TENSORRT_DIR}/bin
TENSORRT_LIB_PATH=${TENSORRT_DIR}/lib

# Import libtorch environment
export Torch_DIR=${HOME}/Application/libtorch/share/cmake/Torch

export PATH=${PATH}:${CUDA_PATH}:${TENSORRT_PATH}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${CUDA_LIB_PATH}:${TENSORRT_LIB_PATH}
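
After reloading the shell (`source ~/.bashrc`), it can help to confirm that the configured directories actually exist. The snippet below is a sketch under the assumption that the example paths above are used; `check_dir` is a hypothetical helper, not something shipped with this repository.

```bash
#!/usr/bin/env bash
# check_dir NAME PATH: report whether PATH exists as a directory.
check_dir() {
  if [ -d "$2" ]; then
    echo "ok: $1 -> $2"
  else
    echo "missing: $1 -> $2"
    return 1
  fi
}

# "|| true" keeps the script going so that every entry is reported.
check_dir "CUDA libraries" "/usr/local/cuda/lib64" || true
check_dir "TENSORRT_DIR" "${TENSORRT_DIR:-<unset>}" || true
check_dir "Torch_DIR" "${Torch_DIR:-<unset>}" || true
```

A `missing` line for TENSORRT_DIR or Torch_DIR usually means the variable was not exported or points to a different install location than the example.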

Step 3 (optional, only if ROS components are needed): Install ROS1 (Noetic) or ROS2 (Humble).

# Install ROS
$ ...
# Install extra dependency
$ sudo apt install ros-${ROS_DISTRO}-pcl-ros

Step 4: Install apt-related and Python packages

$ sudo apt install build-essential python3-dev python3-pip apt-utils git cmake libboost-all-dev libyaml-cpp-dev libopencv-dev python3-empy libfmt-dev
$ pip install catkin_tools trollius numpy

Step 5: Clone the Repository

$ git clone https://github.com/Natsu-Akatsuki/RangeNet-TensorRT ~/rangenet/src/rangenet/

Step 6: Import model files and datasets.

# Download model files
$ wget -c https://github.com/Natsu-Akatsuki/RangeNet-TensorRT/releases/download/v0.0.0-alpha/model.onnx -O ~/rangenet/src/rangenet/model/model.onnx

Download datasets: see Baidu Cloud.

Directory Structure

.
├── model
│   ├── arch_cfg.yaml
│   ├── data_cfg.yaml
│   └── model.onnx
└── data
    ├── 000000.pcd
    ├── kitti_2011_09_30_drive_0027_synced
    └── kitti_2011_09_30_drive_0027_synced.bag
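
The model part of the layout above can be verified with a short check. This is a sketch, assuming the repository was cloned to `~/rangenet/src/rangenet` as in Step 5; `missing_model_files` is a hypothetical helper. It treats an empty file as missing, which also catches an interrupted model download.

```bash
#!/usr/bin/env bash
# missing_model_files ROOT: print each expected model file that is absent
# or empty under ROOT. Prints nothing when all three files are present.
missing_model_files() {
  local root="$1" f
  for f in model/arch_cfg.yaml model/data_cfg.yaml model/model.onnx; do
    [ -s "${root}/${f}" ] || echo "${f}"
  done
}
```

For example, `missing_model_files ~/rangenet/src/rangenet` lists any file that still needs to be downloaded or copied.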

3. Usage

Note

The first run may take some time to generate the TensorRT optimized engine.

Note

Because we use set(CMAKE_CUDA_STANDARD 17), which is supported starting with CMake 3.18, the build requires CMake 3.18 or later. The default CMake version in Ubuntu 20.04 is 3.16.3, so we provide a low-effort workaround to install a newer CMake:

$ pip3 install --user cmake==3.18
$ echo 'export PATH=${HOME}/.local/bin:${PATH}' >> ~/.bashrc
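
To confirm which CMake the shell now picks up, the version can be compared against 3.18. A minimal sketch, assuming GNU `sort` (for `-V`) and that `cmake --version` prints `cmake version X.Y.Z` on its first line; `version_ge` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# version_ge HAVE NEED: succeeds if version HAVE >= NEED.
# Relies on GNU sort -V for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED_CMAKE="3.18"

if command -v cmake >/dev/null 2>&1; then
  have="$(cmake --version | head -n 1 | awk '{print $3}')"
  if version_ge "${have}" "${REQUIRED_CMAKE}"; then
    echo "CMake ${have} is new enough"
  else
    echo "CMake ${have} is older than ${REQUIRED_CMAKE}"
  fi
fi
```

If the old version is still reported after the pip install, open a new terminal or run `source ~/.bashrc` so that `${HOME}/.local/bin` comes first in PATH.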

🔧 Usage 1: Run data in ROS1 or ROS2

# >>> ROS1 >>>
$ cd ~/rangenet/
# Use -Wno-dev to suppress PCL warnings
$ catkin build --cmake-args -Wno-dev
$ source devel/setup.bash
$ roslaunch rangenet_pp ros1_rangenet.launch
$ roslaunch rangenet_pp ros1_bag.launch

# >>> ROS2 >>>
$ cd ~/rangenet/
$ colcon build --symlink-install
$ source install/setup.bash
$ ros2 launch rangenet_pp ros2_rangenet.launch
$ ros2 launch rangenet_pp ros2_bag.launch

🔧 Usage 2: Predict single-frame point clouds (PCD format)

Note

PCD point cloud fields must be xyzi, and the intensity field must be normalized to the range 0-1.
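
The field and intensity requirements can be checked before running the demo. The sketch below makes two assumptions that the README does not confirm: that the fields are named `x y z intensity` in the PCD header, and that the file uses `DATA ascii` (binary PCD files need a different tool). Both helper names are hypothetical.

```bash
#!/usr/bin/env bash
# pcd_fields FILE: print the field names declared in the PCD header.
pcd_fields() {
  grep -a -m 1 '^FIELDS' "$1" | cut -d ' ' -f 2-
}

# pcd_intensity_in_unit_range FILE: for ASCII PCD files only, succeed if
# every value in the 4th column of the data section lies in [0, 1].
pcd_intensity_in_unit_range() {
  awk '
    in_data && ($4 < 0 || $4 > 1) { bad = 1 }   # data rows only
    $1 == "DATA" { in_data = 1 }                # data starts on the next line
    END { exit bad }
  ' "$1"
}
```

A file whose intensities are stored as 0-255 would fail the second check and should be rescaled before inference.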

# Modify the parameters in config/infer.yaml
$ cd ~/rangenet/src/rangenet/
$ mkdir build
$ cd build

# To display inference time: cmake -DPERFORMANCE_LOG=ON .. && make
$ unset ROS_VERSION && cmake -Wno-dev .. && make -j4
$ ./demo

Step	Time
Preprocessing	1.51363 ms
Inference	21.8513 ms
Postprocessing	4.98176 ms

4. FAQ

❓ Issue 1: [libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1:

The ONNX model file is incomplete. Please re-download the model.

❓ Issue 2: Segmentation fault [Process finished with exit code 139 (interrupted by signal 11:SIGSEGV)] when visualizing single point cloud frames in Ubuntu 22.04 using PCL.

Use PCL version 1.13.0 or later, and set the PCL_DIR variable in cmake/ThirdParty.cmake accordingly. See more in Here.

Roadmap

Test ROS1 demo
Resolve issue#8 (2023.07.01)
Add English documentation (2024.11.19)
Explain why using FP16 leads to precision degradation [See more in Here] (2024.11.28)
Provide a Docker environment (2024.11.30)
Add Pybind11 implementation
Resolve non-reproducibility
Refactor code to follow coding standards and improve readability
