Add Ghostnet && Fix object destruction order in APIToModel function to avoid undefined behavior #1581

Merged
merged 8 commits into from
Oct 9, 2024
Changes from 7 commits
82 changes: 82 additions & 0 deletions ghostnet/README.md
@@ -0,0 +1,82 @@
# GhostNet

The GhostNetv1 architecture is from the paper "GhostNet: More Features from Cheap Operations" [(https://arxiv.org/abs/1911.11907)](https://arxiv.org/abs/1911.11907).
The GhostNetv2 architecture is from the paper "GhostNetV2: Enhance Cheap Operation with Long-Range Attention" [(https://arxiv.org/abs/2211.12905)](https://arxiv.org/abs/2211.12905).

For the PyTorch implementations, you can refer to [huawei-noah/ghostnet](https://github.com/huawei-noah/ghostnet).

Both versions use the following techniques in their TensorRT implementations:

- **BatchNorm** layers are implemented with TensorRT's **Scale** layer.
- **Ghost Modules** generate more features from cheap operations, as described in the paper.
- **Global Average Pooling** is implemented with `IReduceLayer` instead of `IPoolingLayer`. `IReduceLayer` performs reduction operations (such as sum, average, max) over specified dimensions without being constrained by the kernel size limitations of pooling layers.
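
As a rough numerical illustration of the first and third points, the sketch below folds BatchNorm statistics into a per-channel scale and shift (the form a Scale layer computes, `y = x * scale + shift`) and expresses global average pooling as a mean over the spatial dimensions. It is plain Python rather than TensorRT code; the function names and the default `eps=1e-5` are assumptions for illustration and are not taken from `ghostnetv1.cpp` or `ghostnetv2.cpp`.

```python
import math


def fold_batchnorm(gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm parameters into (scale, shift) lists.

    BatchNorm computes gamma * (x - mean) / sqrt(var + eps) + beta,
    which equals x * scale + shift with the values returned here.
    eps=1e-5 is an assumed default, not a value read from the PR.
    """
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    shift = [b - m * s for b, m, s in zip(beta, mean, scale)]
    return scale, shift


def global_average_pool(feature_map):
    """Reduce a C x H x W nested list to one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


# Hypothetical two-channel example.
scale, shift = fold_batchnorm([1.0, 2.0], [0.0, 1.0], [2.0, 1.0], [4.0, 1.0], eps=0.0)
pooled = global_average_pool([[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [2.0, 2.0]]])
```

Because the reduction averages over whole dimensions, it does not need a kernel size that matches the feature map, which is the motivation given above for preferring `IReduceLayer`.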

## Project Structure

```plaintext
ghostnet
├── ghostnetv1
│ ├── CMakeLists.txt
│ ├── gen_wts.py
│ ├── ghostnetv1.cpp
│ └── logging.h
├── ghostnetv2
│ ├── CMakeLists.txt
│ ├── gen_wts.py
│ ├── ghostnetv2.cpp
│ └── logging.h
└── README.md
```

## Steps to use GhostNet in TensorRT

### 1. Generate `.wts` files for both GhostNetv1 and GhostNetv2

```bash
# For ghostnetv1
python ghostnetv1/gen_wts.py

# For ghostnetv2
python ghostnetv2/gen_wts.py
```
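
The `gen_wts.py` scripts are not shown in this section. As a minimal sketch, assuming they follow the `.wts` text layout used by other tensorrtx models (a first line with the tensor count, then one line per tensor holding its name, its element count, and each value as big-endian float32 hex), the format can be written and read back like this. The helper names `to_wts` and `from_wts` are hypothetical.

```python
import struct


def to_wts(tensors):
    """Serialize {name: flat list of floats} into .wts-style text.

    The layout is an assumption based on the usual tensorrtx convention.
    """
    lines = [str(len(tensors))]
    for name, values in tensors.items():
        hex_words = [struct.pack(">f", float(v)).hex() for v in values]
        lines.append(" ".join([name, str(len(values))] + hex_words))
    return "\n".join(lines) + "\n"


def from_wts(text):
    """Parse .wts-style text back into {name: list of floats}."""
    lines = text.strip().split("\n")
    count = int(lines[0])
    tensors = {}
    for line in lines[1:count + 1]:
        name, size, *words = line.split(" ")
        if int(size) != len(words):
            raise ValueError("element count mismatch for " + name)
        tensors[name] = [struct.unpack(">f", bytes.fromhex(w))[0] for w in words]
    return tensors


# Hypothetical tensor name and values.
text = to_wts({"conv_stem.weight": [1.0, 0.5, -2.0]})
restored = from_wts(text)
```

Values are stored as float32, so a round trip is exact only for numbers representable in 32 bits.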

### 2. Build the project

```bash
cd tensorrtx/ghostnet
mkdir build
cd build
cmake ..
make
```

### 3. Serialize the models to engine files

Use the following commands to build the networks from the `.wts` weights and serialize them into TensorRT engine files (`ghostnetv1.engine` and `ghostnetv2.engine`):

```bash
# For ghostnetv1
sudo ./ghostnetv1 -s

# For ghostnetv2
sudo ./ghostnetv2 -s
```

### 4. Run inference using the engine files

Once the engine files are generated, you can run inference with the following commands:

```bash
# For ghostnetv1
sudo ./ghostnetv1 -d

# For ghostnetv2
sudo ./ghostnetv2 -d
```

### 5. Verify output

Compare the output with the PyTorch implementation from [huawei-noah/ghostnet](https://github.com/huawei-noah/ghostnet) to ensure that the TensorRT results are consistent with the PyTorch model.
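
One possible way to make the comparison concrete, assuming both outputs are available as flat lists of class scores for the same input, is to check the largest element-wise difference and whether the top-1 class agrees. The function name and the `1e-3` tolerance are illustrative assumptions, not values defined by this project.

```python
def compare_outputs(trt_out, torch_out, atol=1e-3):
    """Compare two flat score lists of equal, non-zero length.

    atol is an assumed tolerance; choose one that suits your precision mode.
    """
    if not trt_out or len(trt_out) != len(torch_out):
        raise ValueError("outputs must be non-empty and of equal length")
    max_abs_diff = max(abs(a - b) for a, b in zip(trt_out, torch_out))
    top1_trt = max(range(len(trt_out)), key=trt_out.__getitem__)
    top1_torch = max(range(len(torch_out)), key=torch_out.__getitem__)
    return {
        "max_abs_diff": max_abs_diff,
        "same_top1": top1_trt == top1_torch,
        "within_tol": max_abs_diff <= atol,
    }


# Hypothetical scores copied from the two runs.
report = compare_outputs([0.1, 0.7, 0.2], [0.1, 0.7004, 0.2])
```

A matching top-1 class with a small maximum difference suggests the weights were loaded in the right order. A large difference usually points to a layer or weight-name mismatch rather than to numerical noise.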
24 changes: 24 additions & 0 deletions ghostnet/ghostnetv1/CMakeLists.txt
@@ -0,0 +1,24 @@
cmake_minimum_required(VERSION 2.6)

project(ghostnetv1)

add_definitions(-std=c++11)

option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Debug)

include_directories(${PROJECT_SOURCE_DIR}/include)
# include and link dirs of cuda and tensorrt; adapt them if yours are different
# cuda
include_directories(/usr/local/cuda/include)
link_directories(/usr/local/cuda/lib64)
# tensorrt
include_directories(/usr/include/x86_64-linux-gnu/)
link_directories(/usr/lib/x86_64-linux-gnu/)

add_executable(ghostnetv1 ${PROJECT_SOURCE_DIR}/ghostnetv1.cpp)
target_link_libraries(ghostnetv1 nvinfer)
target_link_libraries(ghostnetv1 cudart)

add_definitions(-O2 -pthread)