
ONNXRuntime backend (WIP) #341

Open. Wants to merge 2 commits into master.
Conversation

isty2e (Contributor) commented Nov 8, 2020

Introduction

This backend enables KataGo to run network files in the .onnx format, an open standard for exchanging neural networks, via ONNXRuntime. Currently, version 8 networks can be converted to .onnx by https://github.com/isty2e/KataGoONNX, and converted network files are available here.

The main motivation for running .onnx models is that new architectures or network details can be tested easily by exporting any trained model to .onnx, without implementing every detail in the CUDA/OpenCL backends. For the time being, there is no advantage to using this backend for normal users: it will be 10-20% slower than the CUDA/OpenCL versions for 19x19 boards, though it can be faster for smaller boards.

Execution providers

Currently, four execution providers are available for this backend:

  • CUDA
  • TensorRT - It is dreadfully slow for unknown reasons
  • DirectML
  • MIGraphX - I couldn't really test it, and there may be problems building it

For Windows systems, DirectML is generally considered the best choice. AMD cards can make use of MIGraphX, and the ROCm execution provider can be supported once it is fully integrated into ONNXRuntime.
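
To illustrate how provider selection works (this is just a sketch, not the actual backend code; header and function names can differ between ONNXRuntime versions), appending an execution provider with the ONNXRuntime C++ API looks roughly like this:

#include <onnxruntime_cxx_api.h>
#include <cuda_provider_factory.h>  // OrtSessionOptionsAppendExecutionProvider_CUDA

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "katago-onnx");
  Ort::SessionOptions opts;
  opts.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
  // Optionally cache the graph-optimized model to a file (wide-character path on Windows).
  opts.SetOptimizedModelFilePath("optimized.onnx");
  // Append the desired execution provider; with none appended, the CPU provider is used.
  OrtSessionOptionsAppendExecutionProvider_CUDA(opts, /*device_id=*/0);
  Ort::Session session(env, "model.onnx", opts);  // placeholder model path
  return 0;
}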

Building

GCC-9 or above is recommended for Linux systems.

  • First of all, you need to download an ONNXRuntime binary, or build ONNXRuntime yourself. Considering that there is no merit in using TensorRT at this point, you can just download the prebuilt binary.
  • Then build KataGo as usual, but with additional CMake flags:
    • ORT_LIB_DIR: ONNXRuntime library location
    • ORT_INCLUDE_DIR: ONNXRuntime header file location
    • ORT_CUDA, ORT_TENSORRT, ORT_DIRECTML, ORT_MIGRAPHX: Whether to support specific execution providers
    • TENSORRT_LIB_DIR, TENSORRT_INCLUDE_DIR: Library and header file locations for TensorRT
      For example, if you want to build KataGo with CUDA and DirectML support, your CMake configuration will look like this:
cmake -DUSE_BACKEND=ONNXRUNTIME -DORT_CUDA=1 -DORT_DIRECTML=1 -DORT_LIB_DIR=/foo/bar/lib -DORT_INCLUDE_DIR=/foo/bar/include
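
Similarly (the paths below are placeholders), a build that also enables the TensorRT execution provider would pass the TensorRT locations as well:

cmake -DUSE_BACKEND=ONNXRUNTIME -DORT_CUDA=1 -DORT_TENSORRT=1 -DORT_LIB_DIR=/foo/bar/lib -DORT_INCLUDE_DIR=/foo/bar/include -DTENSORRT_LIB_DIR=/path/to/TensorRT/lib -DTENSORRT_INCLUDE_DIR=/path/to/TensorRT/include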

Configuration

There are two options in the .cfg file for the ONNXRuntime backend: onnxOptModelFile and onnxRuntimeExecutionProvider. onnxOptModelFile is the path to a cached graph-optimized .onnx file, and onnxRuntimeExecutionProvider is one of the execution providers - CUDA, TensorRT, DirectML, or MIGraphX. These options can be set properly by running genconfig.
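
For illustration only (the paths here are placeholders, not defaults), the corresponding lines in a .cfg file would look something like this:

onnxOptModelFile = /path/to/optimized_model.onnx
onnxRuntimeExecutionProvider = CUDA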

For FP16 inference, you will need to use FP16 models instead of the normal FP32 models. There is no advantage to FP16 inference on non-RTX cards, but on RTX cards you will want to use the FP16 models.

TODO

  • Somehow verify that MIGraphX version compiles and runs
  • Maybe a cleaner CMakeLists.txt
  • Figure out why TensorRT execution provider is so slow
  • Improve code quality in general
  • In some cases, visits/s may not be measured properly: NPS was constantly increasing, so perhaps initialization time should be compensated for
  • There may be some duplicated or unnecessary operations, so reduce overhead where possible

isty2e mentioned this pull request on Nov 9, 2020
lp200 commented Dec 25, 2020

The TensorRT provider seems to reach normal speed after a few minutes of running (for optimization). I used a TensorRT-enabled onnxruntime build and deleted the SetOptimizedModelFilePath call.

I used the following libraries and options:

CUDA 11.0
cuDNN 8.0.5
TensorRT-7.2.1.6

ORT_TENSORRT_ENGINE_CACHE_ENABLE=1
ORT_TENSORRT_FP16_ENABLE=1
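
For reference, a rough sketch of how this might be invoked (file names are placeholders; the two variables are read by ONNXRuntime's TensorRT execution provider):

export ORT_TENSORRT_ENGINE_CACHE_ENABLE=1
export ORT_TENSORRT_FP16_ENABLE=1
./katago benchmark -model model_fp16.onnx -config default_gtp.cfg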

ailuoku6 commented (replying to isty2e)

Hi, the onnx network file has been deleted; could you upload it again? I'm looking forward to running KataGo in a web page using webonnxruntime. Thank you very much!
