Introduction
This backend enables KataGo to run network files in the .onnx format, an open standard for exchanging neural networks, using ONNXRuntime. Currently, version 8 networks can be converted to .onnx with https://github.com/isty2e/KataGoONNX, and converted network files are available here.
The main motivation for running .onnx models is that new architectures or network details can be tested easily by exporting any trained model to .onnx, without implementing every detail in the CUDA/OpenCL backends. For the time being, there is no advantage to using this backend for normal users: it is 10-20% slower than the CUDA/OpenCL versions on 19x19 boards, though it can be faster on smaller boards.
Execution providers
Currently, four execution providers are available for this backend: CUDA, TensorRT, DirectML, and MIGraphX.
For Windows systems, DirectML is generally the best choice. AMD cards can make use of MIGraphX, and the ROCm execution provider can be supported once it is fully integrated into ONNXRuntime.
Building
GCC-9 or above is recommended for Linux systems.
The relevant CMake options are:

- ORT_LIB_DIR: ONNXRuntime library location
- ORT_INCLUDE_DIR: ONNXRuntime header file location
- ORT_CUDA, ORT_TENSORRT, ORT_DIRECTML, ORT_MIGRAPHX: whether to support specific execution providers
- TENSORRT_LIB_DIR, TENSORRT_INCLUDE_DIR: library and header file locations for TensorRT

For example, if you want to build with CUDA and DirectML support, your CMake configuration will look like this:
Configuration
There are two options in the .cfg file for the ONNXRuntime backend: onnxOptModelFile and onnxRuntimeExecutionProvider. onnxOptModelFile is the path for a cached graph-optimized .onnx file. onnxRuntimeExecutionProvider is one of the execution providers: CUDA, TensorRT, DirectML, or MIGraphX. These options can be set properly by running genconfig.

For FP16 inference, you will need to use FP16 models instead of the normal FP32 models. FP16 inference offers no advantage on non-RTX cards, but you will want to use FP16 models on RTX cards.
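For illustration, the relevant lines of a .cfg file might look like the sketch below; the model path is a placeholder, and in practice these values are filled in by genconfig:

```
# Execution provider: one of CUDA, TensorRT, DirectML, or MIGraphX
onnxRuntimeExecutionProvider = CUDA
# Path for the cached graph-optimized .onnx file
onnxOptModelFile = /path/to/model.opt.onnx
```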
TODO
- CMakeLists.txt