From 890ebea2d0fb79e8f7ad818d68c7dadc024f52ae Mon Sep 17 00:00:00 2001
From: Sunny Anand <164108690+Sunny-Anand@users.noreply.github.com>
Date: Wed, 13 Nov 2024 09:59:39 -0600
Subject: [PATCH] add limitation for BFLOAT supported ops for NNPA (#3008)

Signed-off-by: Sunny Anand
---
 docs/SupportedONNXOps-NNPA.md                 | 2 +-
 test/accelerators/NNPA/backend/CMakeLists.txt | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/SupportedONNXOps-NNPA.md b/docs/SupportedONNXOps-NNPA.md
index b8c1536457..80fa3287cf 100644
--- a/docs/SupportedONNXOps-NNPA.md
+++ b/docs/SupportedONNXOps-NNPA.md
@@ -10,7 +10,7 @@ Onnx-mlir currently supports ONNX operations targeting up to opset 21. Limitatio
 
 * A * indicates onnx-mlir is compatible with the latest version of that operator available as of opset 21.
 
-NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA.
+NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA. NNPA currently supports only DLFLOAT16 as its data type. Common data formats such as FP32, FP16, and BFLOAT16 must be converted to the NNPA internal format, DLFLOAT16. Hence, ONNX ops whose tensors use BFLOAT16 will not be natively supported on NNPA.
 
 
 | Op |Supported Opsets (inclusive) |Limitations |Notes |
diff --git a/test/accelerators/NNPA/backend/CMakeLists.txt b/test/accelerators/NNPA/backend/CMakeLists.txt
index 175f47d4a9..c2c946615c 100644
--- a/test/accelerators/NNPA/backend/CMakeLists.txt
+++ b/test/accelerators/NNPA/backend/CMakeLists.txt
@@ -104,7 +104,7 @@ endif()
 
 set(NNPA_TEST_LIST
     # ==ARCH== NNPA
-    # ==ADDITIONAL_PARAGRAPH== NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA.
+    # ==ADDITIONAL_PARAGRAPH== NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA. NNPA currently supports only DLFLOAT16 as its data type. Common data formats such as FP32, FP16, and BFLOAT16 must be converted to the NNPA internal format, DLFLOAT16. Hence, ONNX ops whose tensors use BFLOAT16 will not be natively supported on NNPA.
 
     # ==OP== Add
     # ==MIN== 6
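
The limitation documented above means a model carrying BFLOAT16 tensors falls back to CPU rather than running on NNPA. A minimal sketch of how one might check a model for such tensors before compiling it for NNPA, assuming only the standard `onnx` Python package ("model.onnx" is a hypothetical path, not part of this patch):

```python
# Minimal sketch: scan an ONNX model for BFLOAT16 tensors, which NNPA
# cannot consume natively (all data must convert to DLFLOAT16).
# Assumes the standard `onnx` Python package; "model.onnx" is a
# hypothetical path used for illustration.
import onnx
from onnx import TensorProto

model = onnx.load("model.onnx")
bf16_names = []

# Initializers (weights) carry an explicit element type.
for init in model.graph.initializer:
    if init.data_type == TensorProto.BFLOAT16:
        bf16_names.append(init.name)

# Graph inputs and outputs declare their element type on the tensor type.
for vi in list(model.graph.input) + list(model.graph.output):
    if vi.type.tensor_type.elem_type == TensorProto.BFLOAT16:
        bf16_names.append(vi.name)

if bf16_names:
    print("BFLOAT16 tensors (no native NNPA support):", bf16_names)
else:
    print("No BFLOAT16 tensors found.")
```

A model flagged this way would need its BFLOAT16 tensors cast to a supported format such as FP32 or FP16 before the NNPA path can be used; otherwise onnx-mlir executes the affected ops on CPU as described above.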