Skip to content

Commit

Permalink
add limitation for BFLOAT supported ops for NNPA (#3008)
Browse files Browse the repository at this point in the history
Signed-off-by: Sunny Anand <[email protected]>
  • Loading branch information
Sunny-Anand authored Nov 13, 2024
1 parent fa91033 commit 890ebea
Show file tree
Hide file tree
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion docs/SupportedONNXOps-NNPA.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ Onnx-mlir currently supports ONNX operations targeting up to opset 21. Limitatio
* A * indicates onnx-mlir is compatible with the latest version of that operator available as of opset 21.


NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA.
NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA. NNPA currently only support DLFLOAT16 as its data type. Common data formats like FP32, FP16, BFLOAT need to undergo data conversions to the NNPA internal format DLFLOAT16. Hence ONNX ops which updated their tensors to BFLOAT16 will not be natively supported on NNPA.


| Op |Supported Opsets (inclusive) |Limitations |Notes |
Expand Down
2 changes: 1 addition & 1 deletion test/accelerators/NNPA/backend/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ endif()
set(NNPA_TEST_LIST

# ==ARCH== NNPA
# ==ADDITIONAL_PARAGRAPH== NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA.
# ==ADDITIONAL_PARAGRAPH== NNPA has hardware limitations in dimension index size and tensor size, which are described in [NNPALimit.hpp](../src/Accelerators/NNPA/Support/NNPALimit.hpp). They are large enough for normal use cases, but if your model exceeds the limitations, CPU is used instead of NNPA. NNPA currently only support DLFLOAT16 as its data type. Common data formats like FP32, FP16, BFLOAT need to undergo data conversions to the NNPA internal format DLFLOAT16. Hence ONNX ops which updated their tensors to BFLOAT16 will not be natively supported on NNPA.

# ==OP== Add
# ==MIN== 6
Expand Down

0 comments on commit 890ebea

Please sign in to comment.