Check for continuous decoding in MHA #23766
+5 −0 · Open
Azure Pipelines / Linux GPU CI Pipeline
failed
Feb 20, 2025 in 1h 7m 32s
Build #20250220.18 had test failures
- Failed: 1 (0.01%)
- Passed: 11,435 (97.49%)
- Other: 294 (2.51%)
- Total: 11,730
Annotations
Check failure on line 1283253 in Build log
azure-pipelines / Linux GPU CI Pipeline
Build log #L1283253
Bash exited with code '1'.
Check failure on line 1 in SelfAttention_WithPast_WithAttnBias_ForT5
azure-pipelines / Linux GPU CI Pipeline
SelfAttention_WithPast_WithAttnBias_ForT5
/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:337
Expected equality of these values:
expect_result
Which is: 4-byte object <00-00 00-00>
ExpectResult::kExpectFailure
Which is: 4-byte object <01-00 00-00>
Run failed but expected success: Non-zero status code returned while running MultiHeadAttention node. Name:'node1' Status Message: Input 'query' is expected to have sequence_length == 1 when past_sequence_length > 1
Google Test trace:
/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:833: registered execution providers: CUDAExecutionProvider
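The status message points at the constraint this PR's new check enforces on the CUDA MultiHeadAttention node: once a past key/value cache longer than one token is present (continuous decoding), the incoming query is expected to carry exactly one new token. The sketch below is illustrative only, not ONNX Runtime source; the function name, signature, and call sites are hypothetical and just restate the shape rule from the error message.

```cpp
#include <cstdint>
#include <iostream>
#include <stdexcept>

// Hypothetical helper mirroring the reported check: during continuous decoding
// (past_sequence_length > 1), the query must contain a single new token.
void CheckContinuousDecoding(int64_t query_sequence_length, int64_t past_sequence_length) {
  if (past_sequence_length > 1 && query_sequence_length != 1) {
    throw std::runtime_error(
        "Input 'query' is expected to have sequence_length == 1 when past_sequence_length > 1");
  }
}

int main() {
  // Single-token decode step against an existing cache: passes the check.
  CheckContinuousDecoding(/*query_sequence_length=*/1, /*past_sequence_length=*/8);

  // Multi-token query together with a populated cache: rejected, as in the failing test.
  try {
    CheckContinuousDecoding(/*query_sequence_length=*/3, /*past_sequence_length=*/8);
  } catch (const std::runtime_error& e) {
    std::cerr << e.what() << std::endl;
  }
  return 0;
}
```

Judging from the annotation, SelfAttention_WithPast_WithAttnBias_ForT5 apparently feeds a multi-token query alongside a past sequence longer than one, so the new check rejects the run and base_tester.cc reports "Run failed but expected success".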