Check for continuous decoding in MHA #23766

Open
wants to merge 1 commit into
base: main
from

check for sequence length

edeeb87

Azure Pipelines / Linux GPU CI Pipeline failed Feb 20, 2025 in 1h 7m 32s

Build #20250220.18 had test failures

Details

Tests

  • Failed: 1 (0.01%)
  • Passed: 11,435 (97.49%)
  • Other: 294 (2.51%)
  • Total: 11,730

Annotations

Check failure on line 1283253 in Build log

@azure-pipelines azure-pipelines / Linux GPU CI Pipeline

Build log #L1283253

Bash exited with code '1'.

Check failure on line 1 in SelfAttention_WithPast_WithAttnBias_ForT5

@azure-pipelines azure-pipelines / Linux GPU CI Pipeline

SelfAttention_WithPast_WithAttnBias_ForT5

/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:337
Expected equality of these values:
  expect_result
    Which is: 4-byte object <00-00 00-00>
  ExpectResult::kExpectFailure
    Which is: 4-byte object <01-00 00-00>
Run failed but expected success: Non-zero status code returned while running MultiHeadAttention node. Name:'node1' Status Message: Input 'query' is expected to have sequence_length == 1 when past_sequence_length > 1
Google Test trace:
/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:833: registered execution providers: CUDAExecutionProvider
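The status message above implies a shape check along the following lines. This is only a minimal sketch reconstructed from the quoted message; the PR diff is not shown here, and the helper name `CheckQuerySequenceLength` and its signature are hypothetical, not ONNX Runtime API.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper, NOT the ONNX Runtime implementation.
// Returns an error message when the shapes violate the rule quoted in the
// failure log, or std::nullopt when the shapes are accepted.
std::optional<std::string> CheckQuerySequenceLength(int64_t sequence_length,
                                                    int64_t past_sequence_length) {
  // Rule taken from the status message: once the past holds more than one
  // token, only single-token decoding (sequence_length == 1) is accepted.
  if (past_sequence_length > 1 && sequence_length != 1) {
    return std::string(
        "Input 'query' is expected to have sequence_length == 1 "
        "when past_sequence_length > 1");
  }
  return std::nullopt;
}
```

Under this sketch, `CheckQuerySequenceLength(2, 4)` yields the error while `CheckQuerySequenceLength(1, 4)` passes. The failing test, SelfAttention_WithPast_WithAttnBias_ForT5, presumably supplies a past longer than one token together with a query longer than one token, which would explain why a strict check of this shape turns an expected success into a failure.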