
Investigate crash due to empty size #23753

Open

wants to merge 1 commit into base: main

Commit 981a714: Investigate crash due to empty size
Azure Pipelines / Linux GPU CI Pipeline failed Feb 19, 2025 in 55m 17s

Build #20250219.6 had test failures

Details

Tests

  • Failed: 66 (0.56%)
  • Passed: 11,359 (96.93%)
  • Other: 294 (2.51%)
  • Total: 11,719

Annotations

Check failure on line 36799 in Build log


@azure-pipelines azure-pipelines / Linux GPU CI Pipeline

Build log #L36799

Bash exited with code '1'.

Check failure on line 1 in CheckRunProfilerWithSessionOptions


CheckRunProfilerWithSessionOptions

/onnxruntime_src/onnxruntime/test/framework/inference_session_test.cc:679
Value of: has_kernel_info
  Actual: false
Expected: true

Check failure on line 1 in GptBeamSearchWithInitDecoderFp16


GptBeamSearchWithInitDecoderFp16

unknown file
C++ exception with description "Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_gpt2' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:129 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; SUCCTYPE = cudnnStatus_t; std::conditional_t<THRW, void, common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; SUCCTYPE = cudnnStatus_t; std::conditional_t<THRW, void, common::Status> = void] CUDNN failure 4000: CUDNN_STATUS_INTERNAL_ERROR ; GPU=0 ; hostname=c6a9e2c5d00a ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_stream_handle.cc ; line=76 ; expr=cudnnCreate(&cudnn_handle_); 

" thrown in the test body.

Check failure on line 1 in DummyT5


DummyT5

/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:337
Expected equality of these values:
  expect_result
    Which is: 4-byte object <00-00 00-00>
  ExpectResult::kExpectFailure
    Which is: 4-byte object <01-00 00-00>
Run failed but expected success: Non-zero status code returned while running BeamSearch node. Name:'' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:129 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; SUCCTYPE = cudnnStatus_t; std::conditional_t<THRW, void, common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; SUCCTYPE = cudnnStatus_t; std::conditional_t<THRW, void, common::Status> = void] CUDNN failure 4000: CUDNN_STATUS_INTERNAL_ERROR ; GPU=0 ; hostname=c6a9e2c5d00a ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_stream_handle.cc ; line=76 ; expr=cudnnCreate(&cudnn_handle_); 


Google Test trace:
/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:833: registered execution providers: CUDAExecutionProvider

Check failure on line 1 in DummyT5WithOuterScopeInitializers


DummyT5WithOuterScopeInitializers

/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:337
Expected equality of these values:
  expect_result
    Which is: 4-byte object <00-00 00-00>
  ExpectResult::kExpectFailure
    Which is: 4-byte object <01-00 00-00>
Run failed but expected success: Non-zero status code returned while running BeamSearch node. Name:'' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:129 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; SUCCTYPE = cudnnStatus_t; std::conditional_t<THRW, void, common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; SUCCTYPE = cudnnStatus_t; std::conditional_t<THRW, void, common::Status> = void] CUDNN failure 4000: CUDNN_STATUS_INTERNAL_ERROR ; GPU=0 ; hostname=c6a9e2c5d00a ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_stream_handle.cc ; line=76 ; expr=cudnnCreate(&cudnn_handle_); 


Google Test trace:
/onnxruntime_src/onnxruntime/test/providers/base_tester.cc:833: registered execution providers: CUDAExecutionProvider