Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding attempt count to error message #5834

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

RanVaknin
Copy link

Motivation and Context

Improve debugging visibility by making attempt count immediately visible in exception messages. Currently, attempt information is only available through suppressed exceptions, requiring additional code to access. This change makes attempts visible directly in the exception message, matching behavior in other AWS SDKs.

Modifications

  • Added attempt count tracking and message formatting in SdkException
  • Added SdkDiagnostics class to handle the formatting of the Diagnostics clause in error messages.
  • Modified AwsServiceException to include attempts in AWS error details
  • Modified RetryableStageHelper to track and propagate attempt counts during retries

Testing

  • Added new functional test ExceptionAttemptMessageBehaviorTest testing with different retry scenarios
  • Modified existing tests in AwsServiceExceptionTest and SdkExceptionSerializationTest to verify attempt count behavior
  • Added test cases in SdkExceptionMessageTest for new attempt count functionality

Visualization of changes:

Before change:

// non-retryable scenario
try {
    // request to a non-existent bucket
    s3Client.listObjects(g -> g.bucket("some-non-existent-bucket"));
} catch (AwsServiceException e) {
    System.out.println(e);
}
// outputs:
// NoSuchBucketException: The specified bucket does not exist (Service: S3, Status Code: 404, Request ID: abc123, Extended Request ID: xyzdef123456)


// retryable scenario
try {
    // request that might cause a retryable error (e.g., throttling)
    dynamoDbClient.getItem(...);
} catch (AwsServiceException e) {
    System.out.println(e);
}
// outputs:
// ThrottlingException: Rate of requests exceeds the allowed throughput (Service: DynamoDB, Status Code: 429, Request ID: abc123, Extended Request ID: xyzdef123456)
// Note: attempts only visible through e.getSuppressed()

After change:

// non-retryable scenario
try {
    // request to a non-existent bucket
    s3Client.listObjects(g -> g.bucket("some-non-existent-bucket"));
} catch (AwsServiceException e) {
    System.out.println(e);
}
// outputs:
// NoSuchBucketException: The specified bucket does not exist (Service: S3, Status Code: 404, Request ID: abc123, Extended Request ID: xyzdef123456) (SDK Diagnostics: numAttempts = 1)


// retryable scenario
try {
    // request that might cause a retryable error (e.g., throttling)
    dynamoDbClient.getItem(...);
} catch (AwsServiceException e) {
    System.out.println(e);
}
// outputs:
// ThrottlingException: Rate of requests exceeds the allowed throughput (Service: DynamoDB, Status Code: 429, Request ID: abc123, Extended Request ID: xyzdef123456) (SDK Diagnostics: numAttempts = 4)
// Note: attempts now visible directly in the exception message

Potential Impact:

Test or error handling code that relies on exact exception message matching might break as messages now include attempt count. We recommend using partial message matching for more resilient tests and error handling.

// may break
assertEquals("Resource not found", e.getMessage());
if (e.getMessage().equals("Resource not found")) { ... }

// more resilient approach
assertTrue(e.getMessage().contains("Resource not found"));
try {
    client.someOperation();
} catch (AwsServiceException e) {
	// may break
    if (e.getMessage().equals("Resource not found")) {
        handleNotFound();
    }
	// more resilient
    if (e.getMessage().contains("Resource not found")) {
        handleNotFound();
    }
}

@RanVaknin RanVaknin requested a review from a team as a code owner January 28, 2025 21:07
Copy link

sonarqubecloud bot commented Feb 3, 2025

Quality Gate Failed Quality Gate failed

Failed conditions
71.2% Coverage on New Code (required ≥ 80%)
12.0% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

assertThat(e.numAttempts()).isEqualTo(6);
}

@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.requestId("requestId")
.build();

assertThat(e.getMessage()).contains("(SDK Diagnostics: numAttempts = 6)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we verify the entire message so that we know there are no duplicated entries? I know it's not a good practice in general, but we are the owner of the client and we should make sure we emit the right logs :)

AwsServiceException exception = assertThrows(AwsServiceException.class,
() -> callAllTypes(client));

assertThat(exception.getMessage()).contains("numAttempts = 2");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, can we verify the entire message?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants