-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: reader failover wait for complete batch #390
base: main
Are you sure you want to change the base?
Conversation
21f2ead
to
ef40f06
Compare
ef40f06
to
1d4de58
Compare
let errors: string = ""; | ||
for (const e of error.errors) { | ||
// Propagate errors that are not caused by network errors. | ||
if (!this.pluginService.isNetworkError(error)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the goal of this check here? Are we just checking to see if the aggregate error contains the AwsWrapperError we throw to indicate task aborted?
throw new AwsWrapperError("Selected task has been chosen. Abort client for host: " + this.newHost.host);
In this case would it be more clear if we just check for error instanceof AwsWrapperError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also be more specific and create a new ReaderTaskAbortedError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this check we want to ignore any network errors so we can continue trying connections to other instances (for example, connect ECONNREFUSED
). If there are non-network errors, they will be propagated back to the user through the ReaderFailoverResult. This catch block should only be hit when both connection attempts in the batch failed, but the selected task error is only thrown when there has been a successful connection.
Messages.get( | ||
"ClusterAwareReaderFailoverHandler.batchFailed", | ||
`[${hosts[i].hostId}${numTasks === 2 ? `, ${hosts[i + 1].hostId}` : ``}]`, | ||
`[\n${errors}\n]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious if we need to build this custom message or if we could just use the aggregate error's message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we return a ReaderFailoverResult with an error here it will be sent back to the user, so I'd prefer for them to see an AwsWrapperError with some additional context rather than an AggregateError
04725a9
to
4688bc6
Compare
4688bc6
to
665a9ea
Compare
Summary
Fix connections being unclosed in reader failover handler
Description
Currently connection attempts are made in batches of two, and the first settled promise is returned. However, if the first connection task fails, the second is also abandoned. This PR changes the reader failover implementation to wait for the entire batch to complete before moving on.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.