fix: allow mqtt client to reconnect if it gets disconnected #977
MQTT client reconnections were disabled for no apparent reason. The SocketClient's error and disconnection events were also completely ignored, so a random disconnection before the expected final message was received would cause the CLI to hang, doing nothing until the failsafe timeout (currently 10m) hit.
There is an additional edge case which is not covered by this change. Even with reconnections enabled, if the final message is distributed while our client is not connected (i.e. while it is waiting to reconnect or actively reconnecting), we'll still miss it and hang the same way. To reduce the odds of hitting this edge case, the reconnect wait timeout was reduced from the default of 1000ms to 100ms.
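As a rough illustration of the change described above, here is a minimal sketch. It assumes an MQTT.js-style client, where `reconnectPeriod` is the wait between reconnect attempts in milliseconds (1000 by default, 0 to disable). The PR text does not name the library, and `ReconnectOptions`, `reconnectOptions`, and `observeConnection` are hypothetical names, not the CLI's actual code.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical options shape, modelled on MQTT.js-style client options.
export interface ReconnectOptions {
  // Milliseconds to wait between reconnect attempts; 0 would disable reconnects.
  reconnectPeriod: number;
}

// Assumption: 100ms mirrors the reduced wait described in this PR.
export const reconnectOptions: ReconnectOptions = { reconnectPeriod: 100 };

// Surface connection lifecycle events instead of silently ignoring them.
// The event names follow MQTT.js conventions and are an assumption here.
export function observeConnection(
  client: EventEmitter,
  log: (line: string) => void,
): void {
  for (const event of ["reconnect", "close", "offline"]) {
    client.on(event, () => log(`mqtt client event: ${event}`));
  }
  client.on("error", (err: Error) => log(`mqtt client error: ${err.message}`));
}
```

Logging these events does not close the missed-message window, but it makes a hang easier to diagnose.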
To invoke the reconnection behavior (and to try to figure out whether there was any reason why reconnections were originally disabled), I wrote a simple HTTP proxy that acts like a normal HTTP proxy but disconnects all connections after a configurable duration. Then I ran the CLI with the `https_proxy` environment variable set and confirmed that reconnections were successful when the proxy closed connections.

I hereby confirm that I followed the code guidelines found at engineering guidelines
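For reference, a disconnecting proxy like the one used in the testing described above can be sketched with Node's built-in `http` and `net` modules. This is not the proxy actually used for this PR. It is a simplified sketch that only handles `CONNECT` tunnels (what a client typically issues when `https_proxy` is set), and `startDisconnectingProxy` and `maxConnectionMs` are hypothetical names.

```typescript
import * as http from "node:http";
import * as net from "node:net";

// Start a CONNECT-only proxy that force-closes every tunnel after maxConnectionMs.
export function startDisconnectingProxy(maxConnectionMs: number, port = 0): http.Server {
  const server = http.createServer((_req, res) => {
    // Plain HTTP forwarding is left out of this sketch.
    res.writeHead(501);
    res.end("only CONNECT is supported in this sketch");
  });

  server.on("connect", (req, clientSocket, head) => {
    // CONNECT targets look like "host:port" (IPv6 literals are not handled here).
    const [host, portText] = (req.url ?? "").split(":");
    const upstream = net.connect(Number(portText) || 443, host, () => {
      clientSocket.write("HTTP/1.1 200 Connection Established\r\n\r\n");
      if (head.length > 0) upstream.write(head);
      upstream.pipe(clientSocket);
      clientSocket.pipe(upstream);
    });

    const teardown = (): void => {
      clearTimeout(timer);
      clientSocket.destroy();
      upstream.destroy();
    };
    // Simulate a random disconnection by dropping the tunnel after the deadline.
    const timer = setTimeout(teardown, maxConnectionMs);

    clientSocket.on("error", teardown);
    clientSocket.on("close", teardown);
    upstream.on("error", teardown);
    upstream.on("close", teardown);
  });

  server.listen(port);
  return server;
}
```

With something like this listening locally, the CLI could be run with `https_proxy` pointing at the proxy's address to trigger periodic disconnects.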