Retry Logic Overview (WIP)
This document is intended to be the authoritative reference on retrying requests over a network, which applies in several places during the replication process. It covers what should happen when either a transient or a permanent error occurs. A transient error is one that is expected to clear within a relatively short period of time (such as a connection timeout or a 503). A permanent error is the opposite (such as a 401 or 404) and is not likely to recover without intervention. This document does not cover other replication logic such as "going offline."
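
As an illustration of the transient/permanent distinction, here is a minimal sketch of a status-code classifier. The `ErrorKind` and `ErrorClassifier` names are hypothetical, and only 503 (transient) and 401/404 (permanent) come from this document; the other transient codes are assumptions made for the example.

```csharp
using System;
using System.Net;

// Hypothetical categories mirroring the terms used in this document.
public enum ErrorKind { Transient, Permanent }

public static class ErrorClassifier
{
    // Classifies an HTTP status code as transient or permanent.
    public static ErrorKind Classify(HttpStatusCode status)
    {
        switch (status)
        {
            case HttpStatusCode.RequestTimeout:      // 408 (assumed transient)
            case HttpStatusCode.BadGateway:          // 502 (assumed transient)
            case HttpStatusCode.ServiceUnavailable:  // 503 (named in this document)
            case HttpStatusCode.GatewayTimeout:      // 504 (assumed transient)
                return ErrorKind.Transient;
            default:
                // 401, 404, and anything unrecognized are treated as permanent
                // in this sketch, so unknown errors never retry forever.
                return ErrorKind.Permanent;
        }
    }
}
```

Defaulting unknown codes to permanent is a design choice of this sketch: it favors stopping with an error over retrying a request that may never succeed.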
The replication retry flow is as follows:
1. Replication attempts to start
2. Replication attempts to continue
3. A connection error occurs
   - 3a The connection error indicates lack of connectivity, go to 6
   - 3b The connection error is transient, go to 4
   - 3c The connection error is permanent, stop the replication
4. Retry according to the applied retry strategy (not customizable on all platforms)
   - 4a The retry strategy fails, go to 5
   - 4b The retry strategy succeeds, go to 2
5. At this point the request in question has failed to send and/or get a response
   - 5a The replication is continuous. Switch to idle, set last error, enter long delay (~60 sec) and go to 1
   - 5b The replication is non-continuous. Set last error, give up and stop the replication
6. The endpoint is not reachable
   - 6a The device has no network connection. Switch to offline, set last error, and wait for network connection change.
   - 6b The device has a network connection. Switch to offline, set last error, enter long delay (~60 sec) and go to 1
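
The decision points in steps 3 through 6 can be sketched as a single pure function. This is an illustrative sketch only: `ErrorCategory`, `NextAction`, `RetryFlow.Decide`, and the parameter names are hypothetical and not part of any platform API.

```csharp
// Hypothetical error categories for step 3 (3a, 3b, 3c).
public enum ErrorCategory { Unreachable, Transient, Permanent }

// Hypothetical outcomes for the flow described in this document.
public enum NextAction
{
    Retry,                  // step 4: let the retry strategy try again
    IdleThenRestart,        // 5a: idle, long delay, go to 1
    Stop,                   // 3c or 5b: stop the replication
    OfflineWaitForNetwork,  // 6a: offline until the network changes
    OfflineThenRestart      // 6b: offline, long delay, go to 1
}

public static class RetryFlow
{
    // retriesRemaining: whether the retry strategy can still retry (step 4).
    // continuous: whether the replication is continuous (step 5).
    // hasNetwork: whether the device has a network connection (step 6).
    public static NextAction Decide(ErrorCategory category, bool retriesRemaining,
                                    bool continuous, bool hasNetwork)
    {
        switch (category)
        {
            case ErrorCategory.Unreachable:
                // 3a -> 6a / 6b
                return hasNetwork ? NextAction.OfflineThenRestart
                                  : NextAction.OfflineWaitForNetwork;
            case ErrorCategory.Transient:
                // 3b -> 4, falling through to 5 once retries are exhausted
                if (retriesRemaining) return NextAction.Retry;
                return continuous ? NextAction.IdleThenRestart : NextAction.Stop;
            default:
                // 3c: permanent errors always stop the replication
                return NextAction.Stop;
        }
    }
}
```

For example, `RetryFlow.Decide(ErrorCategory.Transient, false, true, true)` returns `NextAction.IdleThenRestart`, which corresponds to 5a.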

Example 1 (permanent error on the initial connection):
1. Start non-continuous replication
2. Initial connection reports 401 (Unauthorized)
3. Stop replication; callbacks fire for the error and for the stopped status (two notifications)

Example 2 (transient error, retry succeeds):
1. Start non-continuous replication
2. Halfway through, a 503 error is encountered (Service Unavailable)
3. Error is transient, so retry
4. Retry succeeds, replication continues

Example 3 (transient error, retry fails):
1. Start non-continuous replication
2. Halfway through, a connection timeout occurs
3. Error is transient, so retry
4. Retry fails, replication stops

Example 4 (permanent error on a continuous replication):
1. Start a continuous replication
2. A 404 error is encountered on the endpoint
3. Permanent error, so stop the replication
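
The error-handling portion of this flow can be expressed in C#-style pseudocode:

    // Called whenever a request fails with an error.
    void ErrorEncountered(Exception e)
    {
        if (IsTransient(e)) {
            // 3b: transient error, retry while the strategy allows it
            if (_strategy.CanRetry) {
                _strategy.Retry();
                return;
            }
        }

        // 3a, 3c, or 4a (retries exhausted)
        HandleErrorEndgame(e);
    }

    void HandleErrorEndgame(Exception e)
    {
        if (IsContinuous && IsTransient(e)) {
            // 5a: go idle, wait out the long delay, then start again
            EnterRetryLoop();
            return;
        }

        if (IsOfflineError(e)) {
            // 3a -> 6b
            EnterOfflineLoop();
        } else {
            // 3c, or 5b (non-continuous and retries exhausted)
            StopReplication();
        }
    }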
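
This document does not specify what the retry strategy behind `_strategy` looks like, and it may differ per platform. As one possible shape, here is a minimal sketch of an exponential backoff strategy. The class name, constructor parameters, `NextDelay`, and `Reset` are all assumptions for illustration; only the `CanRetry` name mirrors the pseudocode above.

```csharp
using System;

// Hypothetical retry strategy: a fixed number of attempts, with the delay
// doubling after each one. Not taken from any platform implementation.
public sealed class ExponentialBackoffStrategy
{
    private readonly int _maxAttempts;
    private readonly TimeSpan _initialDelay;
    private int _attempts;

    public ExponentialBackoffStrategy(int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 0) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        _maxAttempts = maxAttempts;
        _initialDelay = initialDelay;
    }

    // True while at least one retry attempt remains (step 4 of the flow).
    public bool CanRetry => _attempts < _maxAttempts;

    // Consumes one attempt and returns how long to wait before making it.
    public TimeSpan NextDelay()
    {
        if (!CanRetry) throw new InvalidOperationException("No retries remaining.");
        // initialDelay * 2^attempts; fine for the small attempt counts assumed here.
        var delay = TimeSpan.FromTicks(_initialDelay.Ticks << _attempts);
        _attempts++;
        return delay;
    }

    // Call after a successful request (4b) so the next failure starts fresh.
    public void Reset() => _attempts = 0;
}
```

With `new ExponentialBackoffStrategy(3, TimeSpan.FromSeconds(2))`, successive calls to `NextDelay()` return 2, 4, and 8 seconds, after which `CanRetry` is false and the flow moves on to step 5.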