
Resiliency and Fault Tolerance with Polly

Ebubekir Dinç edited this page Dec 31, 2023 · 1 revision

Resiliency and fault tolerance are important concepts in a microservices architecture, where services are distributed
across several nodes and communicate with one another over a network.
A failure can therefore happen anywhere in the system and affect the reliability and availability of the system
as a whole.

A system's resilience is its capacity to tolerate failures and recover from them. Fault tolerance, on the other hand,
is the ability of a system to keep functioning even in the face of errors. Circuit breakers, retries, and timeouts
are a few of the strategies that can be used to achieve fault tolerance.

Figure: Polly logo (polly.png), https://github.com/App-vNext/Polly

In SuuCat, resiliency and fault tolerance are implemented using Polly, a .NET library that provides a number of
policies for building resilient, fault-tolerant services in a microservices architecture. Polly is best known for
retrying HTTP requests when the desired response is not received, for example after a timeout. However, it is bad
practice in a microservices architecture for services to be tightly coupled to each other via HTTP requests, except in
rare cases such as a final price check of the items in the cart during checkout. Therefore, here we will consider
an error scenario that may occur during database creation while the application is starting up.

Now let's see how we use it in our project. The following code calls the MigrateDatabaseAndSeed() method to create
the database and seed it with data.

Figure (polly_programCs): Program.cs, https://github.com/ebubekirdinc/SuuCat/blob/master/src/Services/Assessment/src/WebUI/Program.cs

The MigrateDatabaseAndSeed() method uses the Polly library to implement a retry policy for the database seeding operation.
The policy is first built with the Handle<TException>() method of the Policy class, which specifies the type of exception
to be handled. In this instance, the policy is configured to handle any exception thrown while seeding the database.

Figure (polly_MigrateDatabaseAndSeed): ApplicationDbContextInitialiser.cs, https://github.com/ebubekirdinc/SuuCat/blob/master/src/Services/Assessment/src/Infrastructure/Persistence/ApplicationDbContextInitialiser.cs

The policy is then set up with the WaitAndRetry() method to retry the action up to five times. The time to wait
between retries is specified by the sleepDurationProvider option. Here the duration is calculated with Math.Pow() as
2 raised to the power of the retry attempt number, in seconds. As a result, the first retry waits 2 seconds,
the second 4 seconds, the third 8 seconds, and so on.
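
As a rough illustration of the policy described above, here is a minimal sketch that assumes the Polly NuGet package with its v7-style `Policy` API and the asynchronous `WaitAndRetryAsync` variant. The names `DatabaseStartup`, `BuildRetryPolicy`, and `migrateAndSeed` are hypothetical and used only for illustration, and `Console.WriteLine` stands in for whatever logger the project actually uses. It is not the code from the SuuCat repository.

```csharp
using System;
using System.Threading.Tasks;
using Polly;        // assumption: NuGet package "Polly", v7-style API
using Polly.Retry;

public static class DatabaseStartup   // hypothetical name, for illustration only
{
    // Retry on any exception, up to five times, waiting 2^attempt seconds
    // before each retry (2 s, 4 s, 8 s, 16 s, 32 s).
    public static AsyncRetryPolicy BuildRetryPolicy() =>
        Policy
            .Handle<Exception>()
            .WaitAndRetryAsync(
                retryCount: 5,
                sleepDurationProvider: attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)),
                // Console output is a stand-in for the application's logger.
                onRetry: (exception, delay, attempt, context) =>
                    Console.WriteLine($"Retry {attempt} in {delay}, due to: {exception.Message}"));

    // migrateAndSeed is a hypothetical stand-in for the real migration and seeding work.
    public static Task MigrateDatabaseAndSeed(Func<Task> migrateAndSeed) =>
        BuildRetryPolicy().ExecuteAsync(migrateAndSeed);
}
```

If the delegate still fails after the fifth retry, Polly rethrows the last exception, so under this sketch a database that never becomes reachable would still stop the startup.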

Figure (polly_docker_log1): Retrying 3 times with failure.

As the Docker log above shows, the database seeding operation is retried 3 times before it succeeds.
Each failed attempt logs an error like this:

Retrying MigrateDatabaseAndSeed 00:00:02 of RetryPolicy -4face1c8 at null, due to: Npgsql.NpgsqlException(0x80004005): Failed to connect….

The first retry is made after 2 seconds, the second after 4 seconds, and the third after 8 seconds.
You can reproduce this by stopping the PostgreSQL AssessmentDB container in Docker and then starting it again
while the Assessment API is starting up.
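
The waiting times quoted above follow directly from the 2-to-the-power-of-attempt rule. The sketch below has no Polly dependency and only works through the arithmetic. The class and method names (`BackoffSchedule`, `DelayFor`, `TotalWait`) are hypothetical.

```csharp
using System;
using System.Linq;

public static class BackoffSchedule   // hypothetical helper, for illustration only
{
    // Delay before retry number `attempt` (1-based): 2^attempt seconds.
    public static TimeSpan DelayFor(int attempt) =>
        TimeSpan.FromSeconds(Math.Pow(2, attempt));

    // Total time spent waiting when the first `retries` retries are all needed.
    public static TimeSpan TotalWait(int retries) =>
        TimeSpan.FromSeconds(
            Enumerable.Range(1, retries).Sum(attempt => DelayFor(attempt).TotalSeconds));
}
```

With this rule, `BackoffSchedule.TotalWait(3)` is 14 seconds (2 + 4 + 8), which matches a run that succeeds on the third retry. `BackoffSchedule.TotalWait(5)` is 62 seconds, the longest the policy waits in total before giving up after five retries.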

Figure (polly_docker_log2): Success after failures.

Here you can see that the database tables are created successfully after 3 retries. This shows how a Polly retry
policy keeps the application resilient when an error occurs at startup. Polly also supports other strategies such as
Circuit Breaker, Fallback, Hedging, Timeout, and Rate Limiter. See Polly's GitHub repository for more examples.


References

Polly GitHub: https://github.com/App-vNext/Polly