Quarkus - Fault Tolerance

One of the challenges brought by the distributed nature of microservices is that communication with external systems is inherently unreliable. This increases demand on resiliency of applications. To simplify making more resilient applications, Quarkus contains an implementation of the MicroProfile Fault Tolerance specification.

In this guide, we demonstrate usage of MicroProfile Fault Tolerance annotations such as @Timeout, @Fallback, @Retry and @CircuitBreaker.

Prerequisites

To complete this guide, you need:

  • less than 15 minutes

  • an IDE

  • JDK 1.8+ installed with JAVA_HOME configured appropriately

  • Apache Maven 3.5.3+

The Scenario

The application built in this guide simulates a simple backend for a gourmet coffee e-shop. It implements a REST endpoint providing information about the coffee samples we have in store.

Let’s imagine, although it’s not implemented as such, that some of the methods in our endpoint require communication to external services like a database or an external microservice, which introduces a factor of unreliability.

Solution

We recommend that you follow the instructions in the next sections and create the application step by step. However, you can go right to the completed example.

Clone the Git repository: git clone https://github.com/quarkusio/quarkus-quickstarts.git, or download an archive.

The solution is located in the microprofile-fault-tolerance-quickstart directory.

Creating the Maven Project

First, we need a new project. Create a new project with the following command:

    mvn io.quarkus:quarkus-maven-plugin:1.0.0.CR1:create \
        -DprojectGroupId=org.acme \
        -DprojectArtifactId=microprofile-fault-tolerance-quickstart \
        -DclassName="org.acme.faulttolerance.CoffeeResource" \
        -Dpath="/coffee" \
        -Dextensions="smallrye-fault-tolerance, resteasy-jsonb"
    cd microprofile-fault-tolerance-quickstart

This command generates a Maven structure, importing the extensions for RESTEasy/JAX-RS and SmallRye Fault Tolerance, which is the implementation of the MicroProfile Fault Tolerance spec that Quarkus uses.

Preparing an Application: REST Endpoint and CDI Bean

In this section we create a skeleton of our application, so that we have something that we can extend and to which we can add fault tolerance features later on.

First, create a simple entity representing a coffee sample in our store:

    package org.acme.faulttolerance;

    public class Coffee {

        public Integer id;
        public String name;
        public String countryOfOrigin;
        public Integer price;

        public Coffee() {
        }

        public Coffee(Integer id, String name, String countryOfOrigin, Integer price) {
            this.id = id;
            this.name = name;
            this.countryOfOrigin = countryOfOrigin;
            this.price = price;
        }
    }

Let’s continue with a simple CDI bean that works as a repository of our coffee samples.

    package org.acme.faulttolerance;

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    import javax.enterprise.context.ApplicationScoped;

    @ApplicationScoped
    public class CoffeeRepositoryService {

        private Map<Integer, Coffee> coffeeList = new HashMap<>();

        public CoffeeRepositoryService() {
            coffeeList.put(1, new Coffee(1, "Fernandez Espresso", "Colombia", 23));
            coffeeList.put(2, new Coffee(2, "La Scala Whole Beans", "Bolivia", 18));
            coffeeList.put(3, new Coffee(3, "Dak Lak Filter", "Vietnam", 25));
        }

        public List<Coffee> getAllCoffees() {
            return new ArrayList<>(coffeeList.values());
        }

        public Coffee getCoffeeById(Integer id) {
            return coffeeList.get(id);
        }

        public List<Coffee> getRecommendations(Integer id) {
            if (id == null) {
                return Collections.emptyList();
            }
            // recommend up to two coffees other than the one being viewed
            return coffeeList.values().stream()
                    .filter(coffee -> !id.equals(coffee.id))
                    .limit(2)
                    .collect(Collectors.toList());
        }
    }

Finally, edit the org.acme.faulttolerance.CoffeeResource class as follows:

    package org.acme.faulttolerance;

    import java.util.List;
    import java.util.Random;
    import java.util.concurrent.atomic.AtomicLong;

    import javax.inject.Inject;
    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    import org.jboss.logging.Logger;

    @Path("/coffee")
    @Produces(MediaType.APPLICATION_JSON)
    public class CoffeeResource {

        private static final Logger LOGGER = Logger.getLogger(CoffeeResource.class);

        @Inject
        private CoffeeRepositoryService coffeeRepository;

        private AtomicLong counter = new AtomicLong(0);

        @GET
        public List<Coffee> coffees() {
            final Long invocationNumber = counter.getAndIncrement();

            maybeFail(String.format("CoffeeResource#coffees() invocation #%d failed", invocationNumber));

            LOGGER.infof("CoffeeResource#coffees() invocation #%d returning successfully", invocationNumber);
            return coffeeRepository.getAllCoffees();
        }

        // fails in about half of the invocations
        private void maybeFail(String failureLogMessage) {
            if (new Random().nextBoolean()) {
                LOGGER.error(failureLogMessage);
                throw new RuntimeException("Resource failure.");
            }
        }
    }

At this point, we expose a single REST method that shows a list of coffee samples in JSON format. Note that we introduced some fault-making code in our CoffeeResource#maybeFail() method, which is going to cause failures in the CoffeeResource#coffees() endpoint method in about 50 % of requests.

Why not check that our application works? Run the Quarkus development server with:

    ./mvnw compile quarkus:dev

and open http://localhost:8080/coffee in your browser. Make a couple of requests (remember, we expect some of them to fail). At least some of the requests should show the list of our coffee samples in JSON; the rest will fail with a RuntimeException thrown in CoffeeResource#maybeFail().

Congratulations, you’ve just made a working (although somewhat unreliable) Quarkus application!

Adding Resiliency: Retries

Leave the Quarkus development server running and, in your IDE, add the @Retry annotation to the CoffeeResource#coffees() method as follows and save the file:

    import org.eclipse.microprofile.faulttolerance.Retry;
    ...

    public class CoffeeResource {
        ...
        @GET
        @Retry(maxRetries = 4)
        public List<Coffee> coffees() {
            ...
        }
        ...
    }

Hit refresh in your browser. The Quarkus development server will automatically detect the changes and recompile the app for you, so there’s no need to restart it.

You can hit refresh a couple more times. Practically all requests should now succeed. The CoffeeResource#coffees() method is in fact still failing about 50 % of the time, but every time it happens, the platform automatically retries the call!

To see that the failures still happen, check the output of the development server. The log messages should be similar to these:

    2019-03-06 12:17:41,725 INFO [org.acm.fau.CoffeeResource] (XNIO-1 task-1) CoffeeResource#coffees() invocation #5 returning successfully
    2019-03-06 12:17:44,187 INFO [org.acm.fau.CoffeeResource] (XNIO-1 task-1) CoffeeResource#coffees() invocation #6 returning successfully
    2019-03-06 12:17:45,166 ERROR [org.acm.fau.CoffeeResource] (XNIO-1 task-1) CoffeeResource#coffees() invocation #7 failed
    2019-03-06 12:17:45,172 ERROR [org.acm.fau.CoffeeResource] (XNIO-1 task-1) CoffeeResource#coffees() invocation #8 failed
    2019-03-06 12:17:45,176 INFO [org.acm.fau.CoffeeResource] (XNIO-1 task-1) CoffeeResource#coffees() invocation #9 returning successfully

You can see that every time an invocation fails, it’s immediately followed by another invocation, until one succeeds. Since we allowed 4 retries, it would take 5 failed invocations in a row for the user to actually be exposed to a failure. With each invocation failing about half of the time, that happens in roughly 1 request out of 32 (0.5^5, about 3 %), which is fairly unlikely.
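To make the retry arithmetic concrete, here is a minimal plain-Java sketch of what a caller observes with maxRetries = 4. It is only an illustration under simplifying assumptions (no delay or jitter between attempts, every RuntimeException is retried); RetrySketch, callWithRetries and probabilityAllFail are hypothetical names and have nothing to do with how SmallRye Fault Tolerance is implemented internally.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Plain-Java illustration of the caller-visible behavior of @Retry(maxRetries = 4).
// Hypothetical helper, not the SmallRye Fault Tolerance implementation.
final class RetrySketch {

    // Invokes the action once and, if it throws, up to maxRetries more times.
    static <T> T callWithRetries(Supplier<T> action, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try again
            }
        }
        throw lastFailure; // all maxRetries + 1 invocations failed
    }

    // Probability that all maxRetries + 1 independent invocations fail.
    static double probabilityAllFail(double failureRate, int maxRetries) {
        return Math.pow(failureRate, maxRetries + 1);
    }

    public static void main(String[] args) {
        AtomicInteger invocations = new AtomicInteger();
        String result = callWithRetries(() -> {
            // simulate two failures followed by a success
            if (invocations.incrementAndGet() < 3) {
                throw new RuntimeException("Resource failure.");
            }
            return "coffee list";
        }, 4);
        System.out.println(result + " after " + invocations.get() + " invocations");
        System.out.println("P(user sees a failure) = " + probabilityAllFail(0.5, 4));
    }
}
```

The user only sees an error when the loop runs out of attempts, which is why the failure rate drops from about 50 % to about 3 % in this scenario.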

Adding Resiliency: Timeouts

So what else have we got in MicroProfile Fault Tolerance? Let’s look into timeouts.

Add the following two methods to our CoffeeResource endpoint. Again, no need to restart the server, just paste the code and save the file.

    import org.jboss.resteasy.annotations.jaxrs.PathParam;
    import org.eclipse.microprofile.faulttolerance.Timeout;
    ...

    public class CoffeeResource {
        ...
        @GET
        @Path("/{id}/recommendations")
        @Timeout(250)
        public List<Coffee> recommendations(@PathParam int id) {
            long started = System.currentTimeMillis();
            final long invocationNumber = counter.getAndIncrement();

            try {
                randomDelay();
                LOGGER.infof("CoffeeResource#recommendations() invocation #%d returning successfully", invocationNumber);
                return coffeeRepository.getRecommendations(id);
            } catch (InterruptedException e) {
                // the thread is interrupted when the timeout expires
                LOGGER.errorf("CoffeeResource#recommendations() invocation #%d timed out after %d ms",
                        invocationNumber, System.currentTimeMillis() - started);
                return null;
            }
        }

        private void randomDelay() throws InterruptedException {
            Thread.sleep(new Random().nextInt(500));
        }
    }

We added some new functionality. We want to be able to recommend some related coffees based on the coffee that a user is currently looking at. This is not critical functionality, just a nice-to-have. When the system is overloaded and the logic behind obtaining recommendations takes too long to execute, we would rather time out and render the UI without recommendations.

Note that the timeout was configured to 250 ms, and a random artificial delay between 0 and 500 ms was introduced into the CoffeeResource#recommendations() method.

In your browser, go to http://localhost:8080/coffee/2/recommendations and hit refresh a couple of times.

You should see some requests time out with org.eclipse.microprofile.faulttolerance.exceptions.TimeoutException. Requests that do not time out should show two recommended coffee samples in JSON.
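The following plain-Java sketch shows the general idea of bounding a slow call with a timeout, using only java.util.concurrent. It is an approximation for illustration: the task runs on a separate worker thread that is interrupted when the deadline passes, which mirrors the InterruptedException handled in recommendations(), but it is not how the extension itself enforces @Timeout. TimeoutSketch, callWithTimeout and timesOut are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Plain-Java illustration of a call bounded by a timeout.
// Hypothetical helper, not the SmallRye Fault Tolerance implementation.
final class TimeoutSketch {

    // Runs the task on a worker thread and gives up after timeoutMillis.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker that is still busy
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    // Returns true if a task that sleeps for delayMillis exceeds the timeout.
    static boolean timesOut(long delayMillis, long timeoutMillis) {
        try {
            callWithTimeout(() -> {
                Thread.sleep(delayMillis); // stands in for randomDelay()
                return "recommendations";
            }, timeoutMillis);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("100 ms delay, 250 ms timeout, timed out: " + timesOut(100, 250));
        System.out.println("400 ms delay, 250 ms timeout, timed out: " + timesOut(400, 250));
    }
}
```

With a delay drawn uniformly from 0 to 500 ms and a 250 ms limit, roughly half of the requests in the guide end up on the timed-out path.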

Adding Resiliency: Fallbacks

Let’s make our recommendations feature even better by providing a fallback (and presumably faster) way of getting related coffees.

Add a fallback method to CoffeeResource and a @Fallback annotation to the CoffeeResource#recommendations() method as follows:

    import java.util.Collections;
    import org.eclipse.microprofile.faulttolerance.Fallback;
    ...

    public class CoffeeResource {
        ...
        @Fallback(fallbackMethod = "fallbackRecommendations")
        public List<Coffee> recommendations(@PathParam int id) {
            ...
        }

        public List<Coffee> fallbackRecommendations(int id) {
            LOGGER.info("Falling back to RecommendationResource#fallbackRecommendations()");
            // safe bet, return something that everybody likes
            return Collections.singletonList(coffeeRepository.getCoffeeById(1));
        }
        ...
    }

Hit refresh several times on http://localhost:8080/coffee/2/recommendations. The TimeoutException should not appear anymore. Instead, in case of a timeout, the page will display a single recommendation that we hardcoded in our fallback method fallbackRecommendations(), rather than two recommendations returned by the original method.

Check the server output to see that fallback is really happening:

    2019-03-06 13:21:54,170 INFO [org.acm.fau.CoffeeResource] (pool-15-thread-3) CoffeeResource#recommendations() invocation #2 returning successfully
    2019-03-06 13:21:55,159 ERROR [org.acm.fau.CoffeeResource] (pool-15-thread-4) CoffeeResource#recommendations() invocation #3 timed out after 248 ms
    2019-03-06 13:21:55,161 INFO [org.acm.fau.CoffeeResource] (HystrixTimer-1) Falling back to RecommendationResource#fallbackRecommendations()

The fallback method is required to have the same parameters as the original method.
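One way to see why the parameters have to match is to write the fallback mechanism out by hand: the fallback receives exactly the argument the original call received and must produce the same kind of result. The sketch below is a plain-Java illustration with hypothetical names (FallbackSketch, callWithFallback) and string stand-ins for the Coffee objects; it does not reflect the extension's internals.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Plain-Java illustration of the fallback idea.
// Hypothetical helper, not the SmallRye Fault Tolerance implementation.
final class FallbackSketch {

    // Calls the primary function; if it throws, calls the fallback with the SAME argument.
    static <A, R> R callWithFallback(Function<A, R> primary, Function<A, R> fallback, A argument) {
        try {
            return primary.apply(argument);
        } catch (RuntimeException e) {
            return fallback.apply(argument);
        }
    }

    // Stand-in for recommendations(): always fails, as if the timeout had expired.
    static List<String> recommendations(Integer id) {
        throw new IllegalStateException("simulated timeout");
    }

    // Stand-in for fallbackRecommendations(): same parameter type, same return type.
    static List<String> fallbackRecommendations(Integer id) {
        return Collections.singletonList("Fernandez Espresso");
    }

    public static void main(String[] args) {
        List<String> result = callWithFallback(
                FallbackSketch::recommendations, FallbackSketch::fallbackRecommendations, 2);
        System.out.println(result);
    }
}
```

Because both functions share one signature, the caller cannot tell from the types which of the two produced the answer, which is what lets the endpoint degrade gracefully.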

Adding Resiliency: Circuit Breaker

A circuit breaker is useful for limiting the number of failures happening in the system when part of the system becomes temporarily unstable. The circuit breaker records successful and failed invocations of a method, and when the ratio of failed invocations reaches the specified threshold, the circuit breaker opens and blocks all further invocations of that method for a given time.

Add the following code into the CoffeeRepositoryService bean, so that we can demonstrate a circuit breaker in action:

    import java.util.Random;
    import java.util.concurrent.atomic.AtomicLong;
    import org.eclipse.microprofile.faulttolerance.CircuitBreaker;
    ...

    public class CoffeeRepositoryService {
        ...
        private AtomicLong counter = new AtomicLong(0);

        @CircuitBreaker(requestVolumeThreshold = 4)
        public Integer getAvailability(Coffee coffee) {
            maybeFail();
            return new Random().nextInt(30);
        }

        private void maybeFail() {
            // introduce some artificial failures
            final Long invocationNumber = counter.getAndIncrement();
            if (invocationNumber % 4 > 1) { // alternate 2 successful and 2 failing invocations
                throw new RuntimeException("Service failed.");
            }
        }
    }

And add the code below to the CoffeeResource endpoint:

    import javax.ws.rs.core.Response;
    ...

    public class CoffeeResource {
        ...
        @Path("/{id}/availability")
        @GET
        public Response availability(@PathParam int id) {
            final Long invocationNumber = counter.getAndIncrement();

            Coffee coffee = coffeeRepository.getCoffeeById(id);
            // check that coffee with given id exists, return 404 if not
            if (coffee == null) {
                return Response.status(Response.Status.NOT_FOUND).build();
            }

            try {
                Integer availability = coffeeRepository.getAvailability(coffee);
                LOGGER.infof("CoffeeResource#availability() invocation #%d returning successfully", invocationNumber);
                return Response.ok(availability).build();
            } catch (RuntimeException e) {
                String message = e.getClass().getSimpleName() + ": " + e.getMessage();
                LOGGER.errorf("CoffeeResource#availability() invocation #%d failed: %s", invocationNumber, message);
                return Response.status(Response.Status.INTERNAL_SERVER_ERROR)
                        .entity(message)
                        .type(MediaType.TEXT_PLAIN_TYPE)
                        .build();
            }
        }
        ...
    }

We added another piece of functionality: the application can return the number of remaining packages of a given coffee in our store (just a random number).

This time an artificial failure was introduced in the CDI bean: the CoffeeRepositoryService#getAvailability() method is going to alternate between two successful and two failed invocations.
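The alternation follows directly from the condition invocationNumber % 4 > 1 with a counter that starts at 0. The small sketch below evaluates that condition for the first eight invocation numbers; FailurePatternSketch and fails are hypothetical names used only for this illustration.

```java
// Evaluates the failure condition used in CoffeeRepositoryService#maybeFail().
// Hypothetical helper for illustration only.
final class FailurePatternSketch {

    // Same condition as in maybeFail(): remainders 2 and 3 fail, 0 and 1 succeed.
    static boolean fails(long invocationNumber) {
        return invocationNumber % 4 > 1;
    }

    public static void main(String[] args) {
        for (long invocationNumber = 0; invocationNumber < 8; invocationNumber++) {
            System.out.println("invocation #" + invocationNumber + ": "
                    + (fails(invocationNumber) ? "fails" : "succeeds"));
        }
    }
}
```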

We also added a @CircuitBreaker annotation with requestVolumeThreshold = 4. CircuitBreaker.failureRatio is by default 0.5, and CircuitBreaker.delay is by default 5 seconds. That means that the circuit breaker will open when at least 2 of the last 4 invocations have failed, and it will stay open for 5 seconds.

To test this out, do the following:

  • Go to http://localhost:8080/coffee/2/availability in your browser. You should see a number being returned.

  • Hit refresh. This second request should again be successful and return a number.

  • Refresh two more times. Both times you should see the text "RuntimeException: Service failed.", which is the exception thrown by CoffeeRepositoryService#getAvailability().

  • Refresh a couple more times. Unless you waited too long, you should again see an exception, but this time it’s "CircuitBreakerOpenException: getAvailability". This exception indicates that the circuit breaker has opened and the CoffeeRepositoryService#getAvailability() method is not being called anymore.

  • Give it 5 seconds, during which the circuit breaker should close, and you should be able to make two successful requests again.
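The sequence above can be reproduced with a small plain-Java simulation of a rolling-window circuit breaker. This is a simplified sketch under stated assumptions: after the delay it closes immediately with an empty window (a real circuit breaker typically goes through a half-open trial phase first), it uses an injected clock instead of real time, and it throws IllegalStateException as a stand-in for CircuitBreakerOpenException. CircuitBreakerSketch and its members are hypothetical names, not the extension's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Simplified rolling-window circuit breaker for illustration only.
// Hypothetical class, not the SmallRye Fault Tolerance implementation.
final class CircuitBreakerSketch {

    private final int requestVolumeThreshold;
    private final double failureRatio;
    private final long delayMillis;
    private final LongSupplier clock; // injected so the demo does not have to sleep
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed invocation
    private boolean open;
    private long openedAt;

    CircuitBreakerSketch(int requestVolumeThreshold, double failureRatio,
                         long delayMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.failureRatio = failureRatio;
        this.delayMillis = delayMillis;
        this.clock = clock;
    }

    boolean isOpen() {
        return open && clock.getAsLong() - openedAt < delayMillis;
    }

    <T> T call(Supplier<T> action) {
        if (isOpen()) {
            // stand-in for CircuitBreakerOpenException: the action is not invoked at all
            throw new IllegalStateException("circuit breaker open");
        }
        if (open) {
            // Delay elapsed. Simplification: close right away and start a fresh window.
            open = false;
            window.clear();
        }
        try {
            T result = action.get();
            record(false);
            return result;
        } catch (RuntimeException e) {
            record(true);
            throw e;
        }
    }

    // Keeps the last requestVolumeThreshold outcomes and opens once the ratio is reached.
    private void record(boolean failed) {
        window.addLast(failed);
        if (window.size() > requestVolumeThreshold) {
            window.removeFirst();
        }
        long failures = window.stream().filter(Boolean::booleanValue).count();
        if (window.size() == requestVolumeThreshold
                && (double) failures / requestVolumeThreshold >= failureRatio) {
            open = true;
            openedAt = clock.getAsLong();
        }
    }

    public static void main(String[] args) {
        long[] now = {0};
        AtomicLong counter = new AtomicLong();
        CircuitBreakerSketch breaker = new CircuitBreakerSketch(4, 0.5, 5000, () -> now[0]);
        Supplier<Integer> getAvailability = () -> {
            if (counter.getAndIncrement() % 4 > 1) { // same pattern as maybeFail()
                throw new RuntimeException("Service failed.");
            }
            return 12;
        };
        for (int request = 1; request <= 7; request++) {
            if (request == 7) {
                now[0] = 5000; // pretend 5 seconds have passed
            }
            try {
                System.out.println("request " + request + ": " + breaker.call(getAvailability));
            } catch (RuntimeException e) {
                System.out.println("request " + request + ": " + e.getMessage());
            }
        }
    }
}
```

In this simulation the failure counter only advances when the service is actually invoked, so the rejected requests do not consume the "successful" slots. That is why two successful requests follow once the delay has elapsed.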

Conclusion

MicroProfile Fault Tolerance lets you improve the resiliency of your application without adding complexity to your business logic.

All that is needed to enable the fault tolerance features in Quarkus is:

  • adding the smallrye-fault-tolerance Quarkus extension to your project using the quarkus-maven-plugin:

    ./mvnw quarkus:add-extension -Dextensions="smallrye-fault-tolerance"

  • or simply adding the following Maven dependency:

    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-smallrye-fault-tolerance</artifactId>
    </dependency>