This guide demonstrates how to configure a retry policy for a client and server application within the service mesh.

Prerequisites

  • Kubernetes cluster running Kubernetes v1.22.9 or greater.
  • Have kubectl available to interact with the API server.
  • Have osm CLI available for managing the service mesh.

Demo

  1. Install OSM with permissive mode and retry policy enabled.

    osm install --set=osm.enablePermissiveTrafficPolicy=true --set=osm.featureFlags.enableRetryPolicy=true
  2. Deploy the httpbin service into the httpbin namespace after enrolling the namespace in the mesh. The httpbin service runs on port 14001.

    kubectl create namespace httpbin
    osm namespace add httpbin
    kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/release-v1.2/manifests/samples/httpbin/httpbin.yaml -n httpbin

    Confirm the httpbin service and pods are up and running.

    kubectl get svc,pod -n httpbin

    The output should look similar to the following:

    NAME      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
    httpbin   ClusterIP   10.96.198.23   <none>        14001/TCP   20s

    NAME                     READY   STATUS    RESTARTS   AGE
    httpbin-5b8b94b9-lt2vs   2/2     Running   0          20s
  3. Deploy the curl client into the curl namespace after enrolling the namespace in the mesh.

    kubectl create namespace curl
    osm namespace add curl
    kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/release-v1.2/manifests/samples/curl/curl.yaml -n curl

    Confirm the curl pod is up and running.

    kubectl get pods -n curl

    The output should look similar to the following:

    NAME                    READY   STATUS    RESTARTS   AGE
    curl-54ccc6954c-9rlvp   2/2     Running   0          20s
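Instead of re-running kubectl get until both workloads show Running, the readiness check can be scripted. The following is a minimal sketch: the function name wait_for_mesh_pods and the 120s timeout are illustrative choices, not part of the OSM demo.

```shell
# Block until every pod in the given namespace reports the Ready condition.
# The function name and the 120s timeout are illustrative assumptions.
wait_for_mesh_pods() {
  namespace=$1
  kubectl wait --for=condition=Ready pod --all -n "$namespace" --timeout=120s
}
```

Calling wait_for_mesh_pods httpbin and then wait_for_mesh_pods curl returns once the pods in each namespace are ready, or fails when the timeout expires.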
  4. Apply a Retry policy that retries requests from the curl ServiceAccount to the httpbin Service when they receive a 5xx response.

    kubectl apply -f - <<EOF
    kind: Retry
    apiVersion: policy.openservicemesh.io/v1alpha1
    metadata:
      name: retry
      namespace: curl
    spec:
      source:
        kind: ServiceAccount
        name: curl
        namespace: curl
      destinations:
      - kind: Service
        name: httpbin
        namespace: httpbin
      retryPolicy:
        retryOn: "5xx"
        perTryTimeout: 1s
        numRetries: 5
        retryBackoffBaseInterval: 1s
    EOF
  5. Send an HTTP request that returns status code 503 from the curl pod to the httpbin service.

    kubectl exec deploy/curl -n curl -c curl -- curl -sI httpbin.httpbin.svc.cluster.local:14001/status/503
  6. In a new terminal session, run the following command to port-forward the curl pod.

    kubectl port-forward deploy/curl -n curl 15000
  7. Query the stats for traffic from curl to httpbin.

    curl -s localhost:15000/stats | grep "cluster.httpbin/httpbin|14001.upstream_rq_retry"

    The upstream_rq_retry stat counts how many times the request from the curl pod to the httpbin pod was retried, and upstream_rq_retry_backoff_exponential counts how many of those retries used exponential backoff. Both should equal the numRetries field in the retry policy. The upstream_rq_retry_limit_exceeded stat counts requests that were not retried again because they had already reached the maximum number of retries allowed by numRetries.

    cluster.httpbin/httpbin|14001.upstream_rq_retry: 5
    cluster.httpbin/httpbin|14001.upstream_rq_retry_backoff_exponential: 5
    cluster.httpbin/httpbin|14001.upstream_rq_retry_backoff_ratelimited: 0
    cluster.httpbin/httpbin|14001.upstream_rq_retry_limit_exceeded: 1
    cluster.httpbin/httpbin|14001.upstream_rq_retry_overflow: 0
    cluster.httpbin/httpbin|14001.upstream_rq_retry_success: 0
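To get a feel for how long the five exponential-backoff retries can take, the sketch below computes the upper bound of the wait before each retry. It assumes the range described in Envoy's documentation, where the backoff before retry N is drawn at random from [0, (2^N - 1) * base), and it assumes the maximum interval is left at Envoy's default of 10 times the base interval. The function names are illustrative, and the actual delays are jittered, so observed waits will usually be shorter.

```shell
# Upper bound, in milliseconds, of the jittered backoff before retry N.
# Assumes Envoy's documented range [0, (2^N - 1) * base), capped at max.
backoff_upper_bound_ms() {
  n=$1      # retry number, starting at 1
  base=$2   # base interval in milliseconds (retryBackoffBaseInterval)
  max=$3    # maximum interval in milliseconds (assumed 10 * base)
  bound=$(( ((1 << n) - 1) * base ))
  if [ "$bound" -gt "$max" ]; then
    bound=$max
  fi
  echo "$bound"
}

# Print the bound for each retry up to num_retries.
print_backoff_bounds() {
  num_retries=$1
  base=$2
  max=$3
  i=1
  while [ "$i" -le "$num_retries" ]; do
    echo "retry $i: up to $(backoff_upper_bound_ms "$i" "$base" "$max") ms"
    i=$((i + 1))
  done
}
```

For the policy in this demo (numRetries: 5, retryBackoffBaseInterval: 1s), running print_backoff_bounds 5 1000 10000 gives bounds of 1000, 3000, 7000, 10000, and 10000 ms under these assumptions.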
  8. Send an HTTP request that returns a non-5xx status code from the curl pod to the httpbin service.

    kubectl exec deploy/curl -n curl -c curl -- curl -sI httpbin.httpbin.svc.cluster.local:14001/status/404
  9. The envoy_cluster_upstream_rq_retry metric does not increment, because the retry policy only retries on 5xx responses. Re-running the stats query from step 7 should show the same upstream_rq_retry count.
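One way to confirm that the count did not change is to capture the counter before and after the 404 request and compare the two values. The helpers below are a sketch: the names retry_count and retries_unchanged are hypothetical, and the stat name is the one shown in the step 7 output.

```shell
# Read Envoy stats text on stdin and print the value of the
# upstream_rq_retry counter for the httpbin cluster.
retry_count() {
  awk -F': ' '$1 == "cluster.httpbin/httpbin|14001.upstream_rq_retry" { print $2 }'
}

# Succeed only if the counter is the same in both snapshots.
retries_unchanged() {
  before=$1
  after=$2
  [ "$before" = "$after" ]
}
```

With the port-forward from step 6 still running, pipe curl -s localhost:15000/stats into retry_count before and after step 8, then pass the two values to retries_unchanged. It exits with status 0 when no new retries occurred.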