Health checks
Health is an extremely important part of the microservice architecture, load balancers rely on the health status of the service while picking endpoint from load balancer set. Also, users want the service state to be observable through the GUI or CLI. Orchestrators like Kubernetes also want to know the service status to manage lifecycle of the containers.
Kuma supports several aspects of the health checking. There are two policies which allows configuring active and passive health checks: Health Check and Circuit Breaker.
Kuma is able to track the status of the Envoy proxy. If grpc stream with Envoy is disconnected then Kuma considers this proxy as offline, but we still send the traffic regardless of that, because this status is designed to track the connection between kuma-cp and kuma-dp.
Also, every inbound in the Dataplane model has health section:
type: Dataplanemesh: defaultname: web-01networking:address: 127.0.0.1inbound:- port: 11011servicePort: 11012health:ready: truetags:kuma.io/service: backendkuma.io/protocol: http
This health.ready status is intended to show the status of the service itself. It is set differently depending on the environment (Kubernetes or Universal), but it’s treated the same way regardless of the environment:
- if
health.readyis true orhealthsection is missing - Kuma considers the inbound as healthy and includes it into load balancing set. - if
health.readyis false - Kuma doesn’t include the inbound into load balancing set.
Also, health.ready is used to compute the status of the Dataplanes and Service. You can see these statuses both in Kuma GUI and Kuma CLI:
- if proxy status is
Offline, then Dataplane isOffline: - if proxy status is
Online:- if all inbounds are ready then Dataplane is
Online - if all inbounds are not ready then Dataplane is
Offline - if at least one of the inbounds is not ready then Dataplane is
Partially degraded - if inbound is not ready then it’s not included in the load-balancer set which means it doesn’t receive the traffic
- if all inbounds which implement the same service are ready then service is
Online - if all inbounds which implement the same service are not ready then service is
Offline - if at least one of the inbounds which implement the same service is not ready then service is
Partially degraded
- if all inbounds are ready then Dataplane is
Kubernetes probes
Even if Kubernetes probes are disabled, Kuma takes pod.status.containerStatuses.ready in order to fill dataplane.inbound.health section.
If you specify httpGet probe for the Pod, Kuma will generate a special non-MTLs listener and overrides the probe itself in the Pod resource. This feature is called Virtual probes, and it allows kubelet probing the pod status even if MTLS is enabled on the mesh. For example, if we specify the following probe:
livenessProbe:httpGet:path: /metricsport: 3001initialDelaySeconds: 3periodSeconds: 3
Kuma will replace it with:
livenessProbe:httpGet:path: /3001/metricsport: 9000initialDelaySeconds: 3periodSeconds: 3
Where 9000 is a default virtual probe port, which can be configured in kuma-cp.config:
runtime:kubernetes:injector:virtualProbesPort: 19001
And can also be overwritten in the Pod’s annotations:
annotations:kuma.io/virtual-probes-port: 19001
To disable Kuma’s probe virtualziation, we can either set it in Kuma’s configuration file kuma-cp.config:
runtime:kubernetes:injector:virtualProbesEnabled: false
or in the Pod’s annotations:
annotations:kuma.io/virtual-probes: disabled
The same behaviour could be configured using environment variables:
KUMA_RUNTIME_KUBERNETES_VIRTUAL_PROBES_ENABLED=falseKUMA_RUNTIME_KUBERNETES_VIRTUAL_PROBES_ENABLED=19001
Universal probes
On Universal there is no single standard for probing the service. For health checking of the service status on Universal Kuma is using Envoy’s Health Discovery Service (HDS). Envoy does health checks and reports the status back to Kuma Control Plane.
In order to configure health checking of your service you have to update inbound config with serviceProbe:
type: Dataplanemesh: defaultname: web-01networking:address: 127.0.0.1inbound:- port: 11011servicePort: 11012serviceProbe:timeout: 2s # optional (default value is taken from KUMA_DP_SERVER_HDS_CHECK_TIMEOUT)interval: 1s # optional (default value is taken from KUMA_DP_SERVER_HDS_CHECK_INTERVAL)healthyThreshold: 1 # optional (default value is taken from KUMA_DP_SERVER_HDS_CHECK_HEALTHY_THRESHOLD)unhealthThreshold: 1 # optional (default value is taken from KUMA_DP_SERVER_HDS_CHECK_UNHEALTHY_THRESHOLD)tcp: {}tags:kuma.io/service: backendkuma.io/protocol: http
If there is a serviceProbe configured for the inbound, Kuma will automatically fill the health section and update it with interval equal to KUMA_DP_SERVER_HDS_REFRESH_INTERVAL. Alternatively, it’s possible to omit a serviceProbe section and develop custom automation that periodically updates the health of the inbound.
Comparing to HealthCheck policy serviceProbes have some advantages:
- knowledge about health is propagated back to
kuma-cpand could be seen both in Kuma GUI and Kuma CLI - scalable with thousands of Dataplanes
but at the same time unlike HealthCheck policy serviceProbes:
- works only when
kuma-cpis up and running - can’t check TLS between Envoys
