Layer 4 Networking & mTLS with Ztunnel

Ambient mode is currently in the Alpha phase.

Please do not use ambient mode in production, and be sure to thoroughly review the feature phase definitions before use.

In particular, there are known performance, stability, and security issues in the alpha release. There are also planned breaking changes, including some that will prevent upgrades. These limitations will be addressed before ambient mode moves to Beta.

Introduction

This guide describes in-depth the functionality and usage of the ztunnel proxy and Layer 4 networking functions in Istio ambient mesh. To simply try out Istio ambient mesh, follow the Ambient Quickstart instead. This guide follows a user journey and works through multiple examples to detail the design and architecture of Istio ambient. It is highly recommended to follow the topics linked below in sequence.

The ztunnel (Zero Trust Tunnel) component is a purpose-built per-node proxy for Istio ambient mesh. Since workload pods no longer require proxies running in sidecars in order to participate in the mesh, Istio in ambient mode is informally also referred to as “sidecar-less” mesh.

Pods/workloads using sidecar proxies can co-exist within the same mesh as pods that operate in ambient mode. Mesh pods that use sidecar proxies can also interoperate with pods in the same Istio mesh that are running in ambient mode. The term ambient mesh refers to an Istio mesh that has a superset of the capabilities and hence can support mesh pods that use either type of proxying.

The ztunnel node proxy is responsible for securely connecting and authenticating workloads within the ambient mesh. The ztunnel proxy is written in Rust and is intentionally scoped to handle L3 and L4 functions in the ambient mesh such as mTLS, authentication, L4 authorization and telemetry. Ztunnel does not terminate workload HTTP traffic or parse workload HTTP headers. The ztunnel ensures L3 and L4 traffic is efficiently and securely transported to waypoint proxies, where the full suite of Istio’s L7 functionality, such as HTTP telemetry and load balancing, is implemented. The term “Secure Overlay Networking” is used informally to collectively describe the set of L4 networking functions implemented in an ambient mesh via the ztunnel proxy. At the transport layer, this is implemented via an HTTP CONNECT-based traffic tunneling protocol called HBONE which is described in a later section of this guide.

Some use cases of Istio in ambient mode may be addressed solely via the L4 secure overlay networking features, and will not need L7 features thereby not requiring deployment of a waypoint proxy. Other use cases requiring advanced traffic management and L7 networking features will require deployment of a waypoint proxy. This guide focuses on functionality related to the L4 secure overlay network using ztunnel proxies. This guide refers to L7 only when needed to describe some L4 ztunnel function. Other guides are dedicated to cover the advanced L7 networking functions and the use of waypoint proxies in detail.

Application Deployment Use CaseIstio Ambient Mesh Configuration
Zero Trust networking via mutual-TLS, encrypted and tunneled data transport of client application traffic, L4 authorization, L4 telemetryBaseline Ambient Mesh with ztunnel proxy networking
Application requires L4 Mutual-TLS plus advanced Istio traffic management features (incl VirtualService, L7 telemetry, L7 Authorization)Full Istio Ambient Mesh configuration both ztunnel proxy and waypoint proxy based networking

Current Caveats

Ambient mode is currently in the Alpha phase.

Please do not use ambient mode in production, and be sure to thoroughly review the feature phase definitions before use.

In particular, there are known performance, stability, and security issues in the alpha release. There are also planned breaking changes, including some that will prevent upgrades. These limitations will be addressed before ambient mode moves to Beta.

The following is a list of feature restrictions or caveats in ambient mode alpha. These restrictions are planned to be addressed or removed in future releases.

  1. Kubernetes only: Istio in ambient mode is currently only supported for deployment on Kubernetes clusters. Deployment on non-Kubernetes endpoints such as virtual machines is not currently supported.

  2. No Istio multi-cluster support: Only single cluster deployments are currently supported for Istio ambient mode.

  3. TCP/IPv4 only: In the current release, TCP over IPv4 is the only protocol supported for transport on an Istio secure overlay tunnel (this includes protocols such as HTTP that run between application layer endpoints on top of the TCP/ IPv4 connection).

  4. Cannot transparently convert existing Istio deployments to ambient mode: Ambient mode can only be enabled on a new Istio mesh control plane that is deployed using the ambient istioctl profile or Helm configuration. An existing Istio mesh deployed using a sidecar profile cannot currently be dynamically switched to enable ambient mode.

  5. Restrictions with Istio PeerAuthentication: as of the time of writing, the PeerAuthentication resource is not supported by all components (i.e. waypoint proxies) in Istio ambient mode. Hence it is recommended to only use the STRICT mTLS mode currently. Like many of the other alpha stage caveats, this shall be addressed as the feature moves toward beta status.

  6. istioctl CLI gaps: There may be some minor functional gaps in areas such as Istio CLI output displays when it comes to displaying or monitoring Istio’s ambient mode related information. These will be addressed as the feature matures.

Environment used for this guide

The examples in this guide used a deployment of Istio version 1.21.0 on a kind cluster of version 0.20.0 running Kubernetes version 1.27.3.

The examples below require a cluster with more than 1 worker node in order to explain how cross-node traffic operates. Refer to the installation user guide or getting started guide for information on installing Istio in ambient mode on a Kubernetes cluster.

Functional Overview

The functional behavior of the ztunnel proxy can be divided into its data plane behavior and its interaction with the Istio control plane. This section takes a brief look at these two aspects - detailed description of the internal design of the ztunnel proxy is out of scope for this guide.

Control plane overview

The figure shows an overview of the control plane related components and flows between ztunnel proxy and the istiod control plane.

Ztunnel architecture

Ztunnel architecture

The ztunnel proxy uses xDS APIs to communicate with the Istio control plane (istiod). This enables the fast, dynamic configuration updates required in modern distributed systems. The ztunnel proxy also obtains mTLS certificates for the Service Accounts of all pods that are scheduled on its Kubernetes node using xDS. A single ztunnel proxy may implement L4 data plane functionality on behalf of any pod sharing it’s node which requires efficiently obtaining relevant configuration and certificates. This multi-tenant architecture contrasts sharply with the sidecar model where each application pod has its own proxy.

It is also worth noting that in ambient mode, a simplified set of resources are used in the xDS APIs for ztunnel proxy configuration. This results in improved performance (having to transmit and process a much smaller set of information that is sent from istiod to the ztunnel proxies) and improved troubleshooting.

Data plane overview

This section briefly summarizes key aspects of the data plane functionality.

Ztunnel to ztunnel datapath

The first scenario is ztunnel to ztunnel L4 networking. This is depicted in the following figure.

Basic ztunnel L4-only datapath

Basic ztunnel L4-only datapath

The figure depicts ambient pod workloads running on two nodes W1 and W2 of a Kubernetes cluster. There is a single instance of the ztunnel proxy on each node. In this scenario, application client pods C1, C2 and C3 need to access a service provided by pod S1 and there is no requirement for advanced L7 features such as L7 traffic routing or L7 traffic management so no Waypoint proxy is needed.

The figure shows that pods C1 and C2 running on node W1 connect with pod S1 running on node W2 and their TCP traffic is tunneled through HBONE tunnel instances that have been created by the ztunnel proxy pods of each node. Mutual TLS (mTLS) is used for encryption as well as mutual authentication of traffic being tunneled. SPIFFE identities are used to identify the workloads on each side of the connection. The term HBONE (for HTTP Based Overlay Network Encapsulation) is used in Istio ambient to refer to a technique for transparently and securely tunneling TCP packets encapsulated within HTTPS packets. For more details on the datapath, including HBONE and the traffic redirection details, refer to the ztunnel traffic redirection guide.

Note: Although the figure shows the HBONE tunnels to be between the two ztunnel proxies, in the in-pod redirection implementation introduced in Istio 1.21.0 the tunnels are in fact between the source and destination pods. Traffic is HBONE encapsulated and encrypted in the network namespace of the source pod itself, and eventually decapsulated and decrypted in the network namespace of the destination pod on the destination worker node. The ztunnel proxy still logically handles both the control plane and data plane needed for HBONE transport, however it is able to do that from inside the network namespaces of the source and destination pods.

Note that the figure shows that local traffic - from pod C3 to destination pod S1 on worker node W2 - also traverses the local ztunnel proxy instance so that L4 traffic management functions such as L4 Authorization and L4 Telemetry are enforced identically on traffic, whether or not it crosses a node boundary.

Ztunnel datapath via waypoint

The next figure depicts the data path for a use case which requires advanced L7 traffic routing, management or policy handling. Here ztunnel uses HBONE tunneling to send traffic to a waypoint proxy for L7 processing. After processing, the waypoint sends traffic via a second HBONE tunnel to the ztunnel on the node hosting the selected service destination pod. In general the waypoint proxy may or may not be located on the same nodes as the source or destination pod.

Ztunnel datapath via an interim waypoint

Ztunnel datapath via an interim waypoint

Ztunnel datapath hair-pinning

As noted earlier, some ambient functions may change as the project moves to beta status and beyond. This feature (hair-pinning) is an example of a feature that is currently available in the alpha version of ambient and under review for possible modification as the project evolves.

It was noted earlier that traffic is always sent to a destination pod by first sending it to the ztunnel proxy on the same node as the destination pod. But what if the sender is either completely outside the Istio ambient mesh and hence does not initiate HBONE tunnels to the destination ztunnel first? What if the sender is malicious and trying to send traffic directly to an ambient pod destination, bypassing the destination ztunnel proxy?

There are two scenarios here, both of which are depicted in the following figure:

  1. Traffic stream B1 is being received by node W2 outside of any HBONE tunnel and directly addressed to ambient pod S1’s IP address for some reason (possibly because the traffic source is not an ambient pod). As shown in the figure, the ztunnel traffic redirection logic intercepts such traffic and redirects it via the local ztunnel proxy for destination-side proxy processing and possible filtering based on AuthorizationPolicy prior to sending it into pod S1.
  2. Traffic stream G1 is being received by the ztunnel proxy of node W2 (possibly over an HBONE tunnel). However, the ztunnel proxy checks that the destination service requires waypoint processing and yet the source sending this traffic is not a waypoint or is not associated with this destination service. In this case, the ztunnel proxy hairpins the traffic towards one of the waypoints associated with the destination service from where it can then be delivered to any pod implementing the destination service (possibly to pod S1 itself, as shown in the figure).

Ztunnel traffic hair-pinning

Ztunnel traffic hair-pinning

Note on HBONE

HBONE (HTTP Based Overlay Network Encapsulation) is an Istio-specific term. It refers to the use of standard HTTP tunneling via the HTTP CONNECT method to transparently tunnel application packets/ byte streams. In its current implementation within Istio, it transports TCP packets by tunneling these transparently using the HTTP CONNECT method, uses HTTP/2, with encryption and mutual authentication provided by mutual TLS and the HBONE tunnel itself runs on TCP port 15008. The overall HBONE packet format from IP layer onwards is depicted in the following figure.

HBONE L3 packet format

HBONE L3 packet format

Additional use cases of HBONE and HTTP tunneling (such as support for IPv6 and UDP packets) will be investigated in the future as ambient mode evolves.

Deploying an Application

Normally, a user with Istio admin privileges will deploy the Istio mesh infrastructure. Once Istio is successfully deployed in ambient mode, it will be transparently available to applications deployed by all users in namespaces that have been annotated to use Istio ambient as illustrated in the examples below.

Basic application deployment without Ambient

First, deploy a simple HTTP client server application without making it part of the Istio ambient mesh. Execute the following examples from the top of a local Istio repository or Istio folder created by downloading the istioctl client as described in Istio guides.

  1. $ kubectl create ns ambient-demo
  2. $ kubectl apply -f samples/httpbin/httpbin.yaml -n ambient-demo
  3. $ kubectl apply -f samples/sleep/sleep.yaml -n ambient-demo
  4. $ kubectl apply -f samples/sleep/notsleep.yaml -n ambient-demo
  5. $ kubectl scale deployment sleep --replicas=2 -n ambient-demo

These manifests deploy multiple replicas of the sleep and notsleep pods which will be used as clients for the httpbin service pod (for simplicity, the command-line outputs have been deleted in the code samples above).

  1. $ kubectl wait -n ambient-demo --for=condition=ready pod --selector=app=httpbin --timeout=90s
  2. pod/httpbin-648cd984f8-7vg8w condition met
  1. $ kubectl get pods -n ambient-demo
  2. NAME READY STATUS RESTARTS AGE
  3. httpbin-648cd984f8-7vg8w 1/1 Running 0 31m
  4. notsleep-bb6696574-2tbzn 1/1 Running 0 31m
  5. sleep-69cfb4968f-mhccl 1/1 Running 0 31m
  6. sleep-69cfb4968f-rhhhp 1/1 Running 0 31m
  1. $ kubectl get svc httpbin -n ambient-demo
  2. NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
  3. httpbin ClusterIP 10.110.145.219 <none> 8000/TCP 28m

Note that each application pod has just 1 container running in it (the “1/1” indicator) and that httpbin is an http service listening on ClusterIP service port 8000. You should now be able to curl this service from either client pod and confirm it returns the httpbin web page as shown below. At this point there is no TLS of any form being used.

  1. $ kubectl exec deploy/sleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
  2. <title>httpbin.org</title>

Enabling ambient for an application

You can now enable ambient for the application deployed in the prior subsection by simply adding the label istio.io/dataplane-mode=ambient to the application’s namespace as shown below. Note that this example focuses on a fresh namespace with new, sidecar-less workloads captured via ambient mode only. Later sections will describe how conflicts are resolved in hybrid scenarios that mix sidecar mode and ambient mode within the same mesh.

  1. $ kubectl label namespace ambient-demo istio.io/dataplane-mode=ambient
  2. $ kubectl get pods -n ambient-demo
  3. NAME READY STATUS RESTARTS AGE
  4. httpbin-648cd984f8-7vg8w 1/1 Running 0 78m
  5. notsleep-bb6696574-2tbzn 1/1 Running 0 77m
  6. sleep-69cfb4968f-mhccl 1/1 Running 0 78m
  7. sleep-69cfb4968f-rhhhp 1/1 Running 0 78m

Note that after ambient is enabled for the namespace, every application pod still only has 1 container, and the uptime of these pods indicates these were not restarted in order to enable ambient mode (unlike sidecar mode, which requires pods to be restarted when the sidecar proxies are injected). This results in better user experience and operational performance since ambient mode can seamlessly be enabled (or disabled) completely transparently as far as the application pods are concerned.

Initiate a curl request again from one of the client pods to the service to verify that traffic continues to flow while ambient mode.

  1. $ kubectl exec deploy/sleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
  2. <title>httpbin.org</title>

This indicates the traffic path is working. The next section looks at how to monitor the configuration and data plane of the ztunnel proxy to confirm that traffic is correctly using the ztunnel proxy.

Monitoring the ztunnel proxy & L4 networking

This section describes some options for monitoring the ztunnel proxy configuration and data path. This information can also help with some high level troubleshooting and in identifying information that would be useful to collect and provide in a bug report if there are any problems. Additional advanced monitoring of ztunnel internals and advanced troubleshooting is out of scope for this guide.

Viewing ztunnel proxy state

As indicated previously, the ztunnel proxy on each node gets configuration and discovery information from the istiod component via xDS APIs. Use the istioctl proxy-config command shown below to view discovered workloads as seen by a ztunnel proxy as well as secrets holding the TLS certificates that the ztunnel proxy has received from the istiod control plane to use in mTLS signaling on behalf of the local workloads.

In the first example, you see all the workloads and control plane components that the specific ztunnel pod is currently tracking including information about the IP address and protocol to use when connecting to that component and whether there is a Waypoint proxy associated with that workload. This example can repeated with any of the other ztunnel pods in the system to display their current configuration.

  1. $ export ZTUNNEL=$(kubectl get pods -n istio-system -o wide | grep ztunnel -m 1 | sed 's/ .*//')
  2. $ echo "$ZTUNNEL"
  1. $ istioctl proxy-config workloads "$ZTUNNEL".istio-system
  2. NAME NAMESPACE IP NODE WAYPOINT PROTOCOL
  3. coredns-6d4b75cb6d-ptbhb kube-system 10.240.0.2 amb1-control-plane None TCP
  4. coredns-6d4b75cb6d-tv5nz kube-system 10.240.0.3 amb1-control-plane None TCP
  5. httpbin-648cd984f8-2q9bn ambient-demo 10.240.1.5 amb1-worker None HBONE
  6. httpbin-648cd984f8-7dglb ambient-demo 10.240.2.3 amb1-worker2 None HBONE
  7. istiod-5c7f79574c-pqzgc istio-system 10.240.1.2 amb1-worker None TCP
  8. local-path-provisioner-9cd9bd544-x7lq2 local-path-storage 10.240.0.4 amb1-control-plane None TCP
  9. notsleep-bb6696574-r4xjl ambient-demo 10.240.2.5 amb1-worker2 None HBONE
  10. sleep-69cfb4968f-mwglt ambient-demo 10.240.1.4 amb1-worker None HBONE
  11. sleep-69cfb4968f-qjmfs ambient-demo 10.240.2.4 amb1-worker2 None HBONE
  12. ztunnel-5jfj2 istio-system 10.240.0.5 amb1-control-plane None TCP
  13. ztunnel-gkldc istio-system 10.240.1.3 amb1-worker None TCP
  14. ztunnel-xxbgj istio-system 10.240.2.2 amb1-worker2 None TCP

In the second example, you see the list of TLS certificates that this ztunnel proxy instance has received from istiod to use in TLS signaling.

  1. $ istioctl proxy-config secrets "$ZTUNNEL".istio-system
  2. NAME TYPE STATUS VALID CERT SERIAL NUMBER NOT AFTER NOT BEFORE
  3. spiffe://cluster.local/ns/ambient-demo/sa/httpbin CA Available true edf7f040f4b4d0b75a1c9a97a9b13545 2023-09-20T19:02:00Z 2023-09-19T19:00:00Z
  4. spiffe://cluster.local/ns/ambient-demo/sa/httpbin Cert Chain Available true ec30e0e1b7105e3dce4425b5255287c6 2033-09-16T18:26:19Z 2023-09-19T18:26:19Z
  5. spiffe://cluster.local/ns/ambient-demo/sa/sleep CA Available true 3b9dbea3b0b63e56786a5ea170995f48 2023-09-20T19:00:44Z 2023-09-19T18:58:44Z
  6. spiffe://cluster.local/ns/ambient-demo/sa/sleep Cert Chain Available true ec30e0e1b7105e3dce4425b5255287c6 2033-09-16T18:26:19Z 2023-09-19T18:26:19Z
  7. spiffe://cluster.local/ns/istio-system/sa/istiod CA Available true 885ee63c08ef9f1afd258973a45c8255 2023-09-20T18:26:34Z 2023-09-19T18:24:34Z
  8. spiffe://cluster.local/ns/istio-system/sa/istiod Cert Chain Available true ec30e0e1b7105e3dce4425b5255287c6 2033-09-16T18:26:19Z 2023-09-19T18:26:19Z
  9. spiffe://cluster.local/ns/istio-system/sa/ztunnel CA Available true 221b4cdc4487b60d08e94dc30a0451c6 2023-09-20T18:26:35Z 2023-09-19T18:24:35Z
  10. spiffe://cluster.local/ns/istio-system/sa/ztunnel Cert Chain Available true ec30e0e1b7105e3dce4425b5255287c6 2033-09-16T18:26:19Z 2023-09-19T18:26:19Z

Using these CLI commands, a user can check that ztunnel proxies are getting configured with all the expected workloads and TLS certificates and missing information can be used for troubleshooting to explain any potential observed networking errors. A user may also use the all option to view all parts of the proxy-config with a single CLI command and the JSON output formatter as shown in the example below to display the complete set of available state information.

  1. $ istioctl proxy-config all "$ZTUNNEL".istio-system -o json | jq

Note that when used with a ztunnel proxy instance, not all options of the istioctl proxy-config CLI are supported since some apply only to sidecar proxies.

An advanced user may also view the raw configuration dump of a ztunnel proxy via a curl to the endpoint inside a ztunnel proxy pod as shown in the following example.

  1. $ kubectl exec ds/ztunnel -n istio-system -- curl http://localhost:15000/config_dump | jq .

Viewing Istiod state for ztunnel xDS resources

Sometimes an advanced user may want to view the state of ztunnel proxy config resources as maintained in the istiod control plane, in the format of the xDS API resources defined specially for ztunnel proxies. This can be done by exec-ing into the istiod pod and obtaining this information from port 15014 for a given ztunnel proxy as shown in the example below. This output can then also be saved and viewed with a JSON pretty print formatter utility for easier browsing (not shown in the example).

  1. $ kubectl exec -n istio-system deploy/istiod -- curl localhost:15014/debug/config_dump?proxyID="$ZTUNNEL".istio-system | jq

Verifying ztunnel traffic logs

Send some traffic from a client sleep pod to the httpbin service.

  1. $ kubectl -n ambient-demo exec deploy/sleep -- sh -c "for i in $(seq 1 10); do curl -s -I http://httpbin:8000/; done"
  2. HTTP/1.1 200 OK
  3. Server: gunicorn/19.9.0
  4. --snip--

The response displayed confirms the client pod receives responses from the service. Now check logs of the ztunnel pods to confirm the traffic was sent over the HBONE tunnel.

  1. $ kubectl -n istio-system logs -l app=ztunnel | grep -E "inbound|outbound"
  2. 2023-08-14T09:15:46.542651Z INFO outbound{id=7d344076d398339f1e51a74803d6c854}: ztunnel::proxy::outbound: proxying to 10.240.2.10:80 using node local fast path
  3. 2023-08-14T09:15:46.542882Z INFO outbound{id=7d344076d398339f1e51a74803d6c854}: ztunnel::proxy::outbound: complete dur=269.272µs
  4. --snip--

These log messages confirm the traffic indeed used the ztunnel proxy in the datapath. Additional fine-grained monitoring can be done by checking logs on the specific ztunnel proxy instances that are on the same nodes as the source and destination pods of traffic. If these logs are not seen, then a possibility is that traffic redirection may not be working correctly. Detailed description of monitoring and troubleshooting of the traffic redirection logic is out of scope for this guide. Note that as mentioned previously, with ambient traffic always traverses the ztunnel pod even when the source and destination of the traffic are on the same compute node.

Monitoring and Telemetry via Prometheus, Grafana, Kiali

In addition to checking ztunnel logs and other monitoring options noted above, one can also use normal Istio monitoring and telemetry functions to monitor application traffic within an Istio Ambient mesh. The use of Istio in ambient mode does not change this behavior. Since this functionality is largely unchanged in Istio ambient mode from Istio sidecar mode, these details are not repeated in this guide. Please refer to:

One point to note is that in case of a service that is only using ztunnel and L4 networking, the Istio metrics reported will currently only be the L4 TCP metrics (namely istio_tcp_sent_bytes_total, istio_tcp_received_bytes_total, istio_tcp_connections_opened_total, istio_tcp_connections_closed_total). The full set of Istio and Envoy metrics will be reported when a Waypoint proxy is involved.

Verifying ztunnel load balancing

The ztunnel proxy automatically performs client-side load balancing if the destination is a service with multiple endpoints. No additional configuration is needed. The ztunnel load balancing algorithm is an internally fixed L4 Round Robin algorithm that distributes traffic based on L4 connection state and is not user configurable.

If the destination is a service with multiple instances or pods and there is no Waypoint associated with the destination service, then the source ztunnel proxy performs L4 load balancing directly across these instances or service backends and then sends traffic via the remote ztunnel proxies associated with those backends. If the destination service does have a Waypoint deployment (with one or more backend instances of the Waypoint proxy) associated with it, then the source ztunnel proxy performs load balancing by distributing traffic across these Waypoint proxies and sends traffic via the remote ztunnel proxies associated with the Waypoint proxy instances.

Now repeat the previous example with multiple replicas of the service pod and verify that client traffic is load balanced across the service replicas. Wait for all pods in the ambient-demo namespace to go into Running state before continuing to the next step.

  1. $ kubectl -n ambient-demo scale deployment httpbin --replicas=2 ; kubectl wait --for condition=available deployment/httpbin -n ambient-demo
  2. deployment.apps/httpbin scaled
  3. deployment.apps/httpbin condition met
  1. $ kubectl -n ambient-demo exec deploy/sleep -- sh -c "for i in $(seq 1 10); do curl -s -I http://httpbin:8000/; done"
  1. $ kubectl -n istio-system logs -l app=ztunnel | grep -E "inbound|outbound"
  2. --snip--
  3. 2023-08-14T09:33:24.969996Z INFO inbound{id=ec177a563e4899869359422b5cdd1df4 peer_ip=10.240.2.16 peer_id=spiffe://cluster.local/ns/ambient-demo/sa/sleep}: ztunnel::proxy::inbound: got CONNECT request to 10.240.1.11:80
  4. 2023-08-14T09:33:25.028601Z INFO inbound{id=1ebc3c7384ee68942bbb7c7ed866b3d9 peer_ip=10.240.2.16 peer_id=spiffe://cluster.local/ns/ambient-demo/sa/sleep}: ztunnel::proxy::inbound: got CONNECT request to 10.240.1.11:80
  5. --snip--
  6. 2023-08-14T09:33:25.226403Z INFO outbound{id=9d99723a61c9496532d34acec5c77126}: ztunnel::proxy::outbound: proxy to 10.240.1.11:80 using HBONE via 10.240.1.11:15008 type Direct
  7. 2023-08-14T09:33:25.273268Z INFO outbound{id=9d99723a61c9496532d34acec5c77126}: ztunnel::proxy::outbound: complete dur=46.9099ms
  8. 2023-08-14T09:33:25.276519Z INFO outbound{id=cc87b4de5ec2ccced642e22422ca6207}: ztunnel::proxy::outbound: proxying to 10.240.2.10:80 using node local fast path
  9. 2023-08-14T09:33:25.276716Z INFO outbound{id=cc87b4de5ec2ccced642e22422ca6207}: ztunnel::proxy::outbound: complete dur=231.892µs
  10. --snip--

Here note the logs from the ztunnel proxies first indicating the http CONNECT request to the new destination pod (10.240.1.11) which indicates the setup of the HBONE tunnel to ztunnel on the node hosting the additional destination service pod. This is then followed by logs indicating the client traffic being sent to both 10.240.1.11 and 10.240.2.10 which are the two destination pods providing the service. Also note that the data path is performing client-side load balancing in this case and not depending on Kubernetes service load balancing. In your setup these numbers will be different and will match the pod addresses of the httpbin pods in your cluster.

This is a round robin load balancing algorithm and is separate from and independent of any load balancing algorithm that may be configured within a VirtualService’s TrafficPolicy field, since as discussed previously, all aspects of VirtualService API objects are instantiated on the Waypoint proxies and not the ztunnel proxies.

Pod selection logic for ambient and sidecar modes

Istio with sidecar proxies can co-exist with ambient based node level proxies within the same compute cluster. It is important to ensure that the same pod or namespace does not get configured to use both a sidecar proxy and an ambient node-level proxy. However, if this does occur, currently sidecar injection takes precedence for such a pod or namespace.

Note that two pods within the same namespace could in theory be set to use different modes by labeling individual pods separately from the namespace label, however this is not recommended. For most common use cases it is recommended that a single mode be used for all pods within a single namespace.

The exact logic to determine whether a pod is set up to use ambient mode is as follows.

  1. The istio-cni plugin configuration exclude list configured in cni.values.excludeNamespaces is used to skip namespaces in the exclude list.

  2. ambient mode is used for a pod if

    • The namespace has label istio.io/dataplane-mode=ambient
    • The annotation sidecar.istio.io/status is not present on the pod
    • ambient.istio.io/redirection is not disabled

The simplest option to avoid a configuration conflict is for a user to ensure that for each namespace, it either has the label for sidecar injection (istio-injection=enabled) or for ambient data plane mode (istio.io/dataplane-mode=ambient) but never both.

L4 Authorization Policy

As mentioned previously, the ztunnel proxy performs Authorization policy enforcement when it requires only L4 traffic processing in order to enforce the policy in the data plane and there are no Waypoints involved. The actual enforcement point is at the receiving (or server side) ztunnel proxy in the path of a connection.

Apply a basic L4 Authorization policy for the already deployed httpbin application as shown in the example below.

  1. $ kubectl apply -n ambient-demo -f - <<EOF
  2. apiVersion: security.istio.io/v1beta1
  3. kind: AuthorizationPolicy
  4. metadata:
  5. name: allow-sleep-to-httpbin
  6. spec:
  7. selector:
  8. matchLabels:
  9. app: httpbin
  10. action: ALLOW
  11. rules:
  12. - from:
  13. - source:
  14. principals:
  15. - cluster.local/ns/ambient-demo/sa/sleep
  16. EOF

The behavior of the AuthorizationPolicy API has the same functional behavior in Istio ambient mode as in sidecar mode. When there is no AuthorizationPolicy provisioned, then the default action is ALLOW. Once the policy above is provisioned, pods matching the selector in the policy (i.e. app:httpbin) only allow traffic explicitly whitelisted which in this case is sources with principal (i.e. identity) of cluster.local/ns/ambient-demo/sa/sleep. Now as shown below, if you try the curl operation to the httpbin service from the sleep pods, it still works but the same operation is blocked when initiated from the notsleep pods.

Note that this policy performs an explicit ALLOW action on traffic from sources with principal (i.e. identity) of cluster.local/ns/ambient-demo/sa/sleep and hence traffic from all other sources will be denied.

  1. $ kubectl exec deploy/sleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
  2. <title>httpbin.org</title>
  1. $ kubectl exec deploy/notsleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
  2. command terminated with exit code 56

Note that there are no waypoint proxies deployed and yet this AuthorizationPolicy is getting enforced and this is because this policy only requires L4 traffic processing which can be performed by ztunnel proxies. These policy actions can be further confirmed by checking ztunnel logs and looking for logs that indicate RBAC actions as shown in the following example.

  1. $ kubectl logs ds/ztunnel -n istio-system | grep -E RBAC
  2. -- snip --
  3. 2023-10-10T23:14:00.534962Z INFO inbound{id=cc493da5e89877489a786fd3886bd2cf peer_ip=10.240.2.2 peer_id=spiffe://cluster.local/ns/ambient-demo/sa/notsleep}: ztunnel::proxy::inbound: RBAC rejected conn=10.240.2.2(spiffe://cluster.local/ns/ambient-demo/sa/notsleep)->10.240.1.2:80
  4. 2023-10-10T23:15:33.339867Z INFO inbound{id=4c4de8de802befa5da58a165a25ff88a peer_ip=10.240.2.2 peer_id=spiffe://cluster.local/ns/ambient-demo/sa/notsleep}: ztunnel::proxy::inbound: RBAC rejected conn=10.240.2.2(spiffe://cluster.local/ns/ambient-demo/sa/notsleep)->10.240.1.2:80

If an AuthorizationPolicy has been configured that requires any traffic processing beyond L4, and if no waypoint proxies are configured for the destination of the traffic, then ztunnel proxy will simply drop all traffic as a defensive move. Hence, check to ensure that either all rules involve L4 processing only or else if non-L4 rules are unavoidable, then waypoint proxies are also configured to handle policy enforcement.

As an example, modify the AuthorizationPolicy to include a check for the HTTP GET method as shown below. Now notice that both sleep and notsleep pods are blocked from sending traffic to the destination httpbin service.

  1. $ kubectl apply -n ambient-demo -f - <<EOF
  2. apiVersion: security.istio.io/v1beta1
  3. kind: AuthorizationPolicy
  4. metadata:
  5. name: allow-sleep-to-httpbin
  6. spec:
  7. selector:
  8. matchLabels:
  9. app: httpbin
  10. action: ALLOW
  11. rules:
  12. - from:
  13. - source:
  14. principals:
  15. - cluster.local/ns/ambient-demo/sa/sleep
  16. to:
  17. - operation:
  18. methods: ["GET"]
  19. EOF
  1. $ kubectl exec deploy/sleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
  2. command terminated with exit code 56
  1. $ kubectl exec deploy/notsleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
  2. command terminated with exit code 56

You can also confirm by viewing logs of specific ztunnel proxy pods (not shown in the example here) that it is always the ztunnel proxy on the node hosting the destination pod that actually enforces the policy.

Go ahead and delete this AuthorizationPolicy before continuing with the rest of the examples in the guide.

  1. $ kubectl delete AuthorizationPolicy allow-sleep-to-httpbin -n ambient-demo

Ambient Interoperability with non-ambient endpoints

In the use cases so far, the traffic source and destination pods are both ambient pods. This section covers some mixed use cases where ambient endpoints need to communicate with non-ambient endpoints. As with prior examples in this guide, this section covers use cases that do not require waypoint proxies.

  1. East-West non-mesh pod to ambient mesh pod (and use of PeerAuthentication resource)
  2. East-West Istio sidecar proxy pod to ambient mesh pod
  3. North-South Ingress Gateway to ambient backend pods

East-West non-mesh pod to ambient mesh pod (and use of PeerAuthentication resource)

In the example below, the same httpbin service which has already been set up in the prior examples is accessed via client sleep pods that are running in a separate namespace that is not part of the Istio mesh. This example shows that East-west traffic between ambient mesh pods and non mesh pods is seamlessly supported. Note that as described previously, this use case leverages the traffic hair-pinning capability of ambient. Since the non-mesh pods initiate traffic directly to the backend pods without going through HBONE or ztunnel, at the destination node, traffic is redirected via the ztunnel proxy at the destination node to ensure that ambient authorization policy is applied (this can be verified by viewing logs of the appropriate ztunnel proxy pod on the destination node; the logs are not shown in the example snippet below for simplicity).

  1. $ kubectl create namespace client-a
  2. $ kubectl apply -f samples/sleep/sleep.yaml -n client-a
  3. $ kubectl wait --for condition=available deployment/sleep -n client-a

Wait for the pods to get to Running state in the client-a namespace before continuing.

  1. $ kubectl exec deploy/sleep -n client-a -- curl httpbin.ambient-demo.svc.cluster.local:8000 -s | grep title -m 1
  2. <title>httpbin.org</title>

As shown in the example below, now add a PeerAuthentication resource with mTLS mode set to STRICT, in the ambient namespace and confirm that the same client’s traffic is now rejected with an error indicating the request was rejected. This is because the client is using simple HTTP to connect to the server instead of an HBONE tunnel with mTLS. This is a possible method that can be used to prevent non-Istio sources from sending traffic to Istio ambient pods.

  1. $ kubectl apply -n ambient-demo -f - <<EOF
  2. apiVersion: security.istio.io/v1beta1
  3. kind: PeerAuthentication
  4. metadata:
  5. name: peerauth
  6. spec:
  7. mtls:
  8. mode: STRICT
  9. EOF
  1. $ kubectl exec deploy/sleep -n client-a -- curl httpbin.ambient-demo.svc.cluster.local:8000 -s | grep title -m 1
  2. command terminated with exit code 56

Change the mTLS mode to PERMISSIVE and confirm that the ambient pods can once again accept non-mTLS connections including from non-mesh pods in this case.

  1. $ kubectl apply -n ambient-demo -f - <<EOF
  2. apiVersion: security.istio.io/v1beta1
  3. kind: PeerAuthentication
  4. metadata:
  5. name: peerauth
  6. spec:
  7. mtls:
  8. mode: PERMISSIVE
  9. EOF
  1. $ kubectl exec deploy/sleep -n client-a -- curl httpbin.ambient-demo.svc.cluster.local:8000 -s | grep title -m 1
  2. <title>httpbin.org</title>

East-West Istio sidecar proxy pod to ambient mesh pod

This use case is that of seamless East-West traffic interoperability between an Istio pod using a sidecar proxy and an ambient pod within the same mesh.

The same httpbin service from the previous example is used but now add a client to access this service from another namespace which is labeled for sidecar injection. This also works automatically and transparently as shown in the example below. In this case the sidecar proxy running with the client automatically knows to use the HBONE control plane since the destination has been discovered to be an HBONE destination. The user does not need to do any special configuration to enable this.

For sidecar proxies to use the HBONE/mTLS signaling option when communicating with ambient destinations, they need to be configured with ISTIO_META_ENABLE_HBONE set to true in the proxy metadata. This is automatically set for the user as default in the MeshConfig when using the ambient profile, hence the user does not need to do anything additional when using this profile.

  1. $ kubectl create ns client-b
  2. $ kubectl label namespace client-b istio-injection=enabled
  3. $ kubectl apply -f samples/sleep/sleep.yaml -n client-b
  4. $ kubectl wait --for condition=available deployment/sleep -n client-b

Wait for the pods to get to Running state in the client-b namespace before continuing.

  1. $ kubectl exec deploy/sleep -n client-b -- curl httpbin.ambient-demo.svc.cluster.local:8000 -s | grep title -m 1
  2. <title>httpbin.org</title>

Again, it can further be verified from viewing the logs of the ztunnel pod (not shown in the example) at the destination node that traffic does in fact use the HBONE and CONNECT based path from the sidecar proxy based source client pod to the ambient based destination service pod. Additionally, although not shown, it can also be verified that unlike the previous subsection, in this case even if you apply a PeerAuthentication resource to the namespace tagged for ambient mode, communication continues between client and service pods since both use the HBONE control and data planes relying on mTLS.

North-South Ingress Gateway to ambient backend pods

This section describes a use case for North-South traffic with an Istio Gateway exposing the httpbin service via the Kubernetes Gateway API. The gateway itself is running in a non-Ambient namespace and may be an existing gateway that is also exposing other services that are provided by non-ambient pods. Hence, this example shows that ambient workloads can also interoperate with Istio gateways that need not themselves be running in namespaces tagged for ambient mode of operation.

For this example, you can use metallb to provide a load balancer service on an IP addresses that is reachable from outside the cluster. The same example also works with other forms of North-South load balancing options. The example below assumes that you have already installed metallb in this cluster to provide the load balancer service including a pool of IP addresses for metallb to use for exposing services externally. Refer to the metallb guide for kind for instructions on setting up metallb on kind clusters or refer to the instructions from the metallb documentation appropriate for your environment.

This example uses the Kubernetes Gateway API for configuring the N-S gateway. Since this API is not currently provided as default in Kubernetes and kind distributions, you have to install the API CRDs first as shown in the example.

An instance of Gateway using the Kubernetes Gateway API CRDs will then be deployed to leverage this metallb load balancer service. The instance of Gateway runs in the istio-system namespace in this example to represent an existing Gateway running in a non-ambient namespace. Finally, an HTTPRoute will be provisioned with a backend reference pointing to the existing httpbin service that is running on an ambient pod in the ambient-demo namespace.

  1. $ kubectl get crd gateways.gateway.networking.k8s.io &> /dev/null || \
  2. { kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd/experimental?ref=v0.6.1" | kubectl apply -f -; }

Istio includes beta support for the Kubernetes Gateway API and intends to make it the default API for traffic management in the future. The following instructions allow you to choose to use either the Gateway API or the Istio configuration API when configuring traffic management in the mesh. Follow instructions under either the Gateway API or Istio APIs tab, according to your preference.

  1. $ kubectl apply -f - << EOF
  2. apiVersion: gateway.networking.k8s.io/v1beta1
  3. kind: Gateway
  4. metadata:
  5. name: httpbin-gateway
  6. namespace: istio-system
  7. spec:
  8. gatewayClassName: istio
  9. listeners:
  10. - name: http
  11. port: 80
  12. protocol: HTTP
  13. allowedRoutes:
  14. namespaces:
  15. from: All
  16. EOF
  1. $ kubectl apply -n ambient-demo -f - << EOF
  2. apiVersion: gateway.networking.k8s.io/v1beta1
  3. kind: HTTPRoute
  4. metadata:
  5. name: httpbin
  6. spec:
  7. parentRefs:
  8. - name: httpbin-gateway
  9. namespace: istio-system
  10. rules:
  11. - backendRefs:
  12. - name: httpbin
  13. port: 8000
  14. EOF

Next find the external service IP address on which the Gateway is listening and then access the httpbin service on this IP address (172.18.255.200 in the example below) from outside the cluster as shown below.

  1. $ kubectl get service httpbin-gateway-istio -n istio-system
  2. NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
  3. httpbin-gateway-istio LoadBalancer 10.110.30.25 172.18.255.200 15021:32272/TCP,80:30159/TCP 121m
  1. $ export INGRESS_HOST=$(kubectl -n istio-system get service httpbin-gateway-istio -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  2. $ echo "$INGRESS_HOST"
  3. 172.18.255.200
  1. $ curl "$INGRESS_HOST" -s | grep title -m 1
  2. <title>httpbin.org</title>

These examples illustrate multiple options for interoperability between ambient pods and non-ambient endpoints (which can be either Kubernetes application pods or Istio gateway pods with both Istio native gateways and Kubernetes Gateway API instances). Interoperability is also supported between Istio ambient pods and Istio Egress Gateways as well as scenarios where the ambient pods run the client-side of an application with the service side running outside of the mesh of on a mesh pod that uses the sidecar proxy mode. Hence, users have multiple options for seamlessly integrating ambient and non-ambient workloads within the same Istio mesh, allowing for phased introduction of ambient capability as best suits the needs of Istio mesh deployments and operations.