Installation on Azure AKS

Installation on Azure AKS

This guide covers installing Cilium into an Azure AKS environment using Azure IPAM.

Create an Azure Kubernetes cluster

Setup a Kubernetes cluster on Azure. You can use any method available as long as your Kubernetes cluster has CNI enabled in the kubelet configuration. For simplicity of this guide, we will set up a managed AKS cluster:

Prerequisites

Ensure that you have the Azure Cloud CLI installed.

To verify, confirm that the following command displays the set of available Kubernetes versions.

az aks get-versions -l westus -o table

Deploy the Cluster

Note

Do NOT specify the ‘–network-policy’ flag when creating the cluster, as this will cause the Azure CNI plugin to push down unwanted iptables rules.

export RESOURCE_GROUP_NAME=aks-test
export CLUSTER_NAME=aks-test
export LOCATION=westus
az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
az aks create \
    --resource-group $RESOURCE_GROUP_NAME \
    --name $CLUSTER_NAME \
    --location $LOCATION \
    --node-count 2 \
    --network-plugin azure

Note

When setting up AKS, it is important to use the flag --network-plugin azure to ensure that CNI mode is enabled.

Create a service principal for cilium-operator

In order to allow cilium-operator to interact with the Azure API, a service principal is required. You can reuse an existing service principal if you want but it is recommended to create a dedicated service principal for cilium-operator:

az ad sp create-for-rbac --name cilium-operator > azure-sp.json

The contents of azure-sp.json should look like this:

{
  "appId": "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa",
  "displayName": "cilium-operator",
  "name": "http://cilium-operator",
  "password": "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb",
  "tenant": "cccccccc-cccc-cccc-cccc-cccccccccccc"
}

Extract the relevant credentials to access the Azure API:

AZURE_SUBSCRIPTION_ID="$(az account show | jq -r .id)"
AZURE_CLIENT_ID="$(jq -r .appId < azure-sp.json)"
AZURE_CLIENT_SECRET="$(jq -r .password < azure-sp.json)"
AZURE_TENANT_ID="$(jq -r .tenant < azure-sp.json)"
AZURE_NODE_RESOURCE_GROUP="$(az aks show --resource-group $RESOURCE_GROUP_NAME --name $CLUSTER_NAME | jq -r .nodeResourceGroup)"

Note

AZURE_NODE_RESOURCE_GROUP must be set to the resource group of the node pool, not the resource group of the AKS cluster.

Retrieve Credentials to access cluster

az aks get-credentials --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP_NAME

Deploy Cilium

Note

First, make sure you have Helm 3 installed. Helm 2 is no longer supported.

Setup Helm repository:

helm repo add cilium https://helm.cilium.io/

Deploy Cilium release via Helm:

helm install cilium cilium/cilium --version 1.9.8 \
  --namespace kube-system \
  --set azure.enabled=true \
  --set azure.resourceGroup=$AZURE_NODE_RESOURCE_GROUP \
  --set azure.subscriptionID=$AZURE_SUBSCRIPTION_ID \
  --set azure.tenantID=$AZURE_TENANT_ID \
  --set azure.clientID=$AZURE_CLIENT_ID \
  --set azure.clientSecret=$AZURE_CLIENT_SECRET \
  --set tunnel=disabled \
  --set ipam.mode=azure \
  --set masquerade=false \
  --set nodeinit.enabled=true

Restart unmanaged Pods

If you did not use the nodeinit.restartPods=true in the Helm options when deploying Cilium, then unmanaged pods need to be restarted manually. Restart all already running pods which are not running in host-networking mode to ensure that Cilium starts managing them. This is required to ensure that all pods which have been running before Cilium was deployed have network connectivity provided by Cilium and NetworkPolicy applies to them:

kubectl get pods --all-namespaces -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork --no-headers=true | grep '<none>' | awk '{print "-n "$1" "$2}' | xargs -L 1 -r kubectl delete pod
pod "event-exporter-v0.2.3-f9c896d75-cbvcz" deleted
pod "fluentd-gcp-scaler-69d79984cb-nfwwk" deleted
pod "heapster-v1.6.0-beta.1-56d5d5d87f-qw8pv" deleted
pod "kube-dns-5f8689dbc9-2nzft" deleted
pod "kube-dns-5f8689dbc9-j7x5f" deleted
pod "kube-dns-autoscaler-76fcd5f658-22r72" deleted
pod "kube-state-metrics-7d9774bbd5-n6m5k" deleted
pod "l7-default-backend-6f8697844f-d2rq2" deleted
pod "metrics-server-v0.3.1-54699c9cc8-7l5w2" deleted

Note

This may error out on macOS due to -r being unsupported by xargs. In this case you can safely run this command without -r with the symptom that this will hang if there are no pods to restart. You can stop this with ctrl-c.

Validate the Installation

You can monitor as Cilium and all required components are being installed:

kubectl -n kube-system get pods --watch
NAME                                    READY   STATUS              RESTARTS   AGE
cilium-operator-cb4578bc5-q52qk         0/1     Pending             0          8s
cilium-s8w5m                            0/1     PodInitializing     0          7s
coredns-86c58d9df4-4g7dd                0/1     ContainerCreating   0          8m57s
coredns-86c58d9df4-4l6b2                0/1     ContainerCreating   0          8m57s

It may take a couple of minutes for all components to come up:

cilium-operator-cb4578bc5-q52qk         1/1     Running   0          4m13s
cilium-s8w5m                            1/1     Running   0          4m12s
coredns-86c58d9df4-4g7dd                1/1     Running   0          13m
coredns-86c58d9df4-4l6b2                1/1     Running   0          13m

Deploy the connectivity test

You can deploy the “connectivity-check” to test connectivity between pods. It is recommended to create a separate namespace for this.

kubectl create ns cilium-test

Deploy the check with:

kubectl apply -n cilium-test -f https://raw.githubusercontent.com/cilium/cilium/v1.9/examples/kubernetes/connectivity-check/connectivity-check.yaml

It will deploy a series of deployments which will use various connectivity paths to connect to each other. Connectivity paths include with and without service load-balancing and various network policy combinations. The pod name indicates the connectivity variant and the readiness and liveness gate indicates success or failure of the test:

$ kubectl get pods -n cilium-test
NAME                                                     READY   STATUS    RESTARTS   AGE
echo-a-76c5d9bd76-q8d99                                  1/1     Running   0          66s
echo-b-795c4b4f76-9wrrx                                  1/1     Running   0          66s
echo-b-host-6b7fc94b7c-xtsff                             1/1     Running   0          66s
host-to-b-multi-node-clusterip-85476cd779-bpg4b          1/1     Running   0          66s
host-to-b-multi-node-headless-dc6c44cb5-8jdz8            1/1     Running   0          65s
pod-to-a-79546bc469-rl2qq                                1/1     Running   0          66s
pod-to-a-allowed-cnp-58b7f7fb8f-lkq7p                    1/1     Running   0          66s
pod-to-a-denied-cnp-6967cb6f7f-7h9fn                     1/1     Running   0          66s
pod-to-b-intra-node-nodeport-9b487cf89-6ptrt             1/1     Running   0          65s
pod-to-b-multi-node-clusterip-7db5dfdcf7-jkjpw           1/1     Running   0          66s
pod-to-b-multi-node-headless-7d44b85d69-mtscc            1/1     Running   0          66s
pod-to-b-multi-node-nodeport-7ffc76db7c-rrw82            1/1     Running   0          65s
pod-to-external-1111-d56f47579-d79dz                     1/1     Running   0          66s
pod-to-external-fqdn-allow-google-cnp-78986f4bcf-btjn7   1/1     Running   0          66s

Note

If you deploy the connectivity check to a single node cluster, pods that check multi-node functionalities will remain in the Pending state. This is expected since these pods need at least 2 nodes to be scheduled successfully.

Specify Environment Variables

Specify the namespace in which Cilium is installed as CILIUM_NAMESPACE environment variable. Subsequent commands reference this environment variable.

export CILIUM_NAMESPACE=kube-system

Enable Hubble for Cluster-Wide Visibility

Hubble is the component for observability in Cilium. To obtain cluster-wide visibility into your network traffic, deploy Hubble Relay and the UI as follows on your existing installation:

Installation via Helm

Installation via quick-hubble-install.yaml

If you installed Cilium via helm install, you may enable Hubble Relay and UI with the following command:

helm upgrade cilium cilium/cilium --version 1.9.8 \
   --namespace $CILIUM_NAMESPACE \
   --reuse-values \
   --set hubble.listenAddress=":4244" \
   --set hubble.relay.enabled=true \
   --set hubble.ui.enabled=true

On Cilium 1.9.1 and older, the Cilium agent pods will be restarted in the process.

If you installed Cilium 1.9.2 or newer via the provided quick-install.yaml, you may deploy Hubble Relay and UI on top of your existing installation with the following command:

kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.9/install/kubernetes/quick-hubble-install.yaml

Installation via quick-hubble-install.yaml only works if the installed Cilium version is 1.9.2 or newer. Users of Cilium 1.9.0 or 1.9.1 are encouraged to upgrade to a newer version by applying the most recent Cilium quick-install.yaml first.

Alternatively, it is possible to manually generate a YAML manifest for the Cilium DaemonSet and Hubble Relay/UI as follows. The generated YAML can be applied on top of an existing installation:

# Set this to your installed Cilium version
export CILIUM_VERSION=1.9.1
# Please set any custom Helm values you may need for Cilium,
# such as for example `--set operator.replicas=1` on single-cluster nodes.
helm template cilium cilium/cilium --version $CILIUM_VERSION \\
   --namespace $CILIUM_NAMESPACE \\
   --set hubble.tls.auto.method="cronJob" \\
   --set hubble.listenAddress=":4244" \\
   --set hubble.relay.enabled=true \\
   --set hubble.ui.enabled=true > cilium-with-hubble.yaml
# This will modify your existing Cilium DaemonSet and ConfigMap
kubectl apply -f cilium-with-hubble.yaml

The Cilium agent pods will be restarted in the process.

Once the Hubble UI pod is started, use port forwarding for the hubble-ui service. This allows opening the UI locally on a browser:

kubectl port-forward -n $CILIUM_NAMESPACE svc/hubble-ui --address 0.0.0.0 --address :: 12000:80

And then open http://localhost:12000/ to access the UI.

Hubble UI is not the only way to get access to Hubble data. A command line tool, the Hubble CLI, is also available. It can be installed by following the instructions below:

Linux

MacOS

Windows

Download the latest hubble release:

export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
curl -LO "https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-amd64.tar.gz"
curl -LO "https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-amd64.tar.gz.sha256sum"
sha256sum --check hubble-linux-amd64.tar.gz.sha256sum
tar zxf hubble-linux-amd64.tar.gz

and move the hubble CLI to a directory listed in the $PATH environment variable. For example:

sudo mv hubble /usr/local/bin

Download the latest hubble release:

export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
curl -LO "https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-darwin-amd64.tar.gz"
curl -LO "https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-darwin-amd64.tar.gz.sha256sum"
shasum -a 256 -c hubble-darwin-amd64.tar.gz.sha256sum
tar zxf hubble-darwin-amd64.tar.gz

and move the hubble CLI to a directory listed in the $PATH environment variable. For example:

sudo mv hubble /usr/local/bin

Download the latest hubble release:

curl -LO "https://raw.githubusercontent.com/cilium/hubble/master/stable.txt"
set /p HUBBLE_VERSION=<stable.txt
curl -LO "https://github.com/cilium/hubble/releases/download/%HUBBLE_VERSION%/hubble-windows-amd64.tar.gz"
curl -LO "https://github.com/cilium/hubble/releases/download/%HUBBLE_VERSION%/hubble-windows-amd64.tar.gz.sha256sum"
certutil -hashfile hubble-windows-amd64.tar.gz SHA256
type hubble-windows-amd64.tar.gz.sha256sum
:: verify that the checksum from the two commands above match
tar zxf hubble-windows-amd64.tar.gz

and move the hubble.exe CLI to a directory listed in the %PATH% environment variable after extracting it from the tarball.

Similarly to the UI, use port forwarding for the hubble-relay service to make it available locally:

kubectl port-forward -n $CILIUM_NAMESPACE svc/hubble-relay --address 0.0.0.0 --address :: 4245:80

In a separate terminal window, run the hubble status command specifying the Hubble Relay address:

$ hubble --server localhost:4245 status
Healthcheck (via localhost:4245): Ok
Current/Max Flows: 5455/16384 (33.29%)
Flows/s: 11.30
Connected Nodes: 4/4

If Hubble Relay reports that all nodes are connected, as in the example output above, you can now use the CLI to observe flows of the entire cluster:

hubble --server localhost:4245 observe

If you encounter any problem at this point, you may seek help on Slack.

Tip

Hubble CLI configuration can be persisted using a configuration file or environment variables. This avoids having to specify options specific to a particular environment every time a command is run. Run hubble help config for more information.

For more information about Hubble and its components, see the Observability section.

Limitations

All VMs and VM scale sets used in a cluster must belong to the same resource group.

Troubleshooting

If kubectl exec to a pod fails to connect, restarting the tunnelfront pod may help.
Pods may fail to gain a .spec.hostNetwork status even if restarted and managed by Cilium.
If some connectivity tests fail to reach the ready state you may need to restart the unmanaged pods again.
Some connectivity tests may fail. This is being tracked in Cilium GitHub issue #12113.
- hubble observe may report one or more nodes being unavailable and hubble-ui may fail to connect to the backends.