Installation using Helm

This guide will show you how to install Cilium using Helm. This involves a couple of additional steps compared to the Quick Installation and requires you to manually select the best datapath and IPAM mode for your particular environment.

Install Cilium

Note

Make sure you have Helm 3 installed. Helm 2 is no longer supported.

Set up the Helm repository:

  helm repo add cilium https://helm.cilium.io/
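Since Helm 2 is not supported, a small guard can catch an outdated client before anything is installed. The sketch below assumes that `helm version --short` prints a string such as `v3.9.0+g7ceeda6` on Helm 3; the function names are hypothetical and not part of Helm or Cilium.

```shell
# Print the major version from a Helm version string such as "v3.9.0+g7ceeda6".
helm_major_version() {
  version=${1#v}                     # drop the leading "v", if any
  printf '%s\n' "${version%%.*}"     # keep everything before the first dot
}

# Succeed only if the version string reports a major version of at least 3.
is_supported_helm() {
  major=$(helm_major_version "$1")
  case $major in
    ''|*[!0-9]*) return 1 ;;         # not a plain number: treat as unsupported
  esac
  [ "$major" -ge 3 ]
}

# Usage (requires helm on PATH):
#   is_supported_helm "$(helm version --short)" || echo "Helm 3 is required" >&2
```

Output that does not start with a plain version number is treated as unsupported, which errs on the side of stopping early.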

Generic | GKE | AKS (BYOCNI) | AKS (Azure IPAM) | EKS | OpenShift | RKE | k3s

These are the generic instructions for installing Cilium into any Kubernetes cluster using the default configuration options listed below. See the platform-specific instructions that follow for distribution-specific steps; they also list the ideal default configuration for each platform.

Default Configuration:

Datapath         IPAM            Datastore
Encapsulation    Cluster Pool    Kubernetes CRD

Requirements:

Tip

See System Requirements for more details on the system requirements.

Install Cilium:

Deploy Cilium release via Helm:

  helm install cilium cilium/cilium --version 1.11.7 \
    --namespace kube-system

To install Cilium on Google Kubernetes Engine (GKE), perform the following steps:

Default Configuration:

Datapath          IPAM                  Datastore
Direct Routing    Kubernetes PodCIDR    Kubernetes CRD

Requirements:

  • The cluster should be created with the taint node.cilium.io/agent-not-ready=true:NoExecute using the --node-taints option. Other options exist, so make sure to read and understand the documentation page on taint effects and unmanaged pods.

Install Cilium:

Extract the Cluster CIDR to enable native-routing:

  NATIVE_CIDR="$(gcloud container clusters describe "${NAME}" --zone "${ZONE}" --format 'value(clusterIpv4Cidr)')"
  echo "$NATIVE_CIDR"
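If the gcloud lookup fails, NATIVE_CIDR ends up empty and the later Helm flag would be set to nothing. A minimal sanity check, assuming an IPv4 cluster CIDR such as 10.8.0.0/14, could look like this; the function name is hypothetical.

```shell
# Succeed if the argument looks like an IPv4 CIDR, e.g. "10.8.0.0/14".
is_ipv4_cidr() {
  # Shape check: four dot-separated numbers, a slash, and a prefix length.
  printf '%s\n' "$1" \
    | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}/[0-9]{1,2}$' || return 1
  # Range check: each octet at most 255, prefix length at most 32.
  printf '%s\n' "$1" | awk -F'[./]' '{
    for (i = 1; i <= 4; i++) if ($i > 255) exit 1
    if ($5 > 32) exit 1
  }'
}

# Usage:
#   is_ipv4_cidr "$NATIVE_CIDR" || echo "unexpected cluster CIDR: '$NATIVE_CIDR'" >&2
```

This only checks the shape of the value; it does not confirm that the CIDR belongs to the cluster.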

Deploy Cilium release via Helm:

  helm install cilium cilium/cilium --version 1.11.7 \
    --namespace kube-system \
    --set nodeinit.enabled=true \
    --set nodeinit.reconfigureKubelet=true \
    --set nodeinit.removeCbrBridge=true \
    --set cni.binPath=/home/kubernetes/bin \
    --set gke.enabled=true \
    --set ipam.mode=kubernetes \
    --set ipv4NativeRoutingCIDR=$NATIVE_CIDR

The NodeInit DaemonSet is required to prepare the GKE nodes as nodes are added to the cluster. The NodeInit DaemonSet will perform the following actions:

  • Reconfigure kubelet to run in CNI mode
  • Mount the eBPF filesystem

To install Cilium on Azure Kubernetes Service (AKS) in Bring your own CNI mode, perform the following steps:

Default Configuration:

Datapath         IPAM            Datastore
Encapsulation    Cluster Pool    Kubernetes CRD

Note

BYOCNI is the preferred way to run Cilium on AKS; however, integration with the Azure stack via Azure IPAM is not available in this mode. If you require Azure IPAM, refer to the AKS (Azure IPAM) installation.

Requirements:

  • The AKS cluster must be created with --network-plugin none (BYOCNI). See the Bring your own CNI documentation for more details about BYOCNI prerequisites and implications.

Install Cilium:

Deploy Cilium release via Helm:

  helm install cilium cilium/cilium --version 1.11.7 \
    --namespace kube-system \
    --set aksbyocni.enabled=true \
    --set nodeinit.enabled=true

To install Cilium on Azure Kubernetes Service (AKS) with Azure integration via Azure IPAM, perform the following steps:

Default Configuration:

Datapath          IPAM          Datastore
Direct Routing    Azure IPAM    Kubernetes CRD

Note

Azure IPAM offers integration with the Azure stack but is not the preferred way to run Cilium on AKS. If you do not require Azure IPAM, we recommend switching to the AKS (BYOCNI) installation.

Tip

If you want to chain Cilium on top of the Azure CNI, refer to the guide Azure CNI.

Requirements:

  • The AKS cluster must be created with --network-plugin azure for compatibility with Cilium. The Azure network plugin will be replaced with Cilium by the installer.

Limitations:

  • All VMs and VM scale sets used in a cluster must belong to the same resource group.

  • Adding new nodes to node pools might result in application pods being scheduled on the new nodes before Cilium is ready to properly manage them. The only way to fix this is either by making sure application pods are not scheduled on new nodes before Cilium is ready, or by restarting any unmanaged pods on the nodes once Cilium is ready.

    Ideally, node pools would be tainted with node.cilium.io/agent-not-ready=true:NoExecute to ensure application pods are only scheduled/executed once Cilium is ready to manage them (see Considerations on node pool taints and unmanaged pods for more details). However, this is not an option on AKS clusters:

    • It is not possible to assign custom node taints such as node.cilium.io/agent-not-ready=true:NoExecute to system node pools, cf. Azure/AKS#2578: only CriticalAddonsOnly=true:NoSchedule is available for our use case. To make matters worse, it is not possible to assign taints to the initial node pool created for new AKS clusters, cf. Azure/AKS#1402.
    • Custom node taints on user node pools cannot be properly managed at will anymore, cf. Azure/AKS#2934.
    • These issues rule out our previously recommended approach of replacing the initial system node pool with one tainted with CriticalAddonsOnly=true:NoSchedule and adding user node pools tainted with node.cilium.io/agent-not-ready=true:NoExecute.

    We do not have a standard, foolproof alternative to recommend. The only solution is to craft a custom mechanism that works in your environment to handle this scenario when adding new nodes to AKS clusters.

Create a Service Principal:

In order to allow cilium-operator to interact with the Azure API, a Service Principal with Contributor privileges over the AKS cluster is required (see Azure IPAM required privileges for more details). It is recommended to create a dedicated Service Principal for each Cilium installation with minimal privileges over the AKS node resource group:

  AZURE_SUBSCRIPTION_ID=$(az account show --query "id" --output tsv)
  AZURE_NODE_RESOURCE_GROUP=$(az aks show --resource-group ${RESOURCE_GROUP} --name ${CLUSTER_NAME} --query "nodeResourceGroup" --output tsv)
  AZURE_SERVICE_PRINCIPAL=$(az ad sp create-for-rbac --scopes /subscriptions/${AZURE_SUBSCRIPTION_ID}/resourceGroups/${AZURE_NODE_RESOURCE_GROUP} --role Contributor --output json --only-show-errors)
  AZURE_TENANT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.tenant')
  AZURE_CLIENT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.appId')
  AZURE_CLIENT_SECRET=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.password')

Note

The AZURE_NODE_RESOURCE_GROUP node resource group is not the resource group of the AKS cluster. A single resource group may hold multiple AKS clusters, but each AKS cluster regroups all resources in an automatically managed secondary resource group. See Why are two resource groups created with AKS? for more details.

This ensures the Service Principal only has privileges over the AKS cluster itself and not any other resources within the resource group.
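Before handing these values to Helm, it can be worth confirming that none of the lookups came back empty. The sketch below is one possible check; `require_vars` is a hypothetical helper, and the comparison against the literal string "null" relies on `jq -r` printing `null` for a missing JSON field.

```shell
# Report every named variable that is unset, empty, or the literal "null",
# and fail if any were found. Names are assumed to be plain identifiers.
require_vars() {
  missing=""
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ] || [ "$value" = "null" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing or empty:$missing" >&2
    return 1
  fi
}

# Usage:
#   require_vars AZURE_SUBSCRIPTION_ID AZURE_NODE_RESOURCE_GROUP \
#     AZURE_TENANT_ID AZURE_CLIENT_ID AZURE_CLIENT_SECRET || exit 1
```

The helper only prints variable names, never values, so the client secret does not end up in terminal output or logs.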

Install Cilium:

Deploy Cilium release via Helm:

  helm install cilium cilium/cilium --version 1.11.7 \
    --namespace kube-system \
    --set azure.enabled=true \
    --set azure.resourceGroup=$AZURE_NODE_RESOURCE_GROUP \
    --set azure.subscriptionID=$AZURE_SUBSCRIPTION_ID \
    --set azure.tenantID=$AZURE_TENANT_ID \
    --set azure.clientID=$AZURE_CLIENT_ID \
    --set azure.clientSecret=$AZURE_CLIENT_SECRET \
    --set tunnel=disabled \
    --set ipam.mode=azure \
    --set enableIPv4Masquerade=false \
    --set nodeinit.enabled=true

To install Cilium on Amazon Elastic Kubernetes Service (EKS), perform the following steps:

Default Configuration:

Datapath                IPAM       Datastore
Direct Routing (ENI)    AWS ENI    Kubernetes CRD

For more information on AWS ENI mode, see AWS ENI.

Tip

If you want to chain Cilium on top of the AWS CNI, refer to the guide AWS VPC CNI plugin.

Requirements:

  • The EKS Managed Nodegroups must be properly tainted to ensure application pods are properly managed by Cilium:

    • managedNodeGroups should be tainted with node.cilium.io/agent-not-ready=true:NoExecute to ensure application pods will only be scheduled once Cilium is ready to manage them. However, there are other options. Please make sure to read and understand the documentation page on taint effects and unmanaged pods.

      Below is an example of how to use a ClusterConfig file to create the cluster:

      apiVersion: eksctl.io/v1alpha5
      kind: ClusterConfig
      ...
      managedNodeGroups:
      - name: ng-1
        ...
        # taint nodes so that application pods are
        # not scheduled/executed until Cilium is deployed.
        # Alternatively, see the note above regarding taint effects.
        taints:
         - key: "node.cilium.io/agent-not-ready"
           value: "true"
           effect: "NoExecute"
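Once the node group exists, it can be useful to confirm that every node actually carries the taint. The following is a sketch only: `nodes_missing_agent_taint` is a hypothetical helper that reads lines of the form "NAME TAINT-KEYS" (keys comma-separated, or `<none>`), and the `kubectl` column expression in the usage comment is an assumption to adapt to your cluster.

```shell
# Read "NAME TAINT-KEYS" lines on stdin and print the names of nodes whose
# taint keys do not include node.cilium.io/agent-not-ready.
nodes_missing_agent_taint() {
  awk -v key="node.cilium.io/agent-not-ready" '
    {
      n = split($2, keys, ",")       # taint keys arrive comma-separated
      found = 0
      for (i = 1; i <= n; i++) if (keys[i] == key) found = 1
      if (!found) print $1
    }
  '
}

# Usage (requires kubectl):
#   kubectl get nodes --no-headers \
#     -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key' \
#     | nodes_missing_agent_taint
```

An empty result means every listed node has the taint key. The check looks at the key only, not at the value or effect.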

Limitations:

  • The AWS ENI integration of Cilium is currently only enabled for IPv4. If you want to use IPv6, use a datapath/IPAM mode other than ENI.

Delete VPC CNI (aws-node DaemonSet)

Cilium will manage ENIs instead of VPC CNI, so the aws-node DaemonSet has to be deleted to prevent conflicting behavior.

  kubectl -n kube-system delete daemonset aws-node

Install Cilium:

Deploy Cilium release via Helm:

  helm install cilium cilium/cilium --version 1.11.7 \
    --namespace kube-system \
    --set eni.enabled=true \
    --set ipam.mode=eni \
    --set egressMasqueradeInterfaces=eth0 \
    --set tunnel=disabled

Note

This helm command sets eni.enabled=true and tunnel=disabled, meaning that Cilium will allocate a fully-routable AWS ENI IP address for each pod, similar to the behavior of the Amazon VPC CNI plugin.

This mode depends on a set of Required Privileges from the EC2 API.

Cilium can alternatively run in EKS using an overlay mode that gives pods non-VPC-routable IPs. This allows running more pods per Kubernetes worker node than the ENI limit, but means that pod connectivity to resources outside the cluster (e.g., VMs in the VPC or AWS managed services) is masqueraded (i.e., SNAT) by Cilium to use the VPC IP address of the Kubernetes worker node. To set up Cilium overlay mode, follow the steps below:

  1. Excluding the lines for eni.enabled=true, ipam.mode=eni and tunnel=disabled from the helm command will configure Cilium to use overlay routing mode (which is the helm default).

  2. Flush the iptables rules added by VPC CNI:

    iptables -t nat -F AWS-SNAT-CHAIN-0 \
      && iptables -t nat -F AWS-SNAT-CHAIN-1 \
      && iptables -t nat -F AWS-CONNMARK-CHAIN-0 \
      && iptables -t nat -F AWS-CONNMARK-CHAIN-1

Some Linux distributions use a different interface naming convention. If you use masquerading with the option egressMasqueradeInterfaces=eth0, remember to replace eth0 with the proper interface name.
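One way to find the interface name on a worker node is to look at the default route. This is a sketch, assuming a Linux node with iproute2 and a single default route; `default_route_interface` is a hypothetical helper, not a Cilium tool.

```shell
# Print the interface named after "dev" in a default-route line, e.g.
# "default via 10.0.0.1 dev ens5 proto dhcp src 10.0.0.5 metric 100".
default_route_interface() {
  awk '$1 == "default" {
    for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); exit }
  }'
}

# Usage on a worker node:
#   ip -4 route show default | default_route_interface
```

The printed name is what would replace eth0 in egressMasqueradeInterfaces. Nodes with several default routes need a closer look, since only the first match is printed.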

To install Cilium on OpenShift, perform the following steps:

Default Configuration:

Datapath         IPAM            Datastore
Encapsulation    Cluster Pool    Kubernetes CRD

Requirements:

  • OpenShift 4.x

Install Cilium:

Cilium is a Certified OpenShift CNI Plugin and is best installed when an OpenShift cluster is created using the OpenShift installer. Please refer to Installation on OpenShift OKD for more information.

To install Cilium on Rancher Kubernetes Engine (RKE), perform the following steps:

Note

If you are using RKE2, Cilium has been directly integrated. Please see Using Cilium in the RKE2 documentation. You can use either method.

Default Configuration:

Datapath         IPAM            Datastore
Encapsulation    Cluster Pool    Kubernetes CRD

Requirements:

  • Follow the RKE Installation Guide with the change below:

    From:

    network:
      options:
        flannel_backend_type: "vxlan"
      plugin: "canal"

    To:

    network:
      plugin: none

Install Cilium:

Install Cilium via helm install, with CILIUM_NAMESPACE set to the target namespace (the other sections of this guide use kube-system):

  helm install cilium cilium/cilium --version 1.11.7 \
    --namespace $CILIUM_NAMESPACE

To install Cilium on k3s, perform the following steps:

Default Configuration:

Datapath         IPAM            Datastore
Encapsulation    Cluster Pool    Kubernetes CRD

Requirements:

  • Install your k3s cluster as you normally would, but make sure to disable support for the default CNI plugin and the built-in network policy enforcer so you can install Cilium on top:

    curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC='--flannel-backend=none --disable-network-policy' sh -

  • For the Cilium CLI to access the cluster in successive steps, you will need to use the kubeconfig file stored at /etc/rancher/k3s/k3s.yaml by setting the KUBECONFIG environment variable:

    export KUBECONFIG=/etc/rancher/k3s/k3s.yaml

Install Cilium, with CILIUM_NAMESPACE set to the target namespace (the other sections of this guide use kube-system):

  helm install cilium cilium/cilium --version 1.11.7 \
    --namespace $CILIUM_NAMESPACE

Restart unmanaged Pods

If you did not create the cluster with nodes tainted with node.cilium.io/agent-not-ready, unmanaged pods need to be restarted manually. Restart all running pods that are not in host-networking mode so that Cilium starts managing them. This ensures that all pods that were running before Cilium was deployed have network connectivity provided by Cilium and that NetworkPolicy applies to them:

  $ kubectl get pods --all-namespaces -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork --no-headers=true | grep '<none>' | awk '{print "-n "$1" "$2}' | xargs -L 1 -r kubectl delete pod
  pod "event-exporter-v0.2.3-f9c896d75-cbvcz" deleted
  pod "fluentd-gcp-scaler-69d79984cb-nfwwk" deleted
  pod "heapster-v1.6.0-beta.1-56d5d5d87f-qw8pv" deleted
  pod "kube-dns-5f8689dbc9-2nzft" deleted
  pod "kube-dns-5f8689dbc9-j7x5f" deleted
  pod "kube-dns-autoscaler-76fcd5f658-22r72" deleted
  pod "kube-state-metrics-7d9774bbd5-n6m5k" deleted
  pod "l7-default-backend-6f8697844f-d2rq2" deleted
  pod "metrics-server-v0.3.1-54699c9cc8-7l5w2" deleted

Note

This may error out on macOS because xargs there does not support -r. In that case you can safely run the command without -r; the only side effect is that it hangs if there are no pods to restart, and you can stop it with Ctrl-C.
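Because the restart command deletes pods, it can be reassuring to preview the selection first. The sketch below reproduces the filtering step as a standalone function; `unmanaged_pod_args` is a hypothetical name, and it expects the same three columns (NAMESPACE, NAME, HOSTNETWORK) that the custom-columns expression in the restart command produces.

```shell
# Read "NAMESPACE NAME HOSTNETWORK" lines on stdin and print "-n NAMESPACE NAME"
# for every pod that is not in host-networking mode (HOSTNETWORK is "<none>").
unmanaged_pod_args() {
  awk '$3 == "<none>" { print "-n " $1 " " $2 }'
}

# Usage (requires kubectl); prints the pods that would be deleted:
#   kubectl get pods --all-namespaces --no-headers=true \
#     -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork \
#     | unmanaged_pod_args
```

Unlike the grep in the original pipeline, this matches `<none>` in the third column only, so a pod is not selected just because the string appears elsewhere on the line.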

Validate the Installation

Cilium CLI

Manually

Install the latest version of the Cilium CLI. The Cilium CLI can be used to install Cilium, inspect the state of a Cilium installation, and enable/disable various features (e.g. clustermesh, Hubble).

Linux

  curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}
  sha256sum --check cilium-linux-amd64.tar.gz.sha256sum
  sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
  rm cilium-linux-amd64.tar.gz{,.sha256sum}

macOS

  curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-darwin-amd64.tar.gz{,.sha256sum}
  shasum -a 256 -c cilium-darwin-amd64.tar.gz.sha256sum
  sudo tar xzvfC cilium-darwin-amd64.tar.gz /usr/local/bin
  rm cilium-darwin-amd64.tar.gz{,.sha256sum}

Other

See the full page of releases.
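The download commands in this guide hard-code the amd64 build. On other machines, a small helper can derive the tarball name from `uname -m`. This is a sketch under assumptions: the helper names are hypothetical, and the existence of an arm64 asset that follows the same naming pattern should be confirmed on the releases page before relying on it.

```shell
# Map `uname -m` output to the architecture suffix used in release file names.
cli_arch() {
  case $1 in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the tarball name for an OS ("linux" or "darwin") and a machine type.
cli_tarball() {
  arch=$(cli_arch "$2") || return 1
  echo "cilium-$1-$arch.tar.gz"
}

# Usage:
#   cli_tarball linux "$(uname -m)"
```

The resulting name can be substituted for the file name in the curl, checksum, tar, and rm commands.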

To validate that Cilium has been properly installed, you can run:

  $ cilium status --wait
     /¯¯\
  /¯¯\__/¯¯\    Cilium:         OK
  \__/¯¯\__/    Operator:       OK
  /¯¯\__/¯¯\    Hubble:         disabled
  \__/¯¯\__/    ClusterMesh:    disabled
     \__/
  DaemonSet         cilium             Desired: 2, Ready: 2/2, Available: 2/2
  Deployment        cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
  Containers:       cilium-operator    Running: 2
                    cilium             Running: 2
  Image versions    cilium             quay.io/cilium/cilium:v1.9.5: 2
                    cilium-operator    quay.io/cilium/operator-generic:v1.9.5: 2

Run the following command to validate that your cluster has proper network connectivity:

  $ cilium connectivity test
  ℹ️ Monitor aggregation detected, will skip some flow validation steps
  [k8s-cluster] Creating namespace for connectivity check...
  (...)
  ---------------------------------------------------------------------------------------------------------------------
  📋 Test Report
  ---------------------------------------------------------------------------------------------------------------------
  69/69 tests successful (0 warnings)

Congratulations! You have a fully functional Kubernetes cluster with Cilium. 🎉

To validate the installation manually, you can watch the pods as Cilium and all required components are being installed:

  $ kubectl -n kube-system get pods --watch
  NAME                              READY   STATUS              RESTARTS   AGE
  cilium-operator-cb4578bc5-q52qk   0/1     Pending             0          8s
  cilium-s8w5m                      0/1     PodInitializing     0          7s
  coredns-86c58d9df4-4g7dd          0/1     ContainerCreating   0          8m57s
  coredns-86c58d9df4-4l6b2          0/1     ContainerCreating   0          8m57s

It may take a couple of minutes for all components to come up:

  cilium-operator-cb4578bc5-q52qk   1/1     Running   0          4m13s
  cilium-s8w5m                      1/1     Running   0          4m12s
  coredns-86c58d9df4-4g7dd          1/1     Running   0          13m
  coredns-86c58d9df4-4l6b2          1/1     Running   0          13m
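Instead of reading the pod list by eye, the rows can be filtered for anything that is not fully up yet. The helper below is a hypothetical sketch that works on the default `kubectl get pods` columns (NAME, READY, STATUS, ...) and runs on a one-shot listing rather than on `--watch` output.

```shell
# Read `kubectl get pods` rows on stdin and print the names of pods that are
# not Running or whose READY count (for example "0/1") is incomplete.
pods_not_ready() {
  awk '
    $1 == "NAME" { next }                 # skip the header row
    {
      split($2, ready, "/")               # READY looks like "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) print $1
    }
  '
}

# Usage (requires kubectl):
#   kubectl -n kube-system get pods | pods_not_ready
```

An empty result suggests that every listed pod is running with all containers ready. Pods that finished normally (status Completed) would also be listed, so treat the output as a prompt to look closer rather than as a failure signal.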

You can deploy the “connectivity-check” to test connectivity between pods. It is recommended to create a separate namespace for this.

  kubectl create ns cilium-test

Deploy the check with:

  kubectl apply -n cilium-test -f https://raw.githubusercontent.com/cilium/cilium/v1.11/examples/kubernetes/connectivity-check/connectivity-check.yaml

This creates a series of deployments that use various connectivity paths to connect to each other. The paths include connections with and without service load-balancing and various network policy combinations. The pod name indicates the connectivity variant, and the readiness and liveness gates indicate success or failure of the test:

  $ kubectl get pods -n cilium-test
  NAME                                                     READY   STATUS    RESTARTS   AGE
  echo-a-76c5d9bd76-q8d99                                  1/1     Running   0          66s
  echo-b-795c4b4f76-9wrrx                                  1/1     Running   0          66s
  echo-b-host-6b7fc94b7c-xtsff                             1/1     Running   0          66s
  host-to-b-multi-node-clusterip-85476cd779-bpg4b          1/1     Running   0          66s
  host-to-b-multi-node-headless-dc6c44cb5-8jdz8            1/1     Running   0          65s
  pod-to-a-79546bc469-rl2qq                                1/1     Running   0          66s
  pod-to-a-allowed-cnp-58b7f7fb8f-lkq7p                    1/1     Running   0          66s
  pod-to-a-denied-cnp-6967cb6f7f-7h9fn                     1/1     Running   0          66s
  pod-to-b-intra-node-nodeport-9b487cf89-6ptrt             1/1     Running   0          65s
  pod-to-b-multi-node-clusterip-7db5dfdcf7-jkjpw           1/1     Running   0          66s
  pod-to-b-multi-node-headless-7d44b85d69-mtscc            1/1     Running   0          66s
  pod-to-b-multi-node-nodeport-7ffc76db7c-rrw82            1/1     Running   0          65s
  pod-to-external-1111-d56f47579-d79dz                     1/1     Running   0          66s
  pod-to-external-fqdn-allow-google-cnp-78986f4bcf-btjn7   1/1     Running   0          66s

Note

If you deploy the connectivity check to a single node cluster, pods that check multi-node functionalities will remain in the Pending state. This is expected since these pods need at least 2 nodes to be scheduled successfully.

Once done with the test, remove the cilium-test namespace:

  kubectl delete ns cilium-test

Next Steps