Release notes

The following table shows component versioning for Calico v3.25.

To select a different version, click Releases in the top navigation bar.

v3.25.0

Release archive with Kubernetes manifests, Docker images and binaries.

09 Jan 2023

eBPF Dataplane Stability: Connect Time Load Balancing (CTLB)

In certain scenarios, Calico would not update rapidly changing pods and IPs properly. We have added some large changes to the eBPF dataplane in order to ensure that connect time load balancing works in larger, rapidly changing environments.

Pull Requests:

  • ebpf: ipv4 and ipv6 code separated to different object files so the v6 code gets never loaded outside tests. calico #7093 (@tomastigera)
  • ebpf: CTLB resolves service when ipv4 is masked as ipv6. Commonly happens with grpc. calico #7087 (@tomastigera)
  • ebpf: we can apply the CTLB-turned-off workaround just to UDP calico #6783 (@tomastigera)
  • ebpf: host can accesses services without CTLB - gated feature calico #6527 (@tomastigera)

Improvements for clusters running at high scale

This release includes several enhancements to Typha, Calico’s caching API server proxy, targetting high-scale clusters. Typha acts as a proxy for the components within calico-node (such as Felix) when they watch resources in the Kubernetes API server. This both reduces load on the API server and (by filtering out updates that Calico doesn’t care about) it reduces load in calico-node.

  • Typha now supports graceful shutdown. Rather than disconnecting all clients and shutting down immediately, it will observe the terminationGracePeriodSeconds (which can now be set via the operator’s Installation resource). Typha will disconnect clients gradually over the graceful shutdown window. This reduces disruption by avoiding restarting many calico-node components at once.

    Since this is the first release with this feature, the benefit will only be seen when doing an upgrade from this release.

  • Typha now supports compression on its protocol; this gives a 5:1 reduction in bandwidth use and snapshot data size. Compression is automatically enabled when a supporting client (i.e. clients from this release onward) connects.

  • Typha now shares computed (and compressed) snapshots between clients that connect at approximately the same time. This significantly reduces CPU usage and the time to service all clients when many clients connect at once. In a cluster with 150k Pods, generating a snapshot can take 2-3s of CPU time so, if 100 clients connect at once, there can be a 200-300 second saving in CPU used, and a corresponding increase in throughput. Typha has prometheus metrics to monitor the size of snapshots (typha_snapshot_raw_bytes / typha_snapshot_compressed_bytes) and the number of snapshots that are reused for more than one client (typha_snapshots_reused).

  • Typha now exports a Prometheus metric (typha_cache_size) for the size of its internal cache.

  • Typha’s Prometheus metrics have been improved and split by client type. Previously the metrics would mix “high traffic” clients (such as Felix), with “low traffic” clients, making the metrics much less useful.

Bug fixes

General

  • Fix incorrect cleanup in the service policy index after having both ingress and egress rules that reference the same service, resulting in missed IP set updates after one rule was deactivated. calico #7148 (@fasaxc)
  • Fix panic in calico-node when invalid spoofed IP range provided on a pod. calico #7076 (@caseydavenport)
  • fixed felix docs for bpf config options calico #7065 (@tomastigera)
  • Fix missing nsswitch files in Typha causing localhost lookup fails calico #6971 (@wdoekes)
  • Fix that Calico would try to use the IPV6 VXLAN or Wireguard tunnel devices for its BGP connections. calico #6929 (@coutinhop)
  • Fix that Calico would try to use the VXLAN tunnel device for its BGP connections. calico #6902 (@caseydavenport)
  • Add missing Auto option for IptablesBackend FelixConfiguration field calico #6871 (@huiyizzz)
  • Fix an issue that caused annotations and labels to be overwritten during a calicoctl patch command calico #6791 (@mgleung)
  • Fixed SyncLabels validation for Kubernetes datastore. calico #6786 (@huiyizzz)
  • Fix issues with OCP installs using the wrong operator manifest. calico #6724 (@mgleung)
  • Fix bug in IPv6 router ID calculation on IPv6 single-stack clusters that resulted in invalid router IDs being calculated. Note that this change will result in new router IDs being used for some IPv6 single-stack nodes. calico #6674 (@ramanujadasu)
  • Fix that calicoctl ipam release could only release IPAM handles when running in etcd mode. calico #6650 (@fasaxc)
  • Fix issue in L3RouteResolver CIDRTrie which could result in crashes when the IPv6 trie had a node with a /63 prefix. calico #6532 (@coutinhop)
  • Fix nil error logged from kube-controllers health reporter calico #6513 (@caseydavenport)
  • Fix that kube-controllers health checks didn’t include a timeout on HTTP calls calico #6513 (@caseydavenport)
  • Set IPIPMode and VXLANMode to the default “Never” if they are empty strings in IPPools. calico #6498 (@coutinhop)
  • Fix that single-IP entries on BGPConfiguration LoadBalancerIPs were not advertised according to external traffic policy. calico #6282 (@mtryfoss)
  • fix: ErrorActionPreference must continue for kubectl commands Issue #6127 calico #6257 (@chrisjohnson00)

eBPF

  • ebpf: fix error setting accept_local - device may get stuck dirty calico #7071 (@tomastigera)
  • ebpf: no src fixup on host iface for traffic returning from pod to the nodeport tunnel calico #7039 (@tomastigera)
  • ebpf: XDP (notrack) policy debug output is removed/cleaned up when XDP program is removed (fix) calico #6994 (@tomastigera)
  • ebpf: fixes ifstate leak when devices go down calico #6946 (@tomastigera)

Windows

  • Fixed issue when Calico Windows hostprocess installation would fail to clean up a previous manual install of Calico Windows. calico #6952 (@coutinhop)
  • Fix issues with the windows node names in GCE calico #6470 (@lmm)

Wireguard

  • Limit rate of logging ‘Wireguard is not supported’ to fix log spam issues. calico #6534 (@coutinhop)

Other changes

General

  • Felix now supports overriding the timeouts of its internal readiness/liveness watchdog. This is useful for dealing with issues “in prod” without needing a new release. The timeouts have also been tuned to reduce false positives. calico #7061 (@fasaxc)
  • Typha now shares snapshots between clients that connect at roughly the same time. This dramatically reduces load when many clients connect at once. calico #7047 (@fasaxc)
  • By default, skip bridge interface created by docker network create command in IP auto-detection calico #7045 (@masap)
  • The Typha protocol now supports compression. This is enabled automatically if client and server both support it. calico #7043 (@fasaxc)
  • Add ignorable interfaces via the BGPConfiguration API calico #7006 (@huiyizzz)
  • Typha now supports graceful shut down, disconnecting calico-node pods at a configured rate instead of all at once. calico #6973 (@fasaxc)
  • Update installation documentation for AWS to include information regarding and links for CSI driver installation calico #6967 (@Josh-Tigera)
  • Update golang from 1.18.7 to 1.18.8 to avoid CVEs. calico #6961 (@Behnam-Shobiri)
  • By default, skip ‘podman’ interface in IP auto-detection calico #6950 (@OrvilleQ)
  • By default, skip ‘nodelocaldns’ interface in IP auto-detection calico #6942 (@cyclinder)
  • ebpf: faster program loading for workload endpoint - unused programs not loaded. calico #6933 (@tomastigera)
  • Remove problematic terminology from the codebase. calico #6912 (@fasaxc)
  • Update Istio support to include Istio v1.15.2 calico #6890 (@frozenprocess)
  • Add generalized TTL security mechanism (GTSM) via BGPPeer API calico #6862 (@Josh-Tigera)
  • Retain OpenSSL FIPS dependent files in calico-node image. calico #6852 (@hjiawei)
  • Disable VXLAN checksum offload by default for all kernels. If this was fixed, it has since been regressed. calico #6842 (@fasaxc)
  • Improve formatting of logged-out health reports from components such as Felix. calico #6833 (@fasaxc)
  • Update golang to 1.18.7 to avoid new CVEs. calico #6824 (@Behnam-Shobiri)
  • Updated documentation list of images to pull for deploying from private registry (now includes node-driver-registrar) calico #6812 (@Josh-Tigera)
  • Match full interface names in IP auto-detection default exclude list. calico #6760 (@neoaggelos)
  • Update multiple golang dependencies. calico #6719 (@Behnam-Shobiri)
  • Update the go version used to build the binaries from 1.18.5 to 1.18.6 calico #6717 (@Behnam-Shobiri)
  • Calico now uses a faster JSON parsing library; this reduces CPU load and improves start-up latency. calico #6705 (@fasaxc)
  • Reduce parsing overhead when parsing key/value pairs from Typha. calico #6703 (@fasaxc)
  • Many of Typha’s Prometheus metrics are now split by syncer (client) type, represented by a label “syncer” on the metrics. This prevents cross-talk where the syncers would all share the same metrics and the last writer to the metric would “win”. calico #6675 (@fasaxc)
  • The vxlanEnabled attribute from FelixConfiguration is now ignored for IPv6 VXLAN pools, allowing VXLAN to have IPv4 enabled independently from IPv6. calico #6671 (@muff1nman)
  • Typha now uses a B-tree for its internal cache, which allows it to export a Prometheus metric, typha_snapshot_size, that gives the total size of its current snapshot of the Calico datastore. calico #6666 (@fasaxc)
  • Use exponential backoff for kube-controllers health check timeout, retry sooner if failed. calico #6610 (@caseydavenport)
  • Bump K8S_VERSION and KUBECTL_VERSION to v1.24.3 in metadata.mk calico #6606 (@coutinhop)
  • Update Installation CRD to include new CSI changes introduced by recent operator API changes. calico #6596 (@Josh-Tigera)
  • Helm: imagePullSecrets now also applied to tigera-operator serviceaccount calico #6591 (@tamcore)
  • Retry kube-controllers initialization on failure calico #6566 (@tmjd)
  • Update the base images to alpine 3.16 for the flexvolume and CSI driver calico #6559 (@mgleung)
  • Windows quickstart install script creates calico service account token secret if missing calico #6464 (@lmm)
  • Updating the dependencies - to avoid indirect vulnerabilities (CVE) detection from scanners. calico #6452 (@Behnam-Shobiri)
  • added FeatureGates to Felix calico #6381 (@tomastigera)
  • eBPF: Add BPF counters to XDP programs, and also load XDP programs using Libbpf instead of iproute2. calico #6371 (@mazdakn)
  • The arm64 image of calico-kube-controllers now runs as non-root by default (similar to the amd64 image). calico #6346 (@ialidzhikov)

eBPF

  • ebpf: Include enPxxxxxx in the default BPFDataIfacePattern calico #7077 (@TrevorTaoARM)
  • ebpf: cleanup previously attached programs when BPFDataIfacePattern changes. calico #7008 (@tomastigera)
  • ebpf : BPFDisableLinuxConntrack added to FelixConfiguration resource. calico #6641 (@mazdakn)
  • ebpf: New felix config bpfL3IfacePattern allows to specify non calico L3 devices such as wireguard, vxlan. calico #6612 (@sridhartigera)

Windows

  • Update Windows NSSM version calico #6861 (@song-jiang)
  • windows: ensure calico-managed kubelet starts after the calico network has been initialized calico #6656 (@vitaliy-leschenko)

OpenStack

  • Calico for OpenStack: remove iptables programming by the DHCP agent that is no longer needed, and that was increasing the need for Felix to resync Calico’s iptables programming. Existing users will see issues - i.e. a VM failing to learn its IP address at boot time - if their VM OS is old enough to have unfixed DHCP client software. In that case the remedy is to update the VM OS. For example, in Tigera’s own testing, we updated from CirrOS 0.3.4 to CirrOS 0.6.0. calico #6857 (@tj90241)
  • Calico for OpenStack: prime the project (aka tenant) data cache on Neutron server startup calico #6839 (@tj90241)
  • Allow Calico to set MTU in OpenStack calico #6725 (@nelljerram)
ComponentVersion
calico/typhav3.25.0
calico/ctlv3.25.0
calico/nodev3.25.0
calico/cniv3.25.0
calico/apiserverv3.25.0
calico/kube-controllersv3.25.0
calico/flannel-migration-controllerv3.25.0
calico/windowsv3.25.0
networking-calicov3.25.0
docker.io/flannelcni/flannelv0.16.3
calico/dikastesv3.25.0
calico/pod2daemon-flexvolv3.25.0
calico/csiv3.25.0
calico/node-driver-registrarv3.25.0