Rolling Updates

Upgrading and modifying a Kubernetes cluster usually requires the replacement of cloud instances. To avoid loss of service and other disruption, kOps replaces cloud instances incrementally with a rolling update.

Rolling updates are performed using the kops rolling-update cluster command.

Instance selection

Cloud instances are chosen to be updated (replaced) if at least one of the following is true:

  • The instance was created with a specification that is older than that generated by the last kops update cluster.
  • The instance was detached for surging by a previous (failed or interrupted) rolling update.
  • The node has a kops.k8s.io/needs-update annotation.
  • The --force flag was given to the kops rolling-update cluster command.
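The last two triggers can be driven manually. As a sketch (the node name node-1 is hypothetical, and the annotation's value is assumed not to be significant, only its presence):

  # Mark a single node for replacement on the next rolling update
  kubectl annotate node node-1 kops.k8s.io/needs-update=needs-update

  # Or force replacement of all instances, regardless of their specification
  kops rolling-update cluster --force --yes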

Order of instance groups

A rolling update will update instances from one instance group at a time. First, it will update bastion instance groups. Next, it will update master instance groups, then apiserver instance groups. Finally, it will update node instance groups. Within an instance group role it will update instance groups in alphabetical order.

A rolling update may be restricted to instance groups of particular roles (“Bastion”, “Master”, “APIServer”, and/or “Node”) with the --instance-group-roles flag. A rolling update may be restricted to particular instance groups with the --instance-group flag.
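For example, a rolling update could be restricted like this (the instance group name nodes-us-east-1a is hypothetical):

  # Update only the worker-node instance groups
  kops rolling-update cluster --instance-group-roles=Node --yes

  # Update a single, named instance group
  kops rolling-update cluster --instance-group=nodes-us-east-1a --yes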

Updating an instance group

The first thing rolling update will do when updating an instance group is validate the cluster, as for the kops validate cluster command. If the cluster fails validation at this time then the entire rolling update will stop with an error.

Next, rolling update will apply a PreferNoSchedule (soft) taint to the instance group’s nodes that have been chosen to be updated. This will prevent new pods, including replacements for evicted pods, from being scheduled on the old nodes unless there is no other place to schedule them.

This validation and tainting will not be performed if either of the following is true:

  • The instance group is of role “Bastion”.
  • The --cloudonly flag was given to the kops rolling-update cluster command.

Finally, rolling update will replace the instance group’s chosen nodes, respecting the limits configured in that group’s rolling update strategy.

Updating an instance

When being updated, a node is first cordoned to prevent any new pods from being scheduled on it. The cordoning also causes some cloud provider load balancers to remove the node from the set of available destinations. Next, the node is drained, voluntarily evicting all pods not managed by a DaemonSet. This eviction respects any pod disruption budgets.

After all such pods have been evicted, rolling update will wait 5 seconds to allow TCP connections to those pods to close. The amount of time to wait may be changed with the --post-drain-delay flag.

Instances will not be cordoned or drained if at least one of the following is true:

  • They are bastions.
  • They were not registered as nodes.
  • The --cloudonly flag was given to the kops rolling-update cluster command.

Rolling update will then terminate the instance. Unless the instance had been detached for surging, this will cause the cloud provider to create a new instance with the current specification.

Rolling update then waits for 15 seconds to allow the Kubernetes API server to notice the termination. The amount of time to wait may be changed with the --bastion-interval, --master-interval, and/or --node-interval flags.

Unless the --cloudonly flag was given, rolling update then waits until the cluster validates successfully. This is done in order to ensure the replacement instance is working before rolling update proceeds to update another instance.
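Putting the timing flags together, a slower and more cautious update might look like this sketch (the interval values are illustrative, not recommendations):

  # Wait 30 seconds after draining each node, and replace the default
  # 15-second post-termination wait with 3 minutes.
  kops rolling-update cluster \
    --post-drain-delay=30s \
    --node-interval=3m \
    --master-interval=3m \
    --yes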

Configurable rolling update strategies

The behavior of rolling update within an instance group may be configured through the rollingUpdate field of the group’s InstanceGroupSpec.

Cluster-wide defaults may be configured through the rollingUpdate field of the ClusterSpec.

maxUnavailable

The maxUnavailable field specifies the maximum number of nodes that can be unavailable during the rolling update. Increasing this setting allows more instances to be updated in parallel.

The value can be an absolute number (for example 5) or a percentage of the nodes in the group (for example “10%”). The absolute number is calculated from a percentage by rounding down.
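As a sketch of how a percentage resolves to an absolute count, assuming a hypothetical 25-node instance group:

```shell
# maxUnavailable: "10%" of a 25-node group is 2.5 nodes; the absolute
# number is calculated by rounding down, so 2 nodes may be unavailable.
nodes=25
percent=10
max_unavailable=$(( nodes * percent / 100 ))  # integer division rounds down
echo "${max_unavailable}"  # prints 2
```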

For example, to permit two instances to be updated in parallel:

  spec:
    rollingUpdate:
      maxUnavailable: 2

This field defaults to 1 if the maxSurge field is 0, otherwise it defaults to 0.

If there are no instances that have been created with the current specification, then a rolling update will start with updating a single instance. It does this to limit the damage in case the new specification results in non-working nodes.

maxSurge

Surging is temporarily increasing the number of instances in an instance group during a rolling update. Instead of first draining and terminating an instance and then creating a new one, it effectively first creates a new instance and then drains and terminates the old one.

Surging is implemented by “detaching” instances, making them not count toward the desired number of instances in the instance group. This causes the cloud provider to create new instances in order to satisfy the group’s desired number. The detached instances are drained and terminated last; when they are terminated the cloud provider does not replace them.

The maxSurge is the maximum number of extra instances that can be created during the update. Increasing this setting allows more instances to be updated in parallel. Rolling update will not create more new instances than the number of instances selected for update.

The value can be an absolute number (for example 5) or a percentage of the nodes in the group (for example “10%”). The absolute number is calculated from a percentage by rounding up.
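Note that the rounding direction differs from maxUnavailable. A sketch for the same hypothetical 25-node group:

```shell
# maxSurge: "10%" of a 25-node group is 2.5 nodes; the absolute number
# is calculated by rounding up, so up to 3 extra nodes may be created.
nodes=25
percent=10
max_surge=$(( (nodes * percent + 99) / 100 ))  # ceiling division rounds up
echo "${max_surge}"  # prints 3
```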

Masters are unable to surge. Any cluster-wide default setting will be ignored for instance groups of role “Master”. Setting this value on the InstanceGroupSpec for an instance group of role “Master” will result in an API validation error.

For example, to add a maximum of two additional instances to the group during a rolling update, allowing two to be updated in parallel:

  spec:
    rollingUpdate:
      maxSurge: 2

If there are no instances that have been created with the current specification, then rolling update will start with creating a single new instance. It does this to limit the damage in case the new specification results in non-working nodes. Once the new instance validates successfully, it then creates any remaining surge instances.

Disabling rolling updates

Rolling updates may be partially disabled for an instance group by setting the drainAndTerminate field to false.

  spec:
    rollingUpdate:
      drainAndTerminate: false

Nodes needing update will still be tainted. If maxSurge is nonzero, up to that many extra nodes will still be created.