Understanding OKD update duration

OKD update duration varies based on the deployment topology. This page helps you understand the factors that affect update duration and estimate how long the cluster update takes in your environment.

Factors affecting update duration

The following factors can affect your cluster update duration:

  • The reboot of compute nodes to the new machine configuration by Machine Config Operator (MCO)

    • The value of MaxUnavailable in the machine config pool

    • The minimum number or percentages of replicas set in pod disruption budget (PDB)

  • The number of nodes in the cluster

  • The health of the cluster nodes

Cluster update phases

In OKD, the cluster update happens in two phases:

  • Cluster Version Operator (CVO) target update payload deployment

  • Machine Config Operator (MCO) node updates

Cluster Version Operator target update payload deployment

The Cluster Version Operator (CVO) retrieves the target update release image and applies to the cluster. All components which run as pods are updated during this phase, whereas the host components are updated by the Machine Config Operator (MCO). This process might take 60 to 120 minutes.

The CVO phase of the update does not restart the nodes.

Machine Config Operator node updates

The Machine Config Operator (MCO) applies a new machine configuration to each control plane and compute node. During this process, the MCO performs the following sequential actions on each node of the cluster:

  1. Cordon and drain all the nodes

  2. Update the operating system (OS)

  3. Reboot the nodes

  4. Uncordon all nodes and schedule workloads on the node

When a node is cordoned, workloads cannot be scheduled to it.

The time to complete this process depends on several factors including the node and infrastructure configuration. This process might take 5 or more minutes to complete per node.

In addition to MCO, you should consider the impact of the following parameters:

  • The control plane node update duration is predictable and oftentimes shorter than compute nodes, because the control plane workloads are tuned for graceful updates and quick drains.

  • You can update the compute nodes in parallel by setting the maxUnavailable field to greater than 1 in the Machine Config Pool (MCP). The MCO cordons the number of nodes specified in maxUnavailable and marks them unavailable for update.

  • When you increase maxUnavailable on the MCP, it can help the pool to update more quickly. However, if maxUnavailable is set too high, and several nodes are cordoned simultaneously, the pod disruption budget (PDB) guarded workloads could fail to drain because a schedulable node cannot be found to run the replicas. If you increase maxUnavailable for the MCP, ensure that you still have sufficient schedulable nodes to allow PDB guarded workloads to drain.

  • Before you begin the update, you must ensure that all the nodes are available. Any unavailable nodes can significantly impact the update duration because the node unavailability affects the maxUnavailable and pod disruption budgets.

    To check the status of nodes from the terminal, run the following command:

    1. $ oc get node

    Example Output

    1. NAME STATUS ROLES AGE VERSION
    2. ip-10-0-137-31.us-east-2.compute.internal Ready,SchedulingDisabled worker 12d v1.23.5+3afdacb
    3. ip-10-0-151-208.us-east-2.compute.internal Ready master 12d v1.23.5+3afdacb
    4. ip-10-0-176-138.us-east-2.compute.internal Ready master 12d v1.23.5+3afdacb
    5. ip-10-0-183-194.us-east-2.compute.internal Ready worker 12d v1.23.5+3afdacb
    6. ip-10-0-204-102.us-east-2.compute.internal Ready master 12d v1.23.5+3afdacb
    7. ip-10-0-207-224.us-east-2.compute.internal Ready worker 12d v1.23.5+3afdacb

    If the status of the node is NotReady or SchedulingDisabled, then the node is not available and this impacts the update duration.

    You can check the status of nodes from the Administrator perspective in the web console by expanding ComputeNode.

Additional resources

Example update duration of cluster Operators

A diagram displaying the periods during which cluster Operators update themselves during an OCP update

The previous diagram shows an example of the time that cluster Operators might take to update to their new versions. The example is based on a three-node AWS OVN cluster, which has a healthy compute MachineConfigPool and no workloads that take long to drain, updating from 4.13 to 4.14.

  • The specific update duration of a cluster and its Operators can vary based on several cluster characteristics, such as the target version, the amount of nodes, and the types of workloads scheduled to the nodes.

  • Some Operators, such as the Cluster Version Operator, update themselves in a short amount of time. These Operators have either been omitted from the diagram or are included in the broader group of Operators labeled “Other Operators in parallel”.

Each cluster Operator has characteristics that affect the time it takes to update itself. For instance, the Kube API Server Operator in this example took more than eleven minutes to update because kube-apiserver provides graceful termination support, meaning that existing, in-flight requests are allowed to complete gracefully. This might result in a longer shutdown of the kube-apiserver. In the case of this Operator, update speed is sacrificed to help prevent and limit disruptions to cluster functionality during an update.

Another characteristic that affects the update duration of an Operator is whether the Operator utilizes DaemonSets. The Network and DNS Operators utilize full-cluster DaemonSets, which can take time to roll out their version changes, and this is one of several reasons why these Operators might take longer to update themselves.

The update duration for some Operators is heavily dependent on characteristics of the cluster itself. For instance, the Machine Config Operator update applies machine configuration changes to each node in the cluster. A cluster with many nodes has a longer update duration for the Machine Config Operator compared to a cluster with fewer nodes.

Each cluster Operator is assigned a stage during which it can be updated. Operators within the same stage can update simultaneously, and Operators in a given stage cannot begin updating until all previous stages have been completed. For more information, see “Understanding how manifests are applied during an update” in the “Additional resources” section.

Additional resources

Estimating cluster update time

Historical update duration of similar clusters provides you the best estimate for the future cluster updates. However, if the historical data is not available, you can use the following convention to estimate your cluster update time:

  1. Cluster update time = CVO target update payload deployment time + (# node update iterations x MCO node update time)

A node update iteration consists of one or more nodes updated in parallel. The control plane nodes are always updated in parallel with the compute nodes. In addition, one or more compute nodes can be updated in parallel based on the maxUnavailable value.

For example, to estimate the update time, consider an OKD cluster with three control plane nodes and six compute nodes and each host takes about 5 minutes to reboot.

The time it takes to reboot a particular node varies significantly. In cloud instances, the reboot might take about 1 to 2 minutes, whereas in physical bare metal hosts the reboot might take more than 15 minutes.

Scenario-1

When you set maxUnavailable to 1 for both the control plane and compute nodes Machine Config Pool (MCP), then all the six compute nodes will update one after another in each iteration:

  1. Cluster update time = 60 + (6 x 5) = 90 minutes

Scenario-2

When you set maxUnavailable to 2 for the compute node MCP, then two compute nodes will update in parallel in each iteration. Therefore it takes total three iterations to update all the nodes.

  1. Cluster update time = 60 + (3 x 5) = 75 minutes

The default setting for maxUnavailable is 1 for all the MCPs in OKD. It is recommended that you do not change the maxUnavailable in the control plane MCP.

Additional resources