Updating a cluster using the CLI

You can update, or upgrade, an OKD cluster within a minor version by using the OpenShift CLI (oc). You can also update a cluster between minor versions by following the same instructions.

Prerequisites

  • Have access to the cluster as a user with admin privileges. See Using RBAC to define and apply permissions.

  • Have a recent etcd backup in case your update fails and you must restore your cluster to a previous state.

  • Support for Fedora 7 workers is removed in OKD 4.12. You must replace Fedora 7 workers with Fedora 8 or FCOS workers before upgrading to OKD 4.12. Red Hat does not support in-place Fedora 7 to Fedora 8 updates for Fedora workers; those hosts must be replaced with a clean operating system install.

  • Ensure all Operators previously installed through Operator Lifecycle Manager (OLM) are updated to their latest version in their latest channel. Updating the Operators ensures they have a valid update path when the default OperatorHub catalogs switch from the current minor version to the next during a cluster update. See Updating installed Operators for more information.

  • Ensure that all machine config pools (MCPs) are running and not paused. Nodes associated with a paused MCP are skipped during the update process. You can pause the MCPs if you are performing a canary rollout update strategy.

  • If your cluster uses manually maintained credentials, ensure that the Cloud Credential Operator (CCO) is in an upgradeable state. For more information, see Upgrading clusters with manually maintained credentials.

  • If your cluster uses manually maintained credentials with the AWS Secure Token Service (STS), obtain a copy of the ccoctl utility from the release image being updated to and use it to process any updated credentials. For more information, see Upgrading an OpenShift Container Platform cluster configured for manual mode with STS.

  • Ensure that you address all Upgradeable=False conditions so the cluster allows an update to the next minor version. An alert displays at the top of the Cluster Settings page when you have one or more cluster Operators that cannot be upgraded. You can still update to the next available patch update for the minor release you are currently on.

  • Review the list of APIs that were removed in Kubernetes 1.25, migrate any affected components to use the new API version, and provide the administrator acknowledgment. For more information, see Preparing to update to OKD 4.12.

  • If you run an Operator or you have configured any application with a pod disruption budget, you might experience an interruption during the upgrade process. If minAvailable is set to 1 in a PodDisruptionBudget, the nodes are drained to apply pending machine configs, and the eviction process might be blocked. If several nodes are rebooted, all of the pods might run on only one node, and the PodDisruptionBudget field can prevent the node drain. See the example PodDisruptionBudget after this list.

  • When an update is failing to complete, the Cluster Version Operator (CVO) reports the status of any blocking components while attempting to reconcile the update. Rolling your cluster back to a previous version is not supported. If your update is failing to complete, contact Red Hat support.

  • Using the unsupportedConfigOverrides section to modify the configuration of an Operator is unsupported and might block cluster updates. You must remove this setting before you can update your cluster.
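
For illustration, a minimal PodDisruptionBudget of the kind described above might look like the following sketch; the name, namespace, and label selector are hypothetical:

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: example-pdb            # hypothetical name
      namespace: example-app       # hypothetical namespace
    spec:
      minAvailable: 1              # with minAvailable: 1 and few replicas, evictions during a node drain can be blocked
      selector:
        matchLabels:
          app: example-app         # hypothetical label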

Upgrading clusters with manually maintained credentials

The Cloud Credential Operator (CCO) Upgradeable status for a cluster with manually maintained credentials is False by default.

  • For minor releases, for example, from 4.11 to 4.12, this status prevents you from upgrading until you have addressed any updated permissions and annotated the CloudCredential resource to indicate that the permissions are updated as needed for the next version. This annotation changes the Upgradeable status to True.

  • For z-stream releases, for example, from 4.12.0 to 4.12.1, no permissions are added or changed, so the upgrade is not blocked.

Before upgrading a cluster with manually maintained credentials, you must create any new credentials for the release image that you are upgrading to. Additionally, you must review the required permissions for existing credentials and accommodate any new permissions requirements in the new release for those components.

Procedure

  1. Extract and examine the CredentialsRequest custom resource for the new release.

    The “Manually creating IAM” section of the installation content for your cloud provider explains how to obtain and use the credentials required for your cloud.
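
    As a sketch of one way to obtain those manifests, you can extract the CredentialsRequest objects from the release image that you are updating to by using oc adm release extract. The provider name, output directory, and release image pullspec below are placeholders:

    $ oc adm release extract --credentials-requests --cloud=<provider> --to=<credentials_requests_dir> <release_image>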

  2. Update the manually maintained credentials on your cluster:

    • Create new secrets for any CredentialsRequest custom resources that are added by the new release image.

    • If the CredentialsRequest custom resources for any existing credentials that are stored in secrets have changed their permissions requirements, update the permissions as required.

    If your cluster uses cluster capabilities to disable one or more optional components, delete the CredentialsRequest custom resources for any disabled components.
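
    These CredentialsRequest objects are stored in the openshift-cloud-credential-operator namespace. As a hedged example with a placeholder resource name, a disabled component's resource can be removed as follows:

    $ oc delete credentialsrequest <credentials_request_name> -n openshift-cloud-credential-operator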

  3. When all of the secrets are correct for the new release, indicate that the cluster is ready to upgrade:

    1. Log in to the OKD CLI as a user with the cluster-admin role.

    2. Edit the CloudCredential resource to add an upgradeable-to annotation within the metadata field:

      $ oc edit cloudcredential cluster

      Text to add

      ...
      metadata:
        annotations:
          cloudcredential.openshift.io/upgradeable-to: <version_number>
      ...

      Where <version_number> is the version you are upgrading to, in the format x.y.z. For example, 4.8.2 for OKD 4.8.2.

      It might take several minutes after adding the annotation for the Upgradeable status to change.
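
      If you prefer a non-interactive command, the same annotation can be added with oc annotate; this is one possible equivalent:

      $ oc annotate cloudcredential cluster cloudcredential.openshift.io/upgradeable-to=<version_number>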

Verification

  1. In the Administrator perspective of the web console, navigate to Administration → Cluster Settings.

  2. To view the CCO status details, click cloud-credential in the Cluster Operators list.

    1. If the Upgradeable status in the Conditions section is False, verify that the upgradeable-to annotation is free of typographical errors. When the Upgradeable status in the Conditions section is True, you can begin the OKD upgrade.
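
    Alternatively, you can read the same condition from the CLI; the following jsonpath expression is one possible form:

      $ oc get clusteroperator cloud-credential -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].status}'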

Pausing a MachineHealthCheck resource

During the upgrade process, nodes in the cluster might become temporarily unavailable. In the case of worker nodes, the machine health check might identify such nodes as unhealthy and reboot them. To avoid rebooting such nodes, pause all the MachineHealthCheck resources before updating the cluster.

Prerequisites

  • Install the OpenShift CLI (oc).

Procedure

  1. To list all the available MachineHealthCheck resources that you want to pause, run the following command:

    $ oc get machinehealthcheck -n openshift-machine-api

  2. To pause the machine health checks, add the cluster.x-k8s.io/paused="" annotation to the MachineHealthCheck resource. Run the following command:

    $ oc -n openshift-machine-api annotate mhc <mhc-name> cluster.x-k8s.io/paused=""

    The annotated MachineHealthCheck resource resembles the following YAML file:

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineHealthCheck
    metadata:
      name: example
      namespace: openshift-machine-api
      annotations:
        cluster.x-k8s.io/paused: ""
    spec:
      selector:
        matchLabels:
          role: worker
      unhealthyConditions:
      - type: "Ready"
        status: "Unknown"
        timeout: "300s"
      - type: "Ready"
        status: "False"
        timeout: "300s"
      maxUnhealthy: "40%"
    status:
      currentHealthy: 5
      expectedMachines: 5

    Resume the machine health checks after updating the cluster. To resume the check, remove the pause annotation from the MachineHealthCheck resource by running the following command:

    $ oc -n openshift-machine-api annotate mhc <mhc-name> cluster.x-k8s.io/paused-
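
    To confirm whether the pause annotation is present or has been removed, you can inspect the resource metadata; the jsonpath expression below is one possible form:

    $ oc get machinehealthcheck <mhc-name> -n openshift-machine-api -o jsonpath='{.metadata.annotations}'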

About updating single node OKD

You can update, or upgrade, a single-node OKD cluster by using either the console or CLI.

However, note the following limitations:

  • The prerequisite to pause the MachineHealthCheck resources is not required because there is no other node to perform the health check.

  • Restoring a single-node OKD cluster using an etcd backup is not officially supported. However, it is good practice to perform the etcd backup in case your upgrade fails; see the backup sketch at the end of this section. If your control plane is healthy, you might be able to restore your cluster to a previous state by using the backup.

  • Updating a single-node OKD cluster requires downtime and can include an automatic reboot. The amount of downtime depends on the update payload, as described in the following scenarios:

    • If the update payload contains an operating system update, which requires a reboot, the downtime is significant and impacts cluster management and user workloads.

    • If the update contains machine configuration changes that do not require a reboot, the downtime is less, and the impact on the cluster management and user workloads is lessened. In this case, the node draining step is skipped with single-node OKD because there is no other node in the cluster to reschedule the workloads to.

    • If the update payload does not contain an operating system update or machine configuration changes, a short API outage occurs and resolves quickly.

There are conditions, such as bugs in an updated package, that can cause the single node to not restart after a reboot. In this case, the update does not roll back automatically.
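
As a sketch of taking that etcd backup before you update, you can run the cluster-backup.sh script that ships on control plane nodes through a debug shell; the node name and backup directory below are placeholders:

  $ oc debug node/<node_name> -- chroot /host /usr/local/bin/cluster-backup.sh /home/core/assets/backup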

Updating a cluster by using the CLI

If updates are available, you can update your cluster by using the OpenShift CLI (oc).

You can find information about available OKD advisories and updates in the errata section of the Customer Portal.

Prerequisites

  • Install the OpenShift CLI (oc) that matches the version that you are updating to.

  • Log in to the cluster as a user with cluster-admin privileges.

  • Install the jq package.

  • Pause all MachineHealthCheck resources.

Procedure

  1. Ensure that your cluster is available:

    $ oc get clusterversion

    Example output

    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.9.23    True        False         158m    Cluster version is 4.9.23

  2. View the available updates and note the version number of the update that you want to apply:

    $ oc adm upgrade

    Example output

    Cluster version is 4.9.23

    Upstream is unset, so the cluster will use an appropriate default.
    Channel: stable-4.10 (available channels: candidate-4.10, candidate-4.9, fast-4.10, fast-4.9, stable-4.10, stable-4.9)

    Recommended updates:

      VERSION   IMAGE
      4.9.24    quay.io/openshift-release-dev/ocp-release@sha256:6a899c54dda6b844bb12a247e324a0f6cde367e880b73ba110c056df6d018032
      4.9.25    quay.io/openshift-release-dev/ocp-release@sha256:2eafde815e543b92f70839972f585cc52aa7c37aa72d5f3c8bc886b0fd45707a
      4.9.26    quay.io/openshift-release-dev/ocp-release@sha256:3ccd09dd08c303f27a543351f787d09b83979cd31cf0b4c6ff56cd68814ef6c8
      4.9.27    quay.io/openshift-release-dev/ocp-release@sha256:1c7db78eec0cf05df2cead44f69c0e4b2c3234d5635c88a41e1b922c3bedae16
      4.9.28    quay.io/openshift-release-dev/ocp-release@sha256:4084d94969b186e20189649b5affba7da59f7d1943e4e5bc7ef78b981eafb7a8
      4.9.29    quay.io/openshift-release-dev/ocp-release@sha256:b04ca01d116f0134a102a57f86c67e5b1a3b5da1c4a580af91d521b8fa0aa6ec
      4.9.31    quay.io/openshift-release-dev/ocp-release@sha256:2a28b8ebb53d67dd80594421c39e36d9896b1e65cb54af81fbb86ea9ac3bf2d7
      4.9.32    quay.io/openshift-release-dev/ocp-release@sha256:ecdb6d0df547b857eaf0edb5574ddd64ca6d9aff1fa61fd1ac6fb641203bedfa
      4.10.3    quay.io/openshift-release-dev/ocp-release@sha256:7ffe4cd612be27e355a640e5eec5cd8f923c1400d969fd590f806cffdaabcc56
      4.10.4    quay.io/openshift-release-dev/ocp-release@sha256:9f9c3aaca64f62af992bae5de1e984571c8b812f598b74c84dc630b064389fb7
      4.10.5    quay.io/openshift-release-dev/ocp-release@sha256:ee6a9c7a11f883e90489229f6c6dc78b434af12f5646f4f9411d73a98969f02a
      4.10.6    quay.io/openshift-release-dev/ocp-release@sha256:88b394e633e09dc23aa1f1a61ededd8e52478edf34b51a7dbbb21d9abde2511a
      4.10.8    quay.io/openshift-release-dev/ocp-release@sha256:0696e249622b4d07d8f4501504b6c568ed6ba92416176a01a12b7f1882707117
      4.10.9    quay.io/openshift-release-dev/ocp-release@sha256:39f360002b9b5c730d1167879ad6437352d51e72acc9fe80add3ec2a0d20400d
      4.10.10   quay.io/openshift-release-dev/ocp-release@sha256:39efe13ef67cb4449f5e6cdd8a26c83c07c6a2ce5d235dfbc3ba58c64418fcf3
      4.10.11   quay.io/openshift-release-dev/ocp-release@sha256:0dc1a4b4d9ea7954987f63e506474a4f0dc55e5f1ea5c1f6f1179e2c09eaffda
      4.10.12   quay.io/openshift-release-dev/ocp-release@sha256:f77f4f75c1e1a4ddd0a0355f298a834db3473fd9ca473235013e9419d1df16db
      4.10.13   quay.io/openshift-release-dev/ocp-release@sha256:4f516616baed3cf84585e753359f7ef2153ae139c2e80e0191902fbd073c4143

  3. Based on your organization requirements, set the upgrade channel to stable-4.12, fast-4.12, or eus-4.12:

    $ oc adm upgrade channel <channel>

    For example, to set the channel to stable-4.12:

    $ oc adm upgrade channel stable-4.12

    For production clusters, you must subscribe to a stable-*, eus-*, or fast-* channel.

  4. Apply an update:

    • To update to the latest version:

      $ oc adm upgrade --to-latest=true (1)

    • To update to a specific version:

      $ oc adm upgrade --to=<version> (1)

      (1) <version> is the update version that you obtained from the output of the oc adm upgrade command.

  5. Review the status of the Cluster Version Operator:

    $ oc get clusterversion -o json|jq ".items[0].spec"

    Example output

    {
      "channel": "stable-4.12",
      "clusterID": "990f7ab8-109b-4c95-8480-2bd1deec55ff",
      "desiredUpdate": {
        "force": false,
        "image": "quay.io/openshift-release-dev/ocp-release@sha256:9c5f0df8b192a0d7b46cd5f6a4da2289c155fd5302dec7954f8f06c878160b8b",
        "version": "<version>" (1)
      }
    }

    (1) If the version number in the desiredUpdate stanza matches the value that you specified, the update is in progress.

  6. Review the cluster version status history to monitor the status of the update. It might take some time for all the objects to finish updating.

    $ oc get clusterversion -o json|jq ".items[0].status.history"

    Example output

    [
      {
        "completionTime": null,
        "image": "quay.io/openshift-release-dev/ocp-release@sha256:b8fa13e09d869089fc5957c32b02b7d3792a0b6f36693432acc0409615ab23b7",
        "startedTime": "2021-01-28T20:30:50Z",
        "state": "Partial",
        "verified": true,
        "version": "4.10.13"
      },
      {
        "completionTime": "2021-01-28T20:30:50Z",
        "image": "quay.io/openshift-release-dev/ocp-release@sha256:b8fa13e09d869089fc5957c32b02b7d3792a0b6f36693432acc0409615ab23b7",
        "startedTime": "2021-01-28T17:38:10Z",
        "state": "Completed",
        "verified": false,
        "version": "4.9.23"
      }
    ]

    The history contains a list of the most recent versions applied to the cluster. This value is updated when the CVO applies an update. The list is ordered by date, where the newest update is first in the list. Updates in the history have state Completed if the rollout completed and Partial if the update failed or did not complete.
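
    While an update is in progress, you can also poll the overall status periodically. The standard watch utility is one simple way to do this; the 60-second interval is arbitrary:

    $ watch -n 60 oc get clusterversion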

  7. After the update completes, you can confirm that the cluster version has updated to the new version:

    $ oc get clusterversion

    Example output

    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.12      True        False         2m      Cluster version is 4.12

  8. If you are upgrading your cluster to the next minor version, for example, from version 4.y to 4.(y+1), it is recommended to confirm that your nodes are updated before deploying workloads that rely on a new feature:

    $ oc get nodes

    Example output

    NAME                           STATUS   ROLES    AGE   VERSION
    ip-10-0-168-251.ec2.internal   Ready    master   82m   v1.25.0
    ip-10-0-170-223.ec2.internal   Ready    master   82m   v1.25.0
    ip-10-0-179-95.ec2.internal    Ready    worker   70m   v1.25.0
    ip-10-0-182-134.ec2.internal   Ready    worker   70m   v1.25.0
    ip-10-0-211-16.ec2.internal    Ready    master   82m   v1.25.0
    ip-10-0-250-100.ec2.internal   Ready    worker   69m   v1.25.0
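
    Because the jq package is already a prerequisite for this procedure, one quick way to confirm that every node reports the same kubelet version is a sketch such as:

    $ oc get nodes -o json | jq -r '.items[].status.nodeInfo.kubeletVersion' | sort -u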

Updating along a conditional upgrade path

You can update along a recommended conditional upgrade path using the web console or the OpenShift CLI (oc). When a conditional update is not recommended for your cluster, you can update along a conditional upgrade path using the OpenShift CLI (oc) 4.10 or later.

Procedure

  1. To view the description of the update when it is not recommended because a risk might apply, run the following command:

    $ oc adm upgrade --include-not-recommended

  2. If the cluster administrator evaluates the potential known risks and decides that they are acceptable for the current cluster, then the administrator can waive the safety guards and proceed with the update by running the following command:

    $ oc adm upgrade --allow-not-recommended --to <version> (1)

    (1) <version> is the supported but not recommended update version that you obtained from the output of the previous command.
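
The risks attached to conditional updates are recorded in the ClusterVersion status. As a sketch, you can list them with jq; the conditionalUpdates field is populated by OKD 4.10 and later:

    $ oc get clusterversion version -o json | jq '.status.conditionalUpdates'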

Changing the update server by using the CLI

Changing the update server is optional. If you have an OpenShift Update Service (OSUS) installed and configured locally, you must set the URL for the server as the upstream to use the local server during updates. The default value for upstream is https://api.openshift.com/api/upgrades_info/v1/graph.

Procedure

  • Change the upstream parameter value in the cluster version:

    $ oc patch clusterversion/version --patch '{"spec":{"upstream":"<update-server-url>"}}' --type=merge

    The <update-server-url> variable specifies the URL for the update server.

    Example output

    clusterversion.config.openshift.io/version patched
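
    To confirm that the new value took effect, you can read the upstream field back; the following jsonpath expression is one possible form:

    $ oc get clusterversion version -o jsonpath='{.spec.upstream}'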