Expanding the cluster

After deploying an installer-provisioned OKD cluster, you can use the following procedures to expand the number of worker nodes. Ensure that each prospective worker node meets the prerequisites.

Expanding the cluster using Redfish Virtual Media requires each node to meet minimum firmware requirements. See Firmware requirements for installing with virtual media in the Prerequisites section for details.

Preparing the bare metal node

Expanding the cluster requires a DHCP server. Each node must have a DHCP reservation.

Reserving IP addresses so they become static IP addresses

Some administrators prefer static IP addresses so that each node’s IP address remains constant even when no DHCP server is available. To use static IP addresses in the OKD cluster, reserve the IP addresses in the DHCP server with an infinite lease. After the installer provisions the node successfully, the dispatcher script checks the node’s network configuration. If the script finds a DHCP infinite lease, it recreates the connection as a static IP connection using the IP address from that lease. NICs without DHCP infinite leases remain unmodified.

Preparing the bare metal node requires executing the following procedure from the provisioner node.

Procedure

  1. Get the oc binary, if needed. It should already exist on the provisioner node.

    [kni@provisioner ~]$ curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$VERSION/openshift-client-linux-$VERSION.tar.gz | tar zxvf - oc
    [kni@provisioner ~]$ sudo cp oc /usr/local/bin
  2. Power off the bare metal node via the baseboard management controller and ensure it is off.

  3. Retrieve the user name and password of the bare metal node’s baseboard management controller. Then, create base64 strings from the user name and password. In the following example, the user name is root and the password is calvin.

    [kni@provisioner ~]$ echo -ne "root" | base64
    [kni@provisioner ~]$ echo -ne "calvin" | base64
  4. Create a configuration file for the bare metal node.

    [kni@provisioner ~]$ vim bmh.yaml

    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: openshift-worker-<num>-bmc-secret
    type: Opaque
    data:
      username: <base64-of-uid>
      password: <base64-of-pwd>
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: openshift-worker-<num>
    spec:
      online: true
      bootMACAddress: <NIC1-mac-address>
      bmc:
        address: <protocol>://<bmc-ip>
        credentialsName: openshift-worker-<num>-bmc-secret

    Replace <num> with the worker number of the bare metal node in the two name fields and the credentialsName field. Replace <base64-of-uid> with the base64 string of the user name. Replace <base64-of-pwd> with the base64 string of the password. Replace <NIC1-mac-address> with the MAC address of the bare metal node’s first NIC.

    Replace <protocol> with the BMC protocol, such as IPMI or Redfish. Replace <bmc-ip> with the IP address of the bare metal node’s baseboard management controller. See the BMC addressing section for additional BMC configuration options.

    If the MAC address of an existing bare metal node matches the MAC address of the bare metal host that you are attempting to provision, the Ironic installation fails. If host enrollment, inspection, cleaning, or other Ironic steps fail, the Bare Metal Operator retries the installation continuously. See Diagnosing a duplicate MAC address when provisioning a new host in the cluster for more information.

  5. Create the bare metal node.

    [kni@provisioner ~]$ oc -n openshift-machine-api create -f bmh.yaml

    secret/openshift-worker-<num>-bmc-secret created
    baremetalhost.metal3.io/openshift-worker-<num> created

    Where <num> is the worker number.

  6. Power up and inspect the bare metal node.

    [kni@provisioner ~]$ oc -n openshift-machine-api get bmh openshift-worker-<num>

    Where <num> is the worker node number.

    NAME                     STATUS   PROVISIONING STATUS   CONSUMER   BMC                       HARDWARE PROFILE   ONLINE   ERROR
    openshift-worker-<num>   OK       ready                            ipmi://<out-of-band-ip>   unknown            true
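
The credential encoding and the bmh.yaml layout from the procedure above can be combined in a small shell function. This is a minimal sketch rather than part of the documented procedure: the function name render_bmh and its argument order are hypothetical, and it assumes a base64 utility is available on the provisioner node. It uses printf '%s' so that no trailing newline is encoded with the credentials.

```shell
# Hypothetical helper: prints a Secret and a BareMetalHost manifest to stdout.
# Usage: render_bmh <num> <bmc-user> <bmc-password> <boot-mac> <bmc-address>
render_bmh() {
  local num="$1" user="$2" pass="$3" mac="$4" bmc="$5"
  local user_b64 pass_b64
  # Encode without a trailing newline; tr removes any line wrapping.
  user_b64="$(printf '%s' "$user" | base64 | tr -d '\n')"
  pass_b64="$(printf '%s' "$pass" | base64 | tr -d '\n')"
  cat <<EOF
---
apiVersion: v1
kind: Secret
metadata:
  name: openshift-worker-${num}-bmc-secret
type: Opaque
data:
  username: ${user_b64}
  password: ${pass_b64}
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-worker-${num}
spec:
  online: true
  bootMACAddress: ${mac}
  bmc:
    address: ${bmc}
    credentialsName: openshift-worker-${num}-bmc-secret
EOF
}
```

For example, render_bmh 2 root calvin <NIC1-mac-address> <protocol>://<bmc-ip> > bmh.yaml would write a file that can then be passed to oc create -f as in step 5.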

Preparing to deploy with Virtual Media on the baremetal network

If the provisioning network is enabled and you want to expand the cluster using Virtual Media on the baremetal network, use the following procedure.

Prerequisites

  • There is an existing cluster with a baremetal network and a provisioning network.

Procedure

  1. Edit the provisioning custom resource (CR) to enable deploying with Virtual Media on the baremetal network:

    $ oc edit provisioning

    apiVersion: metal3.io/v1alpha1
    kind: Provisioning
    metadata:
      creationTimestamp: "2021-08-05T18:51:50Z"
      finalizers:
      - provisioning.metal3.io
      generation: 8
      name: provisioning-configuration
      resourceVersion: "551591"
      uid: f76e956f-24c6-4361-aa5b-feaf72c5b526
    spec:
      preProvisioningOSDownloadURLs: {}
      provisioningDHCPRange: 172.22.0.10,172.22.0.254
      provisioningIP: 172.22.0.3
      provisioningInterface: enp1s0
      provisioningNetwork: Managed
      provisioningNetworkCIDR: 172.22.0.0/24
      provisioningOSDownloadURL: http://192.168.111.1/images/rhcos-<version>.x86_64.qcow2.gz?sha256=<sha256>
      virtualMediaViaExternalNetwork: true # (1)
    status:
      generations:
      - group: apps
        hash: ""
        lastGeneration: 7
        name: metal3
        namespace: openshift-machine-api
        resource: deployments
      - group: apps
        hash: ""
        lastGeneration: 1
        name: metal3-image-cache
        namespace: openshift-machine-api
        resource: daemonsets
      observedGeneration: 8
      readyReplicas: 0

    (1) Add virtualMediaViaExternalNetwork: true to the provisioning CR.
  2. Edit the machineset to use the API VIP address:

    $ oc edit machineset

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      creationTimestamp: "2021-08-05T18:51:52Z"
      generation: 11
      labels:
        machine.openshift.io/cluster-api-cluster: ostest-hwmdt
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
      name: ostest-hwmdt-worker-0
      namespace: openshift-machine-api
      resourceVersion: "551513"
      uid: fad1c6e0-b9da-4d4a-8d73-286f78788931
    spec:
      replicas: 2
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: ostest-hwmdt
          machine.openshift.io/cluster-api-machineset: ostest-hwmdt-worker-0
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: ostest-hwmdt
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
            machine.openshift.io/cluster-api-machineset: ostest-hwmdt-worker-0
        spec:
          metadata: {}
          providerSpec:
            value:
              apiVersion: baremetal.cluster.k8s.io/v1alpha1
              hostSelector: {}
              image:
                checksum: http://172.22.0.3:6181/images/rhcos-<version>.x86_64.qcow2.<md5sum> # (1)
                url: http://172.22.0.3:6181/images/rhcos-<version>.x86_64.qcow2 # (2)
              kind: BareMetalMachineProviderSpec
              metadata:
                creationTimestamp: null
              userData:
                name: worker-user-data
    status:
      availableReplicas: 2
      fullyLabeledReplicas: 2
      observedGeneration: 11
      readyReplicas: 2
      replicas: 2

    (1) Edit the checksum URL to use the API VIP address.
    (2) Edit the url URL to use the API VIP address.
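
The two edits above can also be scripted. The following is a sketch under stated assumptions, not the documented method: both function names are hypothetical, use_api_vip assumes the image URLs use an IPv4 address or host name (an IPv6 address would need bracket handling), and it keeps the original port and path because the procedure only calls for changing the address. The oc patch call sets the same virtualMediaViaExternalNetwork field on the provisioning-configuration resource shown in step 1.

```shell
# Hypothetical helper: swaps the host part of an http(s) URL for the API VIP,
# keeping the scheme, port, and path unchanged.
# Usage: use_api_vip <url> <api-vip>
use_api_vip() {
  printf '%s\n' "$1" | sed -E "s#^(https?://)[^:/]+#\\1$2#"
}

# Hypothetical helper: non-interactive alternative to "oc edit provisioning".
enable_virtual_media_via_external_network() {
  oc patch provisioning provisioning-configuration --type merge \
    -p '{"spec":{"virtualMediaViaExternalNetwork":true}}'
}
```

For example, use_api_vip http://172.22.0.3:6181/images/rhcos-<version>.x86_64.qcow2 <api-vip> prints the value to place in the url field of the machine set.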

Diagnosing a duplicate MAC address when provisioning a new host in the cluster

If the MAC address of an existing bare-metal node in the cluster matches the MAC address of a bare-metal host you are attempting to add to the cluster, the Bare Metal Operator associates the host with the existing node. If the host enrollment, inspection, cleaning, or other Ironic steps fail, the Bare Metal Operator retries the installation continuously. A registration error is displayed for the failed bare-metal host.

You can diagnose a duplicate MAC address by examining the bare-metal hosts that are running in the openshift-machine-api namespace.

Prerequisites

  • Install an OKD cluster on bare metal.

  • Install the OKD CLI oc.

  • Log in as a user with cluster-admin privileges.

Procedure

To determine whether a bare-metal host that fails provisioning has the same MAC address as an existing node, do the following:

  1. Get the bare-metal hosts running in the openshift-machine-api namespace:

    $ oc get bmh -n openshift-machine-api

    Example output

    NAME                 STATUS   PROVISIONING STATUS      CONSUMER
    openshift-master-0   OK       externally provisioned   openshift-zpwpq-master-0
    openshift-master-1   OK       externally provisioned   openshift-zpwpq-master-1
    openshift-master-2   OK       externally provisioned   openshift-zpwpq-master-2
    openshift-worker-0   OK       provisioned              openshift-zpwpq-worker-0-lv84n
    openshift-worker-1   OK       provisioned              openshift-zpwpq-worker-0-zd8lm
    openshift-worker-2   error    registering
  2. To see more detailed information about the status of the failing host, run the following command, replacing <bare_metal_host_name> with the name of the host:

    $ oc get -n openshift-machine-api bmh <bare_metal_host_name> -o yaml

    Example output

    ...
    status:
      errorCount: 12
      errorMessage: MAC address b4:96:91:1d:7c:20 conflicts with existing node openshift-worker-1
      errorType: registration error
    ...
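
As a cross-check alongside the error message, you can compare the bootMACAddress values of all BareMetalHost resources directly. The following is a minimal sketch: the function name find_duplicate_macs is hypothetical, and it assumes its input is one "<mac> <host-name>" pair per line. It compares addresses case-insensitively and prints each MAC address that appears more than once, followed by the hosts that use it.

```shell
# Hypothetical helper: reads "<mac> <host-name>" lines on stdin and prints
# every MAC address that is used by more than one host.
find_duplicate_macs() {
  awk '
    NF < 2 { next }                      # skip hosts without a MAC address
    {
      mac = tolower($1)
      if (mac in hosts) hosts[mac] = hosts[mac] " " $2
      else hosts[mac] = $2
      count[mac]++
    }
    END {
      for (mac in count)
        if (count[mac] > 1) print mac ": " hosts[mac]
    }'
}

# Possible usage against a live cluster:
#   oc get bmh -n openshift-machine-api \
#     -o jsonpath='{range .items[*]}{.spec.bootMACAddress}{" "}{.metadata.name}{"\n"}{end}' \
#     | find_duplicate_macs
```

An empty result means that no two BareMetalHost resources share a boot MAC address. The conflict could still involve a node that is not represented in this list, so treat the errorMessage field as the primary source.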

Provisioning the bare metal node

Provisioning the bare metal node requires executing the following procedure from the provisioner node.

Procedure

  1. Ensure the PROVISIONING STATUS is ready before provisioning the bare metal node.

    $ oc -n openshift-machine-api get bmh openshift-worker-<num>

    Where <num> is the worker node number.

    NAME                     STATUS   PROVISIONING STATUS   CONSUMER   BMC                       HARDWARE PROFILE   ONLINE   ERROR
    openshift-worker-<num>   OK       ready                            ipmi://<out-of-band-ip>   unknown            true
  2. Get a count of the number of worker nodes.

    $ oc get nodes

    NAME                                       STATUS   ROLES    AGE   VERSION
    provisioner.openshift.example.com          Ready    master   30h   v1.22.1
    openshift-master-1.openshift.example.com   Ready    master   30h   v1.22.1
    openshift-master-2.openshift.example.com   Ready    master   30h   v1.22.1
    openshift-master-3.openshift.example.com   Ready    master   30h   v1.22.1
    openshift-worker-0.openshift.example.com   Ready    worker   30h   v1.22.1
    openshift-worker-1.openshift.example.com   Ready    worker   30h   v1.22.1
  3. Get the machine set.

    $ oc get machinesets -n openshift-machine-api

    NAME                             DESIRED   CURRENT   READY   AVAILABLE   AGE
    ...
    openshift-worker-0.example.com   1         1         1       1           55m
    openshift-worker-1.example.com   1         1         1       1           55m
  4. Increase the number of worker nodes by one.

    $ oc scale --replicas=<num> machineset <machineset> -n openshift-machine-api

    Replace <num> with the new number of worker nodes. Replace <machineset> with the name of the machine set from the previous step.

  5. Check the status of the bare metal node.

    $ oc -n openshift-machine-api get bmh openshift-worker-<num>

    Where <num> is the worker node number. The status changes from ready to provisioning.

    NAME                     STATUS   PROVISIONING STATUS   CONSUMER                       BMC                       HARDWARE PROFILE   ONLINE   ERROR
    openshift-worker-<num>   OK       provisioning          openshift-worker-<num>-65tjz   ipmi://<out-of-band-ip>   unknown            true

    The provisioning status remains until the OKD cluster provisions the node. This can take 30 minutes or more. After the node is provisioned, the status changes to provisioned.

    NAME                     STATUS   PROVISIONING STATUS   CONSUMER                       BMC                       HARDWARE PROFILE   ONLINE   ERROR
    openshift-worker-<num>   OK       provisioned           openshift-worker-<num>-65tjz   ipmi://<out-of-band-ip>   unknown            true
  6. After provisioning completes, ensure the bare metal node is ready.

    $ oc get nodes

    NAME                                           STATUS   ROLES    AGE     VERSION
    provisioner.openshift.example.com              Ready    master   30h     v1.22.1
    openshift-master-1.openshift.example.com       Ready    master   30h     v1.22.1
    openshift-master-2.openshift.example.com       Ready    master   30h     v1.22.1
    openshift-master-3.openshift.example.com       Ready    master   30h     v1.22.1
    openshift-worker-0.openshift.example.com       Ready    worker   30h     v1.22.1
    openshift-worker-1.openshift.example.com       Ready    worker   30h     v1.22.1
    openshift-worker-<num>.openshift.example.com   Ready    worker   3m27s   v1.22.1

    You can also check the kubelet.

    $ ssh openshift-worker-<num>
    [kni@openshift-worker-<num>]$ journalctl -fu kubelet
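
Because provisioning can take 30 minutes or more, the scale-up and status checks above lend themselves to a small script. The following is a sketch under assumptions, not part of the documented procedure: the function names, the default of 90 attempts, and the 30-second interval are all hypothetical choices. It reads the state from the status.provisioning.state field of the BareMetalHost resource, which is assumed to correspond to the PROVISIONING STATUS column.

```shell
# Hypothetical helper: prints the machine set's current replica count plus one.
# Usage: next_replica_count <machineset>
next_replica_count() {
  local current
  current="$(oc get machineset "$1" -n openshift-machine-api \
    -o jsonpath='{.spec.replicas}')" || return 1
  echo $((current + 1))
}

# Hypothetical helper: polls a BareMetalHost until it reaches the wanted state.
# Usage: wait_for_bmh_state <host> <state> [attempts] [interval-seconds]
wait_for_bmh_state() {
  local host="$1" want="$2" attempts="${3:-90}" interval="${4:-30}"
  local state="" i=0
  while [ "$i" -lt "$attempts" ]; do
    state="$(oc get bmh "$host" -n openshift-machine-api \
      -o jsonpath='{.status.provisioning.state}')"
    [ "$state" = "$want" ] && return 0
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for $host to reach $want (last state: $state)" >&2
  return 1
}
```

A possible sequence is to run oc scale --replicas="$(next_replica_count <machineset>)" machineset <machineset> -n openshift-machine-api, followed by wait_for_bmh_state openshift-worker-<num> provisioned, and then oc get nodes to confirm that the new node is Ready.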