Managing machines with the Cluster API

Managing machines with the Cluster API is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

The Cluster API is an upstream project that is integrated into OKD as a Technology Preview for Amazon Web Services (AWS) and Google Cloud Platform (GCP). You can use the Cluster API to create and manage compute machine sets and compute machines in your OKD cluster. You can use this capability in addition to, or as an alternative to, managing machines with the Machine API.

For OKD 4 clusters, you can use the Cluster API to manage the provisioning of node hosts after the cluster installation finishes. This system enables an elastic, dynamic provisioning method on top of public or private cloud infrastructure.

With the Cluster API Technology Preview, you can create compute machines and compute machine sets on OKD clusters for supported providers. You can also explore the features that are enabled by this implementation that might not be available with the Machine API.

Benefits

By using the Cluster API, OKD users and developers gain the following advantages:

  • The option to use upstream community Cluster API infrastructure providers, which might not be supported by the Machine API.

  • The opportunity to collaborate with third parties who maintain machine controllers for infrastructure providers.

  • The ability to use the same set of Kubernetes tools for infrastructure management in OKD.

  • The ability to create Cluster API compute machine sets that support features that are not available with the Machine API.

Limitations

Using the Cluster API to manage machines is a Technology Preview feature and has the following limitations:

  • Only AWS and GCP clusters are supported.

  • To use this feature, you must enable the TechPreviewNoUpgrade feature set. Enabling this feature set cannot be undone and prevents minor version updates.

  • You must manually create the primary resources that the Cluster API requires.

  • You cannot manage control plane machines by using the Cluster API.

  • Migration of existing compute machine sets created by the Machine API to Cluster API compute machine sets is not supported.

  • Full feature parity with the Machine API is not available.
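
The feature set requirement in the list above is typically met by changing the cluster-scoped FeatureGate resource named cluster. The following is a minimal sketch, not an official procedure: the helper name feature_set_patch is hypothetical, and the oc patch call appears only as a comment because the change cannot be undone.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the JSON merge patch that sets the feature set.
feature_set_patch() {
  local feature_set="${1:-TechPreviewNoUpgrade}"
  printf '{"spec":{"featureSet":"%s"}}\n' "$feature_set"
}

# Print the patch so that you can review it before applying anything.
feature_set_patch

# To apply it (irreversible, and it prevents minor version updates), you could run:
#   oc patch featuregate cluster --type merge -p "$(feature_set_patch)"
```

Printing the patch first keeps the irreversible step explicit and separate from the review step.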

Cluster API architecture

The OKD integration of the upstream Cluster API is implemented and managed by the Cluster CAPI Operator. The Cluster CAPI Operator and its operands are provisioned in the openshift-cluster-api namespace, in contrast to the Machine API, which uses the openshift-machine-api namespace.

The Cluster CAPI Operator

The Cluster CAPI Operator is an OKD Operator that maintains the lifecycle of Cluster API resources. This Operator is responsible for all administrative tasks related to deploying the Cluster API project within an OKD cluster.

If a cluster is configured correctly to allow the use of the Cluster API, the Cluster CAPI Operator installs the Cluster API components on the cluster.

For more information, see the entry for the Cluster CAPI Operator in the Cluster Operators reference content.

Primary resources

The Cluster API consists of the following primary resources. For the Technology Preview of this feature, you must create these resources manually in the openshift-cluster-api namespace.

Cluster

A fundamental unit that represents a cluster that is managed by the Cluster API.

Infrastructure

A provider-specific resource that defines properties that are shared by all the compute machine sets in the cluster, such as the region and subnets.

Machine template

A provider-specific template that defines the properties of the machines that a compute machine set creates.

Machine set

A group of machines.

Compute machine sets are to machines as replica sets are to pods. If you need more machines or must scale them down, you change the replicas field on the compute machine set to meet your compute needs.

With the Cluster API, a compute machine set references a Cluster object and a provider-specific machine template.

Machine

A fundamental unit that describes the host for a node.

The Cluster API creates machines based on the configuration in the machine template.
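
As noted for compute machine sets above, scaling comes down to changing the replicas field. The following sketch builds a JSON merge patch for that field. The helper name replicas_patch is hypothetical, and the commented oc patch call assumes a Cluster API compute machine set in the openshift-cluster-api namespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a JSON merge patch that sets spec.replicas.
replicas_patch() {
  local replicas="$1"
  # Reject anything that is not a non-negative integer.
  case "$replicas" in
    ''|*[!0-9]*) echo "replicas must be a non-negative integer" >&2; return 1 ;;
  esac
  printf '{"spec":{"replicas":%s}}\n' "$replicas"
}

# Print a patch that requests three machines.
replicas_patch 3

# To apply it to a Cluster API compute machine set, you could run:
#   oc patch machinesets.cluster.x-k8s.io <machine_set_name> \
#     -n openshift-cluster-api --type merge -p "$(replicas_patch 3)"
```

The fully qualified resource name in the comment avoids ambiguity with Machine API compute machine sets.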

Sample YAML files

For the Cluster API Technology Preview, you must manually create the primary resources that the Cluster API requires. The following example YAML files show how these resources work together and how to configure the machines that they create with settings that are appropriate for your environment.

Sample YAML for a Cluster API cluster resource

The cluster resource defines the name and infrastructure provider for the cluster and is managed by the Cluster API. This resource has the same structure for all providers.

  apiVersion: cluster.x-k8s.io/v1beta1
  kind: Cluster
  metadata:
    name: <cluster_name> # (1)
    namespace: openshift-cluster-api
  spec:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: <infrastructure_kind> # (2)
      name: <cluster_name> # (1)
      namespace: openshift-cluster-api

(1) Specify the name of the cluster.
(2) Specify the infrastructure kind for the cluster. Valid values are:

  • AWSCluster: The cluster is running on Amazon Web Services (AWS).

  • GCPCluster: The cluster is running on Google Cloud Platform (GCP).
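
The cluster resource above can also be generated from its two inputs. The following sketch prints the manifest for a given cluster name and infrastructure kind. The helper name render_cluster_manifest and the cluster name example-cluster are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a Cluster manifest for the given cluster name
# and infrastructure kind.
render_cluster_manifest() {
  local cluster_name="$1" infra_kind="$2"
  # Only the two kinds named in this document are accepted.
  case "$infra_kind" in
    AWSCluster|GCPCluster) ;;
    *) echo "unsupported infrastructure kind: $infra_kind" >&2; return 1 ;;
  esac
  cat <<EOF
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: ${cluster_name}
  namespace: openshift-cluster-api
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: ${infra_kind}
    name: ${cluster_name}
    namespace: openshift-cluster-api
EOF
}

# Example with a made-up cluster name. Redirect the output to a file
# if you want to pass it to oc create -f.
render_cluster_manifest example-cluster AWSCluster
```

Rejecting unknown kinds early catches a typo before the manifest reaches the cluster.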

The remaining Cluster API resources are provider-specific. Refer to the example YAML files for your platform in the sections that follow.

Sample YAML files for configuring Amazon Web Services clusters

Some Cluster API resources are provider-specific. The following example YAML files show configurations for an Amazon Web Services (AWS) cluster.

Sample YAML for a Cluster API infrastructure resource on Amazon Web Services

The infrastructure resource is provider-specific and defines properties that are shared by all the compute machine sets in the cluster, such as the region and subnets. The compute machine set references this resource when creating machines.

  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  kind: AWSCluster # (1)
  metadata:
    name: <cluster_name> # (2)
    namespace: openshift-cluster-api
  spec:
    region: <region> # (3)

(1) Specify the infrastructure kind for the cluster. This value must match the value for your platform.
(2) Specify the name of the cluster.
(3) Specify the AWS region.

Sample YAML for a Cluster API machine template resource on Amazon Web Services

The machine template resource is provider-specific and defines the basic properties of the machines that a compute machine set creates. The compute machine set references this template when creating machines.

  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  kind: AWSMachineTemplate # (1)
  metadata:
    name: <template_name> # (2)
    namespace: openshift-cluster-api
  spec:
    template:
      spec: # (3)
        uncompressedUserData: true
        iamInstanceProfile: ....
        instanceType: m5.large
        cloudInit:
          insecureSkipSecretsManager: true
        ami:
          id: ....
        subnet:
          filters:
            - name: tag:Name
              values:
                - ...
        additionalSecurityGroups:
          - filters:
              - name: tag:Name
                values:
                  - ...

(1) Specify the machine template kind. This value must match the value for your platform.
(2) Specify a name for the machine template.
(3) Specify the details for your environment. The values here are examples.

Sample YAML for a Cluster API compute machine set resource on Amazon Web Services

The compute machine set resource defines additional properties of the machines that it creates. The compute machine set also references the infrastructure resource and machine template when creating machines.

  apiVersion: cluster.x-k8s.io/v1beta1
  kind: MachineSet
  metadata:
    name: <machine_set_name> # (1)
    namespace: openshift-cluster-api
  spec:
    clusterName: <cluster_name> # (2)
    replicas: 1
    selector:
      matchLabels:
        test: example
    template:
      metadata:
        labels:
          test: example
      spec:
        bootstrap:
          dataSecretName: worker-user-data # (3)
        clusterName: <cluster_name> # (2)
        infrastructureRef:
          apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
          kind: AWSMachineTemplate # (4)
          name: <cluster_name> # (2)

(1) Specify a name for the compute machine set.
(2) Specify the name of the cluster.
(3) For the Cluster API Technology Preview, the Operator can use the worker user data secret from the openshift-machine-api namespace.
(4) Specify the machine template kind. This value must match the value for your platform.

Sample YAML files for configuring Google Cloud Platform clusters

Some Cluster API resources are provider-specific. The following example YAML files show configurations for a Google Cloud Platform (GCP) cluster.

Sample YAML for a Cluster API infrastructure resource on Google Cloud Platform

The infrastructure resource is provider-specific and defines properties that are shared by all the compute machine sets in the cluster, such as the region and subnets. The compute machine set references this resource when creating machines.

  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  kind: GCPCluster # (1)
  metadata:
    name: <cluster_name> # (2)
  spec:
    network:
      name: <cluster_name>-network # (2)
    project: <project> # (3)
    region: <region> # (4)

(1) Specify the infrastructure kind for the cluster. This value must match the value for your platform.
(2) Specify the name of the cluster.
(3) Specify the GCP project name.
(4) Specify the GCP region.

Sample YAML for a Cluster API machine template resource on Google Cloud Platform

The machine template resource is provider-specific and defines the basic properties of the machines that a compute machine set creates. The compute machine set references this template when creating machines.

  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  kind: GCPMachineTemplate # (1)
  metadata:
    name: <template_name> # (2)
    namespace: openshift-cluster-api
  spec:
    template:
      spec: # (3)
        rootDeviceType: pd-ssd
        rootDeviceSize: 128
        instanceType: n1-standard-4
        image: projects/rhcos-cloud/global/images/rhcos-411-85-202203181601-0-gcp-x86-64
        subnet: <cluster_name>-worker-subnet
        serviceAccounts:
          email: <service_account_email_address>
          scopes:
            - https://www.googleapis.com/auth/cloud-platform
        additionalLabels:
          kubernetes-io-cluster-<cluster_name>: owned
        additionalNetworkTags:
          - <cluster_name>-worker
        ipForwarding: Disabled

(1) Specify the machine template kind. This value must match the value for your platform.
(2) Specify a name for the machine template.
(3) Specify the details for your environment. The values here are examples.

Sample YAML for a Cluster API compute machine set resource on Google Cloud Platform

The compute machine set resource defines additional properties of the machines that it creates. The compute machine set also references the infrastructure resource and machine template when creating machines.

  apiVersion: cluster.x-k8s.io/v1beta1
  kind: MachineSet
  metadata:
    name: <machine_set_name> # (1)
    namespace: openshift-cluster-api
  spec:
    clusterName: <cluster_name> # (2)
    replicas: 1
    selector:
      matchLabels:
        test: test
    template:
      metadata:
        labels:
          test: test
      spec:
        bootstrap:
          dataSecretName: worker-user-data # (3)
        clusterName: <cluster_name> # (2)
        infrastructureRef:
          apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
          kind: GCPMachineTemplate # (4)
          name: <machine_set_name> # (1)
        failureDomain: <failure_domain> # (5)

(1) Specify a name for the compute machine set.
(2) Specify the name of the cluster.
(3) For the Cluster API Technology Preview, the Operator can use the worker user data secret from the openshift-machine-api namespace.
(4) Specify the machine template kind. This value must match the value for your platform.
(5) Specify the failure domain within the GCP region.

Creating a Cluster API compute machine set

You can create compute machine sets that use the Cluster API to dynamically manage the machine compute resources for specific workloads of your choice.

Prerequisites

  • Deploy an OKD cluster.

  • Enable the use of the Cluster API.

  • Install the OpenShift CLI (oc).

  • Log in to oc as a user with cluster-admin permission.

Procedure

  1. Create a YAML file that contains the cluster custom resource (CR) and is named <cluster_resource_file>.yaml.

    If you are not sure which value to set for the <cluster_name> parameter, you can check the value for an existing Machine API compute machine set in your cluster.

    1. To list the Machine API compute machine sets, run the following command:

      $ oc get machinesets -n openshift-machine-api # (1)

      (1) Specify the openshift-machine-api namespace.

      Example output

      NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
      agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
      agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
      agl030519-vplxk-worker-us-east-1c   1         1         1       1           55m
      agl030519-vplxk-worker-us-east-1d   0         0                             55m
      agl030519-vplxk-worker-us-east-1e   0         0                             55m
      agl030519-vplxk-worker-us-east-1f   0         0                             55m

    2. To display the contents of a specific compute machine set CR, run the following command:

      $ oc get machineset <machineset_name> \
        -n openshift-machine-api \
        -o yaml

      Example output

      ...
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: agl030519-vplxk # (1)
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
            machine.openshift.io/cluster-api-machineset: agl030519-vplxk-worker-us-east-1a
      ...

      (1) The cluster ID, which you use for the <cluster_name> parameter.

  2. Create the cluster CR by running the following command:

    $ oc create -f <cluster_resource_file>.yaml

    Verification

    To confirm that the cluster CR is created, run the following command:

    $ oc get cluster

    Example output

    NAME             PHASE          AGE    VERSION
    <cluster_name>   Provisioning   4h6m

  3. Create a YAML file that contains the infrastructure CR and is named <infrastructure_resource_file>.yaml.

  4. Create the infrastructure CR by running the following command:

    $ oc create -f <infrastructure_resource_file>.yaml

    Verification

    To confirm that the infrastructure CR is created, run the following command:

    $ oc get <infrastructure_kind>

    where <infrastructure_kind> is the value that corresponds to your platform.

    Example output

    NAME             CLUSTER          READY   VPC   BASTION IP
    <cluster_name>   <cluster_name>   true

  5. Create a YAML file that contains the machine template CR and is named <machine_template_resource_file>.yaml.

  6. Create the machine template CR by running the following command:

    $ oc create -f <machine_template_resource_file>.yaml

    Verification

    To confirm that the machine template CR is created, run the following command:

    $ oc get <machine_template_kind>

    where <machine_template_kind> is the value that corresponds to your platform.

    Example output

    NAME              AGE
    <template_name>   77m

  7. Create a YAML file that contains the compute machine set CR and is named <machine_set_resource_file>.yaml.

  8. Create the compute machine set CR by running the following command:

    $ oc create -f <machine_set_resource_file>.yaml

    Verification

    To confirm that the compute machine set CR is created, run the following command:

    $ oc get machineset -n openshift-cluster-api # (1)

    (1) Specify the openshift-cluster-api namespace.

    Example output

    NAME                 CLUSTER          REPLICAS   READY   AVAILABLE   AGE   VERSION
    <machine_set_name>   <cluster_name>   1          1       1           17m

    When the new compute machine set is available, the REPLICAS and AVAILABLE values match. If the compute machine set is not available, wait a few minutes and run the command again.
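
Two checks from the procedure above lend themselves to small helpers: reading the cluster ID from the machine.openshift.io/cluster-api-cluster label, and comparing the desired and available replica counts. The following is a sketch only. The function names are hypothetical, and the commented oc calls assume that the compute machine set reports its counts in spec.replicas and status.availableReplicas.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads machine set YAML on stdin and prints the value of
# the machine.openshift.io/cluster-api-cluster label (the cluster ID).
cluster_id_from_yaml() {
  awk '$1 == "machine.openshift.io/cluster-api-cluster:" { print $2; exit }'
}

# Hypothetical helper: succeeds when the desired and available replica counts
# match. An empty or missing available count is treated as 0.
replicas_match() {
  local desired="$1" available="${2:-0}"
  [ "$desired" -eq "$available" ]
}

# Demonstration with the example label from this section.
printf '%s\n' \
  '      labels:' \
  '        machine.openshift.io/cluster-api-cluster: agl030519-vplxk' \
  | cluster_id_from_yaml

# Against a live cluster, the helpers could be fed like this:
#   oc get machineset <machineset_name> -n openshift-machine-api -o yaml \
#     | cluster_id_from_yaml
#   replicas_match \
#     "$(oc get machinesets.cluster.x-k8s.io <machine_set_name> \
#          -n openshift-cluster-api -o jsonpath='{.spec.replicas}')" \
#     "$(oc get machinesets.cluster.x-k8s.io <machine_set_name> \
#          -n openshift-cluster-api -o jsonpath='{.status.availableReplicas}')"
```

Keeping the parsing separate from the oc calls makes it possible to try the helpers on saved output before pointing them at a cluster.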

Verification

  • To verify that the compute machine set is creating machines according to your desired configuration, you can review the lists of machines and nodes in the cluster.

    • To view the list of Cluster API machines, run the following command:

      $ oc get machine -n openshift-cluster-api # (1)

      (1) Specify the openshift-cluster-api namespace.

      Example output

      NAME                             CLUSTER          NODENAME                                 PROVIDERID      PHASE     AGE     VERSION
      <machine_set_name>-<string_id>   <cluster_name>   <ip_address>.<region>.compute.internal   <provider_id>   Running   8m23s

    • To view the list of nodes, run the following command:

      $ oc get node

      Example output

      NAME                                       STATUS   ROLES    AGE     VERSION
      <ip_address_1>.<region>.compute.internal   Ready    worker   5h14m   v1.28.5
      <ip_address_2>.<region>.compute.internal   Ready    master   5h19m   v1.28.5
      <ip_address_3>.<region>.compute.internal   Ready    worker   7m      v1.28.5
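
When the machine list is long, it can help to filter for machines that have not yet reached the Running phase shown in the example output above. The following sketch assumes tab-separated name and phase pairs on standard input. The helper name machines_not_running and the machine names in the demonstration are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "name<TAB>phase" lines on stdin and prints the
# names of machines whose phase is not Running. Blank lines are skipped.
machines_not_running() {
  awk -F '\t' 'NF && $2 != "Running" { print $1 }'
}

# Demonstration with made-up machine names.
printf 'ms-a-x7k2p\tRunning\nms-a-q9d4w\tProvisioning\n' | machines_not_running

# Against a live cluster, the input could come from a command such as:
#   oc get machines.cluster.x-k8s.io -n openshift-cluster-api \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | machines_not_running
```

An empty result means that every listed machine reports the Running phase.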

Troubleshooting clusters that use the Cluster API

Use the information in this section to understand and recover from issues you might encounter. Generally, troubleshooting steps for problems with the Cluster API are similar to those for problems with the Machine API.

The Cluster CAPI Operator and its operands are provisioned in the openshift-cluster-api namespace, whereas the Machine API uses the openshift-machine-api namespace. When using oc commands that reference a namespace, be sure to reference the correct one.

CLI commands return Cluster API machines

For clusters that use the Cluster API, oc commands such as oc get machine return results for Cluster API machines. Because the letter c precedes the letter m alphabetically, Cluster API machines appear in the output before Machine API machines do.

  • To list only Machine API machines, use the fully qualified name machines.machine.openshift.io when running the oc get machine command:

    $ oc get machines.machine.openshift.io

  • To list only Cluster API machines, use the fully qualified name machines.cluster.x-k8s.io when running the oc get machine command:

    $ oc get machines.cluster.x-k8s.io
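
In scripts, looking up the fully qualified resource name and its namespace together removes the ambiguity described above. The following is a minimal sketch. The function names and the machine-api and cluster-api keys are hypothetical conveniences, not oc features.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps an API name to the fully qualified machine resource.
machine_resource() {
  case "$1" in
    machine-api) echo "machines.machine.openshift.io" ;;
    cluster-api) echo "machines.cluster.x-k8s.io" ;;
    *) echo "usage: machine_resource machine-api|cluster-api" >&2; return 1 ;;
  esac
}

# Hypothetical helper: maps an API name to the namespace that its resources use.
machine_namespace() {
  case "$1" in
    machine-api) echo "openshift-machine-api" ;;
    cluster-api) echo "openshift-cluster-api" ;;
    *) echo "usage: machine_namespace machine-api|cluster-api" >&2; return 1 ;;
  esac
}

# Print the command that lists only Cluster API machines.
echo "oc get $(machine_resource cluster-api) -n $(machine_namespace cluster-api)"
```

The script prints the command instead of running it, so it can be reviewed or pasted into a terminal that is logged in to the cluster.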