Controlling pod placement on nodes using node affinity rules

Affinity is a property of pods that controls the nodes on which they prefer to be scheduled.

In OKD, node affinity is a set of rules used by the scheduler to determine where a pod can be placed. The rules are defined using custom labels on the nodes and label selectors specified in pods.

Understanding node affinity

Node affinity allows a pod to specify an affinity towards a group of nodes it can be placed on. The node does not have control over the placement.

For example, you could configure a pod to only run on a node with a specific CPU or in a specific availability zone.

There are two types of node affinity rules: required and preferred.

Required rules must be met before a pod can be scheduled on a node. Preferred rules tell the scheduler to try to place the pod on a node that meets the rule, but placement on such a node is not guaranteed.

If the labels on a node change at runtime so that a node affinity rule on a pod is no longer met, the pod continues to run on the node.

You configure node affinity through the Pod spec file. You can specify a required rule, a preferred rule, or both. If you specify both, the node must first meet the required rule, and the scheduler then attempts to meet the preferred rule.
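As a minimal sketch of specifying both rule types in one Pod spec, the following file combines a required rule and a preferred rule. The pod name with-both-rules and the particular label keys and values are assumptions chosen for illustration; substitute labels that exist on your nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-both-rules   # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: only nodes whose label has one of these values are candidates.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: e2e-az-NorthSouth
            operator: In
            values:
            - e2e-az-North
            - e2e-az-South
      # Soft preference: among the candidates, favor nodes that carry this label.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: e2e-az-EastWest
            operator: In
            values:
            - e2e-az-East
  containers:
  - name: hello-pod
    image: docker.io/ocpqe/hello-pod
```

With this sketch, a node that lacks a matching e2e-az-NorthSouth label is never selected, while a node that meets the required rule but not the preferred one can still be selected.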

The following example is a Pod spec with a rule that requires the pod be placed on a node with a label whose key is e2e-az-NorthSouth and whose value is either e2e-az-North or e2e-az-South:

Example pod configuration file with a node affinity required rule

  apiVersion: v1
  kind: Pod
  metadata:
    name: with-node-affinity
  spec:
    affinity:
      nodeAffinity: # (1)
        requiredDuringSchedulingIgnoredDuringExecution: # (2)
          nodeSelectorTerms:
          - matchExpressions:
            - key: e2e-az-NorthSouth # (3)
              operator: In # (4)
              values:
              - e2e-az-North # (3)
              - e2e-az-South # (3)
    containers:
    - name: with-node-affinity
      image: docker.io/ocpqe/hello-pod
(1) The stanza to configure node affinity.
(2) Defines a required rule.
(3) The key/value pair (label) that must be matched to apply the rule.
(4) The operator represents the relationship between the label on the node and the set of values in the matchExpressions parameters in the Pod spec. This value can be In, NotIn, Exists, DoesNotExist, Lt, or Gt.

The following example is a Pod spec with a preferred rule stating that a node with a label whose key is e2e-az-EastWest and whose value is either e2e-az-East or e2e-az-West is preferred for the pod:

Example pod configuration file with a node affinity preferred rule

  apiVersion: v1
  kind: Pod
  metadata:
    name: with-node-affinity
  spec:
    affinity:
      nodeAffinity: # (1)
        preferredDuringSchedulingIgnoredDuringExecution: # (2)
        - weight: 1 # (3)
          preference:
            matchExpressions:
            - key: e2e-az-EastWest # (4)
              operator: In # (5)
              values:
              - e2e-az-East # (4)
              - e2e-az-West # (4)
    containers:
    - name: with-node-affinity
      image: docker.io/ocpqe/hello-pod
(1) The stanza to configure node affinity.
(2) Defines a preferred rule.
(3) Specifies a weight for a preferred rule. The node with the highest weight is preferred.
(4) The key/value pair (label) that must be matched to apply the rule.
(5) The operator represents the relationship between the label on the node and the set of values in the matchExpressions parameters in the Pod spec. This value can be In, NotIn, Exists, DoesNotExist, Lt, or Gt.

There is no explicit node anti-affinity concept, but using the NotIn or DoesNotExist operator replicates that behavior.
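As an illustrative sketch of this anti-affinity-like behavior, the following Pod spec fragment uses the NotIn operator to keep a pod off nodes that carry a particular label value. The label key and value are assumptions reused from the earlier example.

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          # Exclude nodes labeled e2e-az-NorthSouth=e2e-az-North.
          - key: e2e-az-NorthSouth
            operator: NotIn
            values:
            - e2e-az-North
```

Note that NotIn also matches nodes that do not have the e2e-az-NorthSouth key at all, so this rule only rules out the listed value.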

If you are using node affinity and node selectors in the same pod configuration, note the following:

  • If you configure both nodeSelector and nodeAffinity, both conditions must be satisfied for the pod to be scheduled onto a candidate node.

  • If you specify multiple nodeSelectorTerms associated with nodeAffinity types, then the pod can be scheduled onto a node if at least one of the nodeSelectorTerms is satisfied.

  • If you specify multiple matchExpressions associated with a single nodeSelectorTerms entry, then the pod can be scheduled onto a node only if all matchExpressions are satisfied.
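The three points above can be sketched in a single Pod spec fragment. The disktype label is a hypothetical example, and the zone labels are reused from the earlier examples only for illustration.

```yaml
spec:
  # nodeSelector must be satisfied in addition to nodeAffinity.
  nodeSelector:
    disktype: ssd   # hypothetical label
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # Term 1: both expressions must match the same node (AND).
        - matchExpressions:
          - key: e2e-az-NorthSouth
            operator: In
            values:
            - e2e-az-North
          - key: e2e-az-EastWest
            operator: In
            values:
            - e2e-az-East
        # Term 2: an alternative to term 1 (OR).
        - matchExpressions:
          - key: e2e-az-NorthSouth
            operator: In
            values:
            - e2e-az-South
```

Under these assumptions, a candidate node must have disktype=ssd and must be either a North and East node or a South node.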

Configuring a required node affinity rule

Required rules must be met before a pod can be scheduled on a node.

Procedure

The following steps demonstrate a simple configuration that labels a node and creates a pod that the scheduler is required to place on that node.

  1. Add a label to a node using the oc label node command:

    $ oc label node node1 e2e-az-name=e2e-az1
  2. In the Pod spec, use the nodeAffinity stanza to configure the requiredDuringSchedulingIgnoredDuringExecution parameter:

    1. Specify the key and values that must be met. If you want the new pod to be scheduled on the node you edited, use the same key and value parameters as the label in the node.

    2. Specify an operator. The operator can be In, NotIn, Exists, DoesNotExist, Lt, or Gt. For example, use the operator In to require the label to be in the node:

      Example Pod spec fragment

      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: e2e-az-name
                  operator: In
                  values:
                  - e2e-az1
                  - e2e-az2
  3. Create the pod:

    $ oc create -f e2e-az2.yaml
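The procedure above shows only the In operator. As a hedged sketch of two of the other operators, the following fragment uses Exists, which takes no values list, and Gt, which takes exactly one value that is interpreted as an integer. The cpu-count label is a hypothetical node label introduced for this illustration.

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          # Exists: the node must carry the key, with any value.
          - key: e2e-az-name
            operator: Exists
          # Gt: the node's label value, read as an integer, must be greater than 4.
          - key: cpu-count   # hypothetical label
            operator: Gt
            values:
            - "4"
```

The value for Gt is quoted because label values are strings in the Pod spec, even when they are compared numerically.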

Configuring a preferred node affinity rule

Preferred rules tell the scheduler to try to place the pod on a node that meets the rule, but placement on such a node is not guaranteed.

Procedure

The following steps demonstrate a simple configuration that labels a node and creates a pod that the scheduler tries to place on that node.

  1. Add a label to a node using the oc label node command:

    $ oc label node node1 e2e-az-name=e2e-az3
  2. In the Pod spec, use the nodeAffinity stanza to configure the preferredDuringSchedulingIgnoredDuringExecution parameter:

    1. Specify a weight for the node, as a number from 1 to 100. The node with the highest weight is preferred.

    2. Specify the key and values that must be met. If you want the new pod to be scheduled on the node you edited, use the same key and value parameters as the label in the node:

      spec:
        affinity:
          nodeAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                - key: e2e-az-name
                  operator: In
                  values:
                  - e2e-az3
    3. Specify an operator. The operator can be In, NotIn, Exists, DoesNotExist, Lt, or Gt. For example, use the In operator to prefer nodes whose label value is one of the listed values.

  3. Create the pod:

    $ oc create -f e2e-az3.yaml
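To illustrate how weights interact, the following sketch lists two preferred rules with different weights. For each candidate node, the scheduler adds up the weights of the preferred rules that the node satisfies and favors nodes with a higher total, alongside its other scoring criteria. The disktype label and the weight values 80 and 20 are assumptions chosen for illustration.

```yaml
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # Strong preference for the labeled availability zone.
      - weight: 80
        preference:
          matchExpressions:
          - key: e2e-az-name
            operator: In
            values:
            - e2e-az3
      # Weaker preference for nodes with a hypothetical disktype=ssd label.
      - weight: 20
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
```

In this sketch, a node matching both rules contributes 100 to the affinity score, a node matching only the zone rule contributes 80, and a node matching only the disk rule contributes 20.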

Sample node affinity rules

The following examples demonstrate node affinity.

Node affinity with matching labels

The following example demonstrates node affinity for a node and pod with matching labels:

  • The node1 node has the label zone:us:

    $ oc label node node1 zone=us
  • The pod-s1 pod has the zone and us key/value pair under a required node affinity rule:

    $ cat pod-s1.yaml

    Example output

    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-s1
    spec:
      containers:
        - image: "docker.io/ocpqe/hello-pod"
          name: hello-pod
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "zone"
                operator: In
                values:
                - us
  • The pod-s1 pod can be scheduled on node1:

    $ oc get pod -o wide

    Example output

    NAME     READY   STATUS    RESTARTS   AGE   IP    NODE
    pod-s1   1/1     Running   0          4m    IP1   node1

Node affinity with no matching labels

The following example demonstrates node affinity for a node and pod without matching labels:

  • The node1 node has the label zone:emea:

    $ oc label node node1 zone=emea
  • The pod-s1 pod has the zone and us key/value pair under a required node affinity rule:

    $ cat pod-s1.yaml

    Example output

    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-s1
    spec:
      containers:
        - image: "docker.io/ocpqe/hello-pod"
          name: hello-pod
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "zone"
                operator: In
                values:
                - us
  • The pod-s1 pod cannot be scheduled on node1:

    $ oc describe pod pod-s1

    Example output

    ...
    Events:
     FirstSeen  LastSeen  Count  From               SubObjectPath  Type     Reason
     ---------  --------  -----  ----               -------------  -------- ------
     1m         33s       8      default-scheduler                 Warning  FailedScheduling  No nodes are available that match all of the following predicates:: MatchNodeSelector (1).

Additional resources

For information about changing node labels, see Understanding how to update labels on nodes.