Viewing and listing the nodes in your OKD cluster

You can list all the nodes in your cluster to obtain information such as status, age, memory usage, and details about the nodes.

When you perform node management operations, the CLI interacts with node objects that are representations of actual node hosts. The master uses the information from node objects to validate nodes with health checks.

About listing all the nodes in a cluster

You can get detailed information on the nodes in the cluster.

  • The following command lists all nodes:

    $ oc get nodes

    The following example is a cluster with healthy nodes:

    $ oc get nodes

    Example output

    NAME                 STATUS   ROLES    AGE   VERSION
    master.example.com   Ready    master   7h    v1.23.0
    node1.example.com    Ready    worker   7h    v1.23.0
    node2.example.com    Ready    worker   7h    v1.23.0

    The following example is a cluster with one unhealthy node:

    $ oc get nodes

    Example output

    NAME                 STATUS                        ROLES    AGE   VERSION
    master.example.com   Ready                         master   7h    v1.23.0
    node1.example.com    NotReady,SchedulingDisabled   worker   7h    v1.23.0
    node2.example.com    Ready                         worker   7h    v1.23.0

    The conditions that trigger a NotReady status are shown later in this section.

  • The -o wide option provides additional information on nodes.

    $ oc get nodes -o wide

    Example output

    NAME                 STATUS   ROLES    AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
    master.example.com   Ready    master   171m   v1.23.0   10.0.129.108   <none>        Red Hat Enterprise Linux CoreOS 48.83.202103210901-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.23.0-30.rhaos4.10.gitf2f339d.el8-dev
    node1.example.com    Ready    worker   72m    v1.23.0   10.0.129.222   <none>        Red Hat Enterprise Linux CoreOS 48.83.202103210901-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.23.0-30.rhaos4.10.gitf2f339d.el8-dev
    node2.example.com    Ready    worker   164m   v1.23.0   10.0.142.150   <none>        Red Hat Enterprise Linux CoreOS 48.83.202103210901-0 (Ootpa)   4.18.0-240.15.1.el8_3.x86_64   cri-o://1.23.0-30.rhaos4.10.gitf2f339d.el8-dev

  • The following command lists information about a single node:

    $ oc get node <node>

    For example:

    $ oc get node node1.example.com

    Example output

    NAME                STATUS   ROLES    AGE   VERSION
    node1.example.com   Ready    worker   7h    v1.23.0

  • The following command provides more detailed information about a specific node, including the reason for the current condition:

    $ oc describe node <node>

    For example:

    $ oc describe node node1.example.com

    Example output

    Name:               node1.example.com (1)
    Roles:              worker (2)
    Labels:             beta.kubernetes.io/arch=amd64 (3)
                        beta.kubernetes.io/instance-type=m4.large
                        beta.kubernetes.io/os=linux
                        failure-domain.beta.kubernetes.io/region=us-east-2
                        failure-domain.beta.kubernetes.io/zone=us-east-2a
                        kubernetes.io/hostname=ip-10-0-140-16
                        node-role.kubernetes.io/worker=
    Annotations:        cluster.k8s.io/machine: openshift-machine-api/ahardin-worker-us-east-2a-q5dzc (4)
                        machineconfiguration.openshift.io/currentConfig: worker-309c228e8b3a92e2235edd544c62fea8
                        machineconfiguration.openshift.io/desiredConfig: worker-309c228e8b3a92e2235edd544c62fea8
                        machineconfiguration.openshift.io/state: Done
                        volumes.kubernetes.io/controller-managed-attach-detach: true
    CreationTimestamp:  Wed, 13 Feb 2019 11:05:57 -0500
    Taints:             <none> (5)
    Unschedulable:      false
    Conditions: (6)
      Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
      ----             ------  -----------------                 ------------------                ------                       -------
      OutOfDisk        False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientDisk     kubelet has sufficient disk space available
      MemoryPressure   False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
      DiskPressure     False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
      PIDPressure      False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
      Ready            True    Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:07:09 -0500   KubeletReady                 kubelet is posting ready status
    Addresses: (7)
      InternalIP:   10.0.140.16
      InternalDNS:  ip-10-0-140-16.us-east-2.compute.internal
      Hostname:     ip-10-0-140-16.us-east-2.compute.internal
    Capacity: (8)
      attachable-volumes-aws-ebs:  39
      cpu:                         2
      hugepages-1Gi:               0
      hugepages-2Mi:               0
      memory:                      8172516Ki
      pods:                        250
    Allocatable:
      attachable-volumes-aws-ebs:  39
      cpu:                         1500m
      hugepages-1Gi:               0
      hugepages-2Mi:               0
      memory:                      7558116Ki
      pods:                        250
    System Info: (9)
      Machine ID:                 63787c9534c24fde9a0cde35c13f1f66
      System UUID:                EC22BF97-A006-4A58-6AF8-0A38DEEA122A
      Boot ID:                    f24ad37d-2594-46b4-8830-7f7555918325
      Kernel Version:             3.10.0-957.5.1.el7.x86_64
      OS Image:                   Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)
      Operating System:           linux
      Architecture:               amd64
      Container Runtime Version:  cri-o://1.16.0-0.6.dev.rhaos4.3.git9ad059b.el8-rc2
      Kubelet Version:            v1.23.0
      Kube-Proxy Version:         v1.23.0
    PodCIDR:              10.128.4.0/24
    ProviderID:           aws:///us-east-2a/i-04e87b31dc6b3e171
    Non-terminated Pods:  (13 in total) (10)
      Namespace                               Name                                 CPU Requests  CPU Limits  Memory Requests  Memory Limits
      ---------                               ----                                 ------------  ----------  ---------------  -------------
      openshift-cluster-node-tuning-operator  tuned-hdl5q                          0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-dns                           dns-default-l69zr                    0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-image-registry                node-ca-9hmcg                        0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-ingress                       router-default-76455c45c-c5ptv       0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-machine-config-operator       machine-config-daemon-cvqw9          20m (1%)      0 (0%)      50Mi (0%)        0 (0%)
      openshift-marketplace                   community-operators-f67fh            0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-monitoring                    alertmanager-main-0                  50m (3%)      50m (3%)    210Mi (2%)       10Mi (0%)
      openshift-monitoring                    grafana-78765ddcc7-hnjmm             100m (6%)     200m (13%)  100Mi (1%)       200Mi (2%)
      openshift-monitoring                    node-exporter-l7q8d                  10m (0%)      20m (1%)    20Mi (0%)        40Mi (0%)
      openshift-monitoring                    prometheus-adapter-75d769c874-hvb85  0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-multus                        multus-kw8w5                         0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-sdn                           ovs-t4dsn                            100m (6%)     0 (0%)      300Mi (4%)       0 (0%)
      openshift-sdn                           sdn-g79hg                            100m (6%)     0 (0%)      200Mi (2%)       0 (0%)
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource                    Requests     Limits
      --------                    --------     ------
      cpu                         380m (25%)   270m (18%)
      memory                      880Mi (11%)  250Mi (3%)
      attachable-volumes-aws-ebs  0            0
    Events: (11)
      Type    Reason                   Age              From                      Message
      ----    ------                   ----             ----                      -------
      Normal  NodeHasSufficientPID     6d (x5 over 6d)  kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientPID
      Normal  NodeAllocatableEnforced  6d               kubelet, m01.example.com  Updated Node Allocatable limit across pods
      Normal  NodeHasSufficientMemory  6d (x6 over 6d)  kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientMemory
      Normal  NodeHasNoDiskPressure    6d (x6 over 6d)  kubelet, m01.example.com  Node m01.example.com status is now: NodeHasNoDiskPressure
      Normal  NodeHasSufficientDisk    6d (x6 over 6d)  kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientDisk
      Normal  NodeHasSufficientPID     6d               kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientPID
      Normal  Starting                 6d               kubelet, m01.example.com  Starting kubelet.
    ...

    (1) The name of the node.
    (2) The role of the node, either master or worker.
    (3) The labels applied to the node.
    (4) The annotations applied to the node.
    (5) The taints applied to the node.
    (6) The node conditions and status. The conditions stanza lists the Ready, PIDPressure, MemoryPressure, DiskPressure, and OutOfDisk status. These conditions are described later in this section.
    (7) The IP address and hostname of the node.
    (8) The pod resources and allocatable resources.
    (9) Information about the node host.
    (10) The pods on the node.
    (11) The events reported by the node.
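
The STATUS column shown in the `oc get nodes` examples above lends itself to simple filtering when a cluster has many nodes. The following is a minimal sketch, not part of the product: the function name `not_ready_nodes` is hypothetical, and it assumes the default column layout shown earlier (NAME first, STATUS second). A node reported as `Ready,SchedulingDisabled` is treated as healthy because its status still begins with `Ready`.

```shell
# Print the names of nodes whose STATUS does not begin with "Ready".
# Reads `oc get nodes` output (header line included) on stdin.
# Hypothetical helper; assumes the default NAME/STATUS column order.
not_ready_nodes() {
  # NR > 1 skips the header row; $2 is the STATUS column.
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get nodes | not_ready_nodes
```

Each reported name can then be passed to `oc describe node <node>` to find the reason for the current condition.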

Among the information shown for nodes, the following node conditions appear in the output of the commands shown in this section:

Table 1. Node Conditions
Condition            Description

Ready                If true, the node is healthy and ready to accept pods. If false, the node is not healthy and is not accepting pods. If unknown, the node controller has not received a heartbeat from the node within the node-monitor-grace-period (the default is 40 seconds).

DiskPressure         If true, the disk capacity is low.

MemoryPressure       If true, the node memory is low.

PIDPressure          If true, there are too many processes on the node.

OutOfDisk            If true, the node has insufficient free space for adding new pods.

NetworkUnavailable   If true, the network for the node is not correctly configured.

NotReady             If true, one of the underlying components, such as the container runtime or network, is experiencing issues or is not yet configured.

SchedulingDisabled   Pods cannot be scheduled for placement on the node.
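
To apply the table above in a script, the condition types and statuses of a node can be reduced to `Type=Status` lines and checked against the expected values. The sketch below is illustrative only: the function name `unhealthy_conditions` is hypothetical, and it assumes that `Ready` should be `True` and that every other condition should be `False`.

```shell
# Read "Type=Status" lines on stdin and print each condition that signals a problem.
# Assumption: Ready is expected to be True; all other conditions are expected to be False.
unhealthy_conditions() {
  awk -F= '
    NF < 2                         { next }              # skip blank or malformed lines
    $1 == "Ready" && $2 != "True"  { print $1 " is " $2 }
    $1 != "Ready" && $2 != "False" { print $1 " is " $2 }
  '
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get node <node> \
#     -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' \
#     | unhealthy_conditions
```

An empty result means that no condition deviates from the expected values under these assumptions.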

Listing pods on a node in your cluster

You can list all the pods on a specific node.

Procedure

  • To list all or selected pods on one or more nodes:

    $ oc describe node <node1> <node2>

    For example:

    $ oc describe node ip-10-0-128-218.ec2.internal

  • To list all or selected pods on selected nodes:

    $ oc describe node --selector=<node_selector>

    For example:

    $ oc describe node --selector=kubernetes.io/os

    Or:

    $ oc describe node -l=<node_selector>

    For example:

    $ oc describe node -l node-role.kubernetes.io/worker

  • To list all pods on a specific node, including terminated pods:

    $ oc get pod --all-namespaces --field-selector=spec.nodeName=<nodename>
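
The per-node pod listing above can be summarized by namespace, which helps when a node hosts many pods. This is a rough sketch under stated assumptions: the function name `pods_per_namespace` is hypothetical, and it relies on the namespace appearing in the first column, as it does when `--all-namespaces` is used.

```shell
# Count pods per namespace from `oc get pod --all-namespaces` output read on stdin.
# Hypothetical helper; assumes the first column is NAMESPACE and the first row is a header.
pods_per_namespace() {
  awk 'NR > 1 { count[$1]++ } END { for (ns in count) print count[ns], ns }' |
    sort -k1,1nr -k2,2   # highest count first, then namespace name
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get pod --all-namespaces --field-selector=spec.nodeName=<nodename> | pods_per_namespace
```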

Viewing memory and CPU usage statistics on your nodes

You can display usage statistics about nodes, which provide the runtime environments for containers. These usage statistics include CPU, memory, and storage consumption.

Prerequisites

  • You must have cluster-reader permission to view the usage statistics.

  • Metrics must be installed to view the usage statistics.

Procedure

  • To view the usage statistics:

    $ oc adm top nodes

    Example output

    NAME                                   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    ip-10-0-12-143.ec2.compute.internal    1503m        100%   4533Mi          61%
    ip-10-0-132-16.ec2.compute.internal    76m          5%     1391Mi          18%
    ip-10-0-140-137.ec2.compute.internal   398m         26%    2473Mi          33%
    ip-10-0-142-44.ec2.compute.internal    656m         43%    6119Mi          82%
    ip-10-0-146-165.ec2.compute.internal   188m         12%    3367Mi          45%
    ip-10-0-19-62.ec2.compute.internal     896m         59%    5754Mi          77%
    ip-10-0-44-193.ec2.compute.internal    632m         42%    5349Mi          72%

  • To view the usage statistics for nodes with labels:

    $ oc adm top node --selector=''

    You must supply the selector (label query) to filter on. The selector supports the =, ==, and != operators.
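
The usage table can also be filtered to show only heavily loaded nodes. The following sketch is one possible approach, not a built-in feature: the function name `top_memory_nodes` and the default threshold of 80 are assumptions, and the MEMORY% value is taken from the fifth column as in the example output above.

```shell
# Print nodes whose MEMORY% exceeds a threshold (first argument, default 80).
# Reads `oc adm top nodes` output (header line included) on stdin.
# Hypothetical helper; assumes MEMORY% is the fifth column.
top_memory_nodes() {
  awk -v limit="${1:-80}" '
    NR > 1 {
      pct = $5
      sub(/%/, "", pct)                      # drop the trailing percent sign
      if (pct + 0 > limit + 0) print $1, $5  # numeric comparison
    }
  '
}

# Usage against a live cluster (requires a logged-in oc client and installed metrics):
#   oc adm top nodes | top_memory_nodes 75
```

With the example output above and a threshold of 75, this reports the nodes at 82% and 77% memory usage.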