Verifying connectivity to an endpoint

The Cluster Network Operator (CNO) runs a controller, the connectivity check controller, that performs a connection health check between resources within your cluster. By reviewing the results of the health checks, you can diagnose connection problems or eliminate network connectivity as the cause of an issue that you are investigating.

Connection health checks performed

To verify that cluster resources are reachable, a TCP connection is made to each of the following cluster API services:

  • Kubernetes API server service

  • Kubernetes API server endpoints

  • OpenShift API server service

  • OpenShift API server endpoints

  • Load balancers

To verify that services and service endpoints are reachable on every node in the cluster, a TCP connection is made to each of the following targets:

  • Health check target service

  • Health check target endpoints

Implementation of connection health checks

The connectivity check controller orchestrates connection verification checks in your cluster. The results of the connection tests are stored in PodNetworkConnectivityCheck objects in the openshift-network-diagnostics namespace. Connection tests are performed every minute in parallel.

The Cluster Network Operator (CNO) deploys several resources to the cluster to send and receive connectivity health checks:

Health check source

This program is deployed in a single-pod replica set managed by a Deployment object. The program consumes PodNetworkConnectivityCheck objects and connects to the spec.targetEndpoint specified in each object.

Health check target

A pod deployed as part of a daemon set on every node in the cluster. The pod listens for inbound health checks. The presence of this pod on every node allows for the testing of connectivity to each node.
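
As a rough illustration of what a single check involves, the following Python sketch opens a TCP connection to a host:port target and records the outcome. It is not the controller's implementation: the function name tcp_check and the keys of the returned dictionary are hypothetical, chosen only to loosely mirror the connection log fields that PodNetworkConnectivityCheck objects report.

```python
import socket
import time


def tcp_check(target_endpoint: str, timeout: float = 10.0) -> dict:
    """Attempt one TCP connection to "host:port" and describe the result.

    Hypothetical helper: the real health check source is a separate program
    deployed by the Cluster Network Operator.
    """
    host, _, port = target_endpoint.rpartition(":")
    host = host.strip("[]")  # tolerate bracketed IPv6 literals such as [::1]:6443
    started = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, int(port)), timeout=timeout):
            success = True
            message = f"tcp connection to {target_endpoint} succeeded"
    except OSError as err:
        success = False
        message = f"failed to establish a TCP connection to {target_endpoint}: {err}"
    return {
        "success": success,
        "reason": "TCPConnect" if success else "TCPConnectError",
        "latency_seconds": time.monotonic() - started,
        "message": message,
    }
```

Calling tcp_check("10.0.0.4:6443") from a machine that can route to that address would return a dictionary whose success key indicates whether the handshake completed.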

PodNetworkConnectivityCheck object fields

The PodNetworkConnectivityCheck object fields are described in the following tables.

Table 1. PodNetworkConnectivityCheck object fields
| Field | Type | Description |
|---|---|---|
| metadata.name | string | The name of the object in the following format: <source>-to-<target>. The destination described by <target> includes one of the following strings: load-balancer-api-external, load-balancer-api-internal, kubernetes-apiserver-endpoint, kubernetes-apiserver-service-cluster, network-check-target, openshift-apiserver-endpoint, openshift-apiserver-service-cluster. |
| metadata.namespace | string | The namespace that the object is associated with. This value is always openshift-network-diagnostics. |
| spec.sourcePod | string | The name of the pod where the connection check originates, such as network-check-source-596b4c6566-rgh92. |
| spec.targetEndpoint | string | The target of the connection check, such as api.devcluster.example.com:6443. |
| spec.tlsClientCert | object | Configuration for the TLS certificate to use. |
| spec.tlsClientCert.name | string | The name of the TLS certificate used, if any. The default value is an empty string. |
| status | object | An object representing the condition of the connection test and logs of recent connection successes and failures. |
| status.conditions | array | The latest status of the connection check and any previous statuses. |
| status.failures | array | Connection test logs from unsuccessful attempts. |
| status.outages | array | Connection test logs covering the time periods of any outages. |
| status.successes | array | Connection test logs from successful attempts. |
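
The <source>-to-<target> format of metadata.name can be taken apart in a script, for example to group checks by the kind of target. The following Python sketch assumes the target portion begins with one of the strings listed for metadata.name; the function name split_check_name is hypothetical, and names whose target is not in that list are rejected rather than guessed at.

```python
# Target strings listed for metadata.name in this section.
KNOWN_TARGETS = (
    "load-balancer-api-external",
    "load-balancer-api-internal",
    "kubernetes-apiserver-endpoint",
    "kubernetes-apiserver-service-cluster",
    "network-check-target",
    "openshift-apiserver-endpoint",
    "openshift-apiserver-service-cluster",
)


def split_check_name(name: str) -> tuple:
    """Split "<source>-to-<target>" into (source, target_kind, full_target).

    Hypothetical helper. It searches for "-to-" followed by a known target
    string, so a "-to-" that happens to occur elsewhere in the name is ignored.
    """
    for kind in KNOWN_TARGETS:
        marker = "-to-" + kind
        index = name.find(marker)
        if index != -1:
            source = name[:index]
            full_target = name[index + len("-to-"):]
            return source, kind, full_target
    raise ValueError(f"no known target found in {name!r}")
```

For the endpoint objects shown in the procedure later in this section, full_target also carries the node name that follows the target string.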

The following table describes the fields for objects in the status.conditions array:

Table 2. status.conditions
| Field | Type | Description |
|---|---|---|
| lastTransitionTime | string | The time that the condition of the connection transitioned from one status to another. |
| message | string | Details about the last transition in a human-readable format. |
| reason | string | The last status of the transition in a machine-readable format. |
| status | string | The status of the condition. |
| type | string | The type of the condition. |
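
To show how the status.conditions fields might be read together, the following Python sketch looks up the condition of type Reachable in an object that has already been loaded into a dictionary, for example from JSON output. The function names are hypothetical. The Reachable type and the string value "True" are taken from the example output in this section; other condition types are not assumed.

```python
from typing import Optional


def reachable_condition(check: dict) -> Optional[dict]:
    """Return the condition whose type is "Reachable", or None if absent."""
    conditions = check.get("status", {}).get("conditions") or []
    for condition in conditions:
        if condition.get("type") == "Reachable":
            return condition
    return None


def is_reachable(check: dict) -> bool:
    """True only when a Reachable condition exists and its status is "True".

    The status field is a string, so it is compared with "True", not a boolean.
    """
    condition = reachable_condition(check)
    return condition is not None and condition.get("status") == "True"
```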

The following table describes the fields for objects in the status.outages array:

Table 3. status.outages
| Field | Type | Description |
|---|---|---|
| end | string | The timestamp from when the connection failure is resolved. |
| endLogs | array | Connection log entries, including the log entry related to the successful end of the outage. |
| message | string | A summary of outage details in a human-readable format. |
| start | string | The timestamp from when the connection failure is first detected. |
| startLogs | array | Connection log entries, including the original failure. |
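
The start and end fields make it possible to compute how long an outage lasted. The following Python sketch assumes both values are UTC timestamps in the form shown in the example output in this section, such as "2021-01-13T20:08:34Z". It also assumes that an outage with no end value has not yet been resolved, which this section does not state. The function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as "2021-01-13T20:08:34Z" as timezone-aware UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def outage_seconds(outage: dict) -> Optional[float]:
    """Return the outage length in seconds, or None if start or end is missing."""
    start = outage.get("start")
    end = outage.get("end")
    if not start or not end:
        return None  # assumption: treated as an outage that is still ongoing
    return (parse_timestamp(end) - parse_timestamp(start)).total_seconds()
```

Because the timestamps have one-second resolution, the result can differ slightly from the duration quoted in the outage message field.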

Connection log fields

The fields for a connection log entry are described in the following table. The object is used in the following fields:

  • status.failures[]

  • status.successes[]

  • status.outages[].startLogs[]

  • status.outages[].endLogs[]

Table 4. Connection log object
| Field | Type | Description |
|---|---|---|
| latency | string | Records the duration of the action. |
| message | string | Provides the status in a human-readable format. |
| reason | string | Provides the reason for the status in a machine-readable format. The value is one of TCPConnect, TCPConnectError, DNSResolve, DNSError. |
| success | boolean | Indicates if the log entry is a success or failure. |
| time | string | The start time of the connection check. |
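
Because latency is a string, it has to be converted before values can be compared or averaged. The following Python sketch assumes the strings follow the duration style seen in the example output in this section, such as 2.241775ms or 2m59.999789186s: one or more decimal numbers, each followed by a unit. The accepted unit list and the function name duration_seconds are assumptions, not part of the documented API.

```python
import re

# Seconds per unit; the unit list is an assumption based on the example values.
_UNIT_SECONDS = {
    "ns": 1e-9,
    "us": 1e-6,
    "µs": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
}
# Two-letter units are listed before "s" and "m" so that "ms" is not read as "m".
_TERM = r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)"


def duration_seconds(text: str) -> float:
    """Convert a duration string such as "2.241775ms" to seconds."""
    if not re.fullmatch(f"(?:{_TERM})+", text):
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(
        float(number) * _UNIT_SECONDS[unit]
        for number, unit in re.findall(_TERM, text)
    )
```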

Verifying network connectivity for an endpoint

As a cluster administrator, you can verify the connectivity of an endpoint, such as an API server, load balancer, service, or pod.

Prerequisites

  • Install the OpenShift CLI (oc).

  • Access to the cluster as a user with the cluster-admin role.

Procedure

  1. To list the current PodNetworkConnectivityCheck objects, enter the following command:

    $ oc get podnetworkconnectivitycheck -n openshift-network-diagnostics

    Example output

    NAME   AGE
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-1   73m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-2   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-kubernetes-apiserver-service-cluster   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-kubernetes-default-service-cluster   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-load-balancer-api-external   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-load-balancer-api-internal   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-network-check-target-ci-ln-x5sv9rb-f76d1-4rzrp-master-0   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-network-check-target-ci-ln-x5sv9rb-f76d1-4rzrp-master-1   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-network-check-target-ci-ln-x5sv9rb-f76d1-4rzrp-master-2   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-network-check-target-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh   74m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-network-check-target-ci-ln-x5sv9rb-f76d1-4rzrp-worker-c-n8mbf   74m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-network-check-target-ci-ln-x5sv9rb-f76d1-4rzrp-worker-d-4hnrz   74m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-network-check-target-service-cluster   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-openshift-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-openshift-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-1   75m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-openshift-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-2   74m
    network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-openshift-apiserver-service-cluster   75m
  2. View the connection test logs:

    1. From the output of the previous command, identify the endpoint that you want to review the connectivity logs for.

    2. To view the object, enter the following command:

      $ oc get podnetworkconnectivitycheck <name> \
          -n openshift-network-diagnostics -o yaml

      where <name> specifies the name of the PodNetworkConnectivityCheck object.

      Example output

      apiVersion: controlplane.operator.openshift.io/v1alpha1
      kind: PodNetworkConnectivityCheck
      metadata:
        name: network-check-source-ci-ln-x5sv9rb-f76d1-4rzrp-worker-b-6xdmh-to-kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0
        namespace: openshift-network-diagnostics
        ...
      spec:
        sourcePod: network-check-source-7c88f6d9f-hmg2f
        targetEndpoint: 10.0.0.4:6443
        tlsClientCert:
          name: ""
      status:
        conditions:
        - lastTransitionTime: "2021-01-13T20:11:34Z"
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnectSuccess
          status: "True"
          type: Reachable
        failures:
        - latency: 2.241775ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: failed
            to establish a TCP connection to 10.0.0.4:6443: dial tcp 10.0.0.4:6443: connect:
            connection refused'
          reason: TCPConnectError
          success: false
          time: "2021-01-13T20:10:34Z"
        - latency: 2.582129ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: failed
            to establish a TCP connection to 10.0.0.4:6443: dial tcp 10.0.0.4:6443: connect:
            connection refused'
          reason: TCPConnectError
          success: false
          time: "2021-01-13T20:09:34Z"
        - latency: 3.483578ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: failed
            to establish a TCP connection to 10.0.0.4:6443: dial tcp 10.0.0.4:6443: connect:
            connection refused'
          reason: TCPConnectError
          success: false
          time: "2021-01-13T20:08:34Z"
        outages:
        - end: "2021-01-13T20:11:34Z"
          endLogs:
          - latency: 2.032018ms
            message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0:
              tcp connection to 10.0.0.4:6443 succeeded'
            reason: TCPConnect
            success: true
            time: "2021-01-13T20:11:34Z"
          - latency: 2.241775ms
            message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0:
              failed to establish a TCP connection to 10.0.0.4:6443: dial tcp 10.0.0.4:6443:
              connect: connection refused'
            reason: TCPConnectError
            success: false
            time: "2021-01-13T20:10:34Z"
          - latency: 2.582129ms
            message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0:
              failed to establish a TCP connection to 10.0.0.4:6443: dial tcp 10.0.0.4:6443:
              connect: connection refused'
            reason: TCPConnectError
            success: false
            time: "2021-01-13T20:09:34Z"
          - latency: 3.483578ms
            message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0:
              failed to establish a TCP connection to 10.0.0.4:6443: dial tcp 10.0.0.4:6443:
              connect: connection refused'
            reason: TCPConnectError
            success: false
            time: "2021-01-13T20:08:34Z"
          message: Connectivity restored after 2m59.999789186s
          start: "2021-01-13T20:08:34Z"
          startLogs:
          - latency: 3.483578ms
            message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0:
              failed to establish a TCP connection to 10.0.0.4:6443: dial tcp 10.0.0.4:6443:
              connect: connection refused'
            reason: TCPConnectError
            success: false
            time: "2021-01-13T20:08:34Z"
        successes:
        - latency: 2.845865ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:14:34Z"
        - latency: 2.926345ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:13:34Z"
        - latency: 2.895796ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:12:34Z"
        - latency: 2.696844ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:11:34Z"
        - latency: 1.502064ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:10:34Z"
        - latency: 1.388857ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:09:34Z"
        - latency: 1.906383ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:08:34Z"
        - latency: 2.089073ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:07:34Z"
        - latency: 2.156994ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:06:34Z"
        - latency: 1.777043ms
          message: 'kubernetes-apiserver-endpoint-ci-ln-x5sv9rb-f76d1-4rzrp-master-0: tcp
            connection to 10.0.0.4:6443 succeeded'
          reason: TCPConnect
          success: true
          time: "2021-01-13T21:05:34Z"
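
When many objects exist, reviewing them one at a time can be slow. The following Python sketch shows one way the procedure might be scripted: it requests all PodNetworkConnectivityCheck objects as JSON with the same oc get command used above, then ranks them by the number of recorded failures and outages. The function names are hypothetical, and the script assumes that oc is on the PATH and already logged in to the cluster.

```python
import json
import subprocess


def summarize_checks(check_list: dict) -> list:
    """Build one summary row per object, with the most troubled checks first."""
    rows = []
    for item in check_list.get("items", []):
        status = item.get("status") or {}
        rows.append({
            "name": item["metadata"]["name"],
            "target": (item.get("spec") or {}).get("targetEndpoint", ""),
            "failures": len(status.get("failures") or []),
            "outages": len(status.get("outages") or []),
        })
    rows.sort(key=lambda row: (row["failures"], row["outages"]), reverse=True)
    return rows


def fetch_checks() -> dict:
    """Run oc and return the parsed list of PodNetworkConnectivityCheck objects."""
    result = subprocess.run(
        ["oc", "get", "podnetworkconnectivitycheck",
         "-n", "openshift-network-diagnostics", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def main() -> None:
    """Print failure and outage counts for every check in the cluster."""
    for row in summarize_checks(fetch_checks()):
        print(f'{row["failures"]:>3} failures  {row["outages"]:>3} outages  '
              f'{row["target"]}  {row["name"]}')
```

Calling main() prints one line per object. The counts reflect only the log entries currently retained in each object, so they describe recent history rather than the full lifetime of the check.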