Environment health checks

This topic contains steps to verify the overall health of the OKD cluster and its components, and describes the intended behavior.

Knowing the verification process for the various components is the first step to troubleshooting issues. If you experience issues, you can use the checks provided in this section to diagnose any problems.

Checking complete environment health

To verify the end-to-end functionality of an OKD cluster, build and deploy an example application.

Procedure
  1. Create a new project named validate, as well as an example application from the cakephp-mysql-example template:

    $ oc new-project validate
    $ oc new-app cakephp-mysql-example

    You can check the logs to follow the build:

    $ oc logs -f bc/cakephp-mysql-example
  2. Once the build is complete, two pods should be running: a database and an application:

    $ oc get pods
    NAME                            READY     STATUS      RESTARTS   AGE
    cakephp-mysql-example-1-build   0/1       Completed   0          1m
    cakephp-mysql-example-2-247xm   1/1       Running     0          39s
    mysql-1-hbk46                   1/1       Running     0          1m
  3. Visit the application URL. The CakePHP framework welcome page should be visible. The URL has the format cakephp-mysql-example-validate.<app_domain>.

  4. Once the functionality has been verified, the validate project can be deleted:

    $ oc delete project validate

    All resources within the project will be deleted as well.
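The wait-for-pods step in the procedure above can be sketched as a small filter. This is a hedged sketch, not part of the product: the `all_pods_running` helper and the inlined sample listing are illustrative, and on a live cluster you would pipe the real `oc get pods` output into the same filter.

```shell
# Hypothetical helper: succeed only when every non-build pod is Running.
# Reads an `oc get pods` listing on stdin; assumes the default columns
# NAME READY STATUS RESTARTS AGE shown in the procedure above.
all_pods_running() {
  awk 'NR > 1 && $3 != "Completed" && $3 != "Running" { bad++ }
       END { exit (bad > 0) }'
}

# Sample listing copied from the procedure; replace with `oc get pods`.
sample='NAME READY STATUS RESTARTS AGE
cakephp-mysql-example-1-build 0/1 Completed 0 1m
cakephp-mysql-example-2-247xm 1/1 Running 0 39s
mysql-1-hbk46 1/1 Running 0 1m'

if printf '%s\n' "$sample" | all_pods_running; then
  echo "all pods running"
else
  echo "some pods not ready"
fi
```

The same filter works in a polling loop while waiting for the deployment to settle.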

Creating alerts using Prometheus

You can integrate OKD with Prometheus to create visuals and alerts that help you detect environment issues before they become critical, such as a node going down or a pod consuming too much CPU or memory.

See the Prometheus on OpenShift Container Platform section in the Installation and configuration guide for more information.

Host health

To verify that the cluster is up and running, connect to a master instance, and run the following:

  $ oc get nodes
  NAME                  STATUS    AGE       VERSION
  ocp-infra-node-1clj   Ready     1h        v1.6.1+5115d708d7
  ocp-infra-node-86qr   Ready     1h        v1.6.1+5115d708d7
  ocp-infra-node-g8qw   Ready     1h        v1.6.1+5115d708d7
  ocp-master-94zd       Ready     1h        v1.6.1+5115d708d7
  ocp-master-gjkm       Ready     1h        v1.6.1+5115d708d7
  ocp-master-wc8w       Ready     1h        v1.6.1+5115d708d7
  ocp-node-c5dg         Ready     1h        v1.6.1+5115d708d7
  ocp-node-ghxn         Ready     1h        v1.6.1+5115d708d7
  ocp-node-w135         Ready     1h        v1.6.1+5115d708d7

The cluster in this example consists of three master hosts, three infrastructure node hosts, and three node hosts, all of which are running. All hosts in the cluster should be visible in this output.

The Ready status means that master hosts can communicate with node hosts and that the nodes are ready to run pods (except nodes on which scheduling is disabled).

Before you run etcd commands, source the etcd.conf file:

  # source /etc/etcd/etcd.conf

You can check the basic etcd health status from any master instance with the etcdctl command:

  # etcdctl --cert-file=$ETCD_PEER_CERT_FILE --key-file=$ETCD_PEER_KEY_FILE \
    --ca-file=/etc/etcd/ca.crt --endpoints=$ETCD_LISTEN_CLIENT_URLS cluster-health
  member 59df5107484b84df is healthy: got healthy result from https://10.156.0.5:2379
  member 6df7221a03f65299 is healthy: got healthy result from https://10.156.0.6:2379
  member fea6dfedf3eecfa3 is healthy: got healthy result from https://10.156.0.9:2379
  cluster is healthy

To get more information about etcd hosts, including the associated master host, run:

  # etcdctl --cert-file=$ETCD_PEER_CERT_FILE --key-file=$ETCD_PEER_KEY_FILE \
    --ca-file=/etc/etcd/ca.crt --endpoints=$ETCD_LISTEN_CLIENT_URLS member list
  295750b7103123e0: name=ocp-master-zh8d peerURLs=https://10.156.0.7:2380 clientURLs=https://10.156.0.7:2379 isLeader=true
  b097a72f2610aea5: name=ocp-master-qcg3 peerURLs=https://10.156.0.11:2380 clientURLs=https://10.156.0.11:2379 isLeader=false
  fea6dfedf3eecfa3: name=ocp-master-j338 peerURLs=https://10.156.0.9:2380 clientURLs=https://10.156.0.9:2379 isLeader=false

All etcd hosts should contain the master host name if the etcd cluster is co-located with master services, or all etcd instances should be visible if etcd is running separately.

etcdctl2 is an alias for the etcdctl tool that contains the proper flags to query the etcd cluster in the v2 data model; similarly, etcdctl3 queries the v3 data model.

Router and registry health

To check if a router service is running:

  $ oc -n default get deploymentconfigs/router
  NAME      REVISION   DESIRED   CURRENT   TRIGGERED BY
  router    1          3         3         config

The values in the DESIRED and CURRENT columns should match the number of node hosts.

Use the same command to check the registry status:

  $ oc -n default get deploymentconfigs/docker-registry
  NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
  docker-registry   1          3         3         config

Multiple running instances of the container image registry require backend storage that supports writes by multiple processes. If the chosen infrastructure provider does not offer this capability, running a single instance of the container image registry is acceptable.

To verify that all pods are running and on which hosts:

  $ oc -n default get pods -o wide
  NAME                       READY     STATUS    RESTARTS   AGE       IP            NODE
  docker-registry-1-54nhl    1/1       Running   0          2d        172.16.2.3    ocp-infra-node-tl47
  docker-registry-1-jsm2t    1/1       Running   0          2d        172.16.8.2    ocp-infra-node-62rc
  docker-registry-1-qbt4g    1/1       Running   0          2d        172.16.14.3   ocp-infra-node-xrtz
  registry-console-2-gbhcz   1/1       Running   0          2d        172.16.8.4    ocp-infra-node-62rc
  router-1-6zhf8             1/1       Running   0          2d        10.156.0.4    ocp-infra-node-62rc
  router-1-ffq4g             1/1       Running   0          2d        10.156.0.10   ocp-infra-node-tl47
  router-1-zqxbl             1/1       Running   0          2d        10.156.0.8    ocp-infra-node-xrtz

If OKD is using an external container image registry, the internal registry service does not need to be running.

Network connectivity

Network connectivity involves two main networking layers: the cluster network for node interaction, and the software-defined network (SDN) for pod interaction. OKD supports multiple network configurations, often optimized for a specific infrastructure provider.

Due to the complexity of networking, not all verification scenarios are covered in this section.

Connectivity on master hosts

etcd and master hosts

Master services keep their state synchronized using the etcd key-value store. Communication between master and etcd services is important, whether those etcd services are co-located on master hosts or running on hosts designated only for the etcd service. This communication happens on TCP ports 2379 and 2380. See the Host health section for methods to check this communication.

SkyDNS

SkyDNS provides name resolution for local services running in OKD. This service uses TCP and UDP port 8053.

To verify the name resolution:

  $ dig +short docker-registry.default.svc.cluster.local
  172.30.150.7

If the answer matches the CLUSTER-IP in the following output, the SkyDNS service is working correctly:

  $ oc get svc/docker-registry -n default
  NAME              CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
  docker-registry   172.30.150.7   <none>        5000/TCP   3d
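The comparison above can be scripted. A hedged sketch: the two values below reuse the example output; on a live cluster they would come from `dig +short` and from a standard Kubernetes jsonpath query (`-o jsonpath='{.spec.clusterIP}'`), which is assumed here rather than taken from this topic.

```shell
# Values copied from the example output above. On a cluster, populate with:
#   dns_answer=$(dig +short docker-registry.default.svc.cluster.local)
#   cluster_ip=$(oc get svc/docker-registry -n default -o jsonpath='{.spec.clusterIP}')
dns_answer="172.30.150.7"
cluster_ip="172.30.150.7"

if [ "$dns_answer" = "$cluster_ip" ]; then
  echo "SkyDNS OK: $dns_answer matches the service CLUSTER-IP"
else
  echo "SkyDNS mismatch: DNS returned '$dns_answer', service has '$cluster_ip'"
fi
```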

API service and web console

Both the API service and web console share the same port, usually TCP 8443 or 443, depending on the setup. This port needs to be available within the cluster and to everyone who needs to work with the deployed environment. The URLs under which this port is reachable may differ for internal cluster and for external clients.

In the following example, the https://internal-master.example.com:443 URL is used by the internal cluster, and the https://master.example.com:443 URL is used by external clients. On any node host:

  $ curl -k https://internal-master.example.com:443/version
  {
    "major": "1",
    "minor": "6",
    "gitVersion": "v1.6.1+5115d708d7",
    "gitCommit": "fff65cf",
    "gitTreeState": "clean",
    "buildDate": "2017-10-11T22:44:25Z",
    "goVersion": "go1.7.6",
    "compiler": "gc",
    "platform": "linux/amd64"
  }

This must be reachable from the client’s network:

  $ curl -k https://master.example.com:443/healthz
  ok
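Both checks above can be folded into one loop. A hedged sketch: the `fetch` function is a hypothetical stub so the loop runs anywhere; on a real host you would replace its body with `curl -sk "$1"`.

```shell
# Hypothetical stub standing in for `curl -sk "$1"` so the loop is
# runnable without a cluster; both URLs are the example hostnames above.
fetch() {
  echo "ok"
}

for url in https://internal-master.example.com:443/healthz \
           https://master.example.com:443/healthz; do
  if [ "$(fetch "$url")" = "ok" ]; then
    echo "$url: healthy"
  else
    echo "$url: unhealthy"
  fi
done
```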

Connectivity on node instances

The SDN connecting pod communication on nodes uses UDP port 4789 by default.

To verify node host functionality, create a new application. The following example ensures the node can reach the container image registry, which is running on an infrastructure node:

Procedure
  1. Create a new project:

    $ oc new-project sdn-test
  2. Deploy an httpd application:

    $ oc new-app centos/httpd-24-centos7~https://github.com/sclorg/httpd-ex

    Wait until the build is complete:

    $ oc get pods
    NAME               READY     STATUS      RESTARTS   AGE
    httpd-ex-1-205hz   1/1       Running     0          34s
    httpd-ex-1-build   0/1       Completed   0          1m
  3. Connect to the running pod:

    $ oc rsh po/<pod-name>

    For example:

    $ oc rsh po/httpd-ex-1-205hz
  4. Check the healthz path of the internal registry service:

    $ curl -kv https://docker-registry.default.svc.cluster.local:5000/healthz
    * About to connect() to docker-registry.default.svc.cluster.local port 5000 (#0)
    *   Trying 172.30.150.7...
    * Connected to docker-registry.default.svc.cluster.local (172.30.150.7) port 5000 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    * skipping SSL peer certificate verification
    * SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    * Server certificate:
    *   subject: CN=172.30.150.7
    *   start date: Nov 30 17:21:51 2017 GMT
    *   expire date: Nov 30 17:21:52 2019 GMT
    *   common name: 172.30.150.7
    *   issuer: CN=openshift-signer@1512059618
    > GET /healthz HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: docker-registry.default.svc.cluster.local:5000
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Cache-Control: no-cache
    < Date: Mon, 04 Dec 2017 16:26:49 GMT
    < Content-Length: 0
    < Content-Type: text/plain; charset=utf-8
    <
    * Connection #0 to host docker-registry.default.svc.cluster.local left intact
    sh-4.2$ exit

    The HTTP/1.1 200 OK response means the node is correctly connecting.

  5. Clean up the test project:

    $ oc delete project sdn-test
    project "sdn-test" deleted
  6. The node host listens on TCP port 10250. This port must be reachable by all master hosts on every node, and if monitoring is deployed in the cluster, the infrastructure nodes must also have access to this port on all instances. Broken communication on this port can be detected with the following command:

    $ oc get nodes
    NAME                  STATUS                     AGE       VERSION
    ocp-infra-node-1clj   Ready                      4d        v1.6.1+5115d708d7
    ocp-infra-node-86qr   Ready                      4d        v1.6.1+5115d708d7
    ocp-infra-node-g8qw   Ready                      4d        v1.6.1+5115d708d7
    ocp-master-94zd       Ready,SchedulingDisabled   4d        v1.6.1+5115d708d7
    ocp-master-gjkm       Ready,SchedulingDisabled   4d        v1.6.1+5115d708d7
    ocp-master-wc8w       Ready,SchedulingDisabled   4d        v1.6.1+5115d708d7
    ocp-node-c5dg         Ready                      4d        v1.6.1+5115d708d7
    ocp-node-ghxn         Ready                      4d        v1.6.1+5115d708d7
    ocp-node-w135         NotReady                   4d        v1.6.1+5115d708d7

    In the output above, the node service on the ocp-node-w135 node is not reachable by the master services, which is represented by its NotReady status.
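    The detection described above can be automated with a short filter. A hedged sketch: the `not_ready` helper is illustrative, the sample reuses rows from the listing above, and on a live cluster you would pipe `oc get nodes` into the same filter.

```shell
# Print the names of nodes whose STATUS does not begin with Ready
# (a status of Ready,SchedulingDisabled still counts as Ready).
not_ready() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Sample rows taken from the listing above; replace with `oc get nodes`.
sample='NAME STATUS AGE VERSION
ocp-master-94zd Ready,SchedulingDisabled 4d v1.6.1+5115d708d7
ocp-node-ghxn Ready 4d v1.6.1+5115d708d7
ocp-node-w135 NotReady 4d v1.6.1+5115d708d7'

printf '%s\n' "$sample" | not_ready
```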

  7. The last service is the router, which is responsible for routing connections to the correct services running in the OKD cluster. Routers listen on TCP ports 80 and 443 on infrastructure nodes for ingress traffic. Before routers can start working, DNS must be configured:

    $ dig *.apps.example.com

    ; <<>> DiG 9.11.1-P3-RedHat-9.11.1-8.P3.fc27 <<>> *.apps.example.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45790
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;*.apps.example.com.            IN      A

    ;; ANSWER SECTION:
    *.apps.example.com.     3571    IN      CNAME   apps.example.com.
    apps.example.com.       3561    IN      A       35.xx.xx.92

    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#53(127.0.0.1)
    ;; WHEN: Tue Dec 05 16:03:52 CET 2017
    ;; MSG SIZE  rcvd: 105

    The IP address, in this case 35.xx.xx.92, should be pointing to the load balancer distributing ingress traffic to all infrastructure nodes. To verify the functionality of the routers, check the registry service once more, but this time from outside the cluster:

    $ curl -kv https://docker-registry-default.apps.example.com/healthz
    * Trying 35.xx.xx.92...
    * TCP_NODELAY set
    * Connected to docker-registry-default.apps.example.com (35.xx.xx.92) port 443 (#0)
    ...
    < HTTP/2 200
    < cache-control: no-cache
    < content-type: text/plain; charset=utf-8
    < content-length: 0
    < date: Tue, 05 Dec 2017 15:13:27 GMT
    <
    * Connection #0 to host docker-registry-default.apps.example.com left intact

Storage

Master instances need at least 40 GB of hard disk space for the /var directory. Check the disk usage of a master host using the df command:

  $ df -hT
  Filesystem     Type       Size  Used Avail Use% Mounted on
  /dev/sda1      xfs         45G  2.8G   43G   7% /
  devtmpfs       devtmpfs   3.6G     0  3.6G   0% /dev
  tmpfs          tmpfs      3.6G     0  3.6G   0% /dev/shm
  tmpfs          tmpfs      3.6G   63M  3.6G   2% /run
  tmpfs          tmpfs      3.6G     0  3.6G   0% /sys/fs/cgroup
  tmpfs          tmpfs      732M     0  732M   0% /run/user/1000
  tmpfs          tmpfs      732M     0  732M   0% /run/user/0

Node instances need at least 15 GB space for the /var directory, and at least another 15 GB for Docker storage (/var/lib/docker in this case). Depending on the size of the cluster and the amount of ephemeral storage desired for pods, a separate partition should be created for /var/lib/origin/openshift.local.volumes on the nodes.

  $ df -hT
  Filesystem     Type       Size  Used Avail Use% Mounted on
  /dev/sda1      xfs         25G  2.4G   23G  10% /
  devtmpfs       devtmpfs   3.6G     0  3.6G   0% /dev
  tmpfs          tmpfs      3.6G     0  3.6G   0% /dev/shm
  tmpfs          tmpfs      3.6G  147M  3.5G   4% /run
  tmpfs          tmpfs      3.6G     0  3.6G   0% /sys/fs/cgroup
  /dev/sdb       xfs         25G  2.7G   23G  11% /var/lib/docker
  /dev/sdc       xfs         50G   33M   50G   1% /var/lib/origin/openshift.local.volumes
  tmpfs          tmpfs      732M     0  732M   0% /run/user/1000
Persistent storage for pods should be handled outside of the instances running the OKD cluster. Persistent volumes for pods can be provisioned by the infrastructure provider, or with the use of container native storage or container ready storage.

Docker storage

Docker storage can be backed by one of two options: a thin-pool logical volume with device mapper or, since Red Hat Enterprise Linux 7.4, an overlay2 file system. The overlay2 file system is generally recommended due to its ease of setup and increased performance.

The Docker storage disk is mounted as /var/lib/docker and formatted with the xfs file system. In this example, Docker storage is configured to use the overlay2 file system:

  $ cat /etc/sysconfig/docker-storage
  DOCKER_STORAGE_OPTIONS='--storage-driver overlay2'

To verify this storage driver is used by Docker:

  # docker info
  Containers: 4
   Running: 4
   Paused: 0
   Stopped: 0
  Images: 4
  Server Version: 1.12.6
  Storage Driver: overlay2
   Backing Filesystem: xfs
  Logging Driver: journald
  Cgroup Driver: systemd
  Plugins:
   Volume: local
   Network: overlay host bridge null
   Authorization: rhel-push-plugin
  Swarm: inactive
  Runtimes: docker-runc runc
  Default Runtime: docker-runc
  Security Options: seccomp selinux
  Kernel Version: 3.10.0-693.11.1.el7.x86_64
  Operating System: Employee SKU
  OSType: linux
  Architecture: x86_64
  Number of Docker Hooks: 3
  CPUs: 2
  Total Memory: 7.147 GiB
  Name: ocp-infra-node-1clj
  ID: T7T6:IQTG:WTUX:7BRU:5FI4:XUL5:PAAM:4SLW:NWKL:WU2V:NQOW:JPHC
  Docker Root Dir: /var/lib/docker
  Debug Mode (client): false
  Debug Mode (server): false
  Registry: https://registry.redhat.io/v1/
  WARNING: bridge-nf-call-iptables is disabled
  WARNING: bridge-nf-call-ip6tables is disabled
  Insecure Registries:
   127.0.0.0/8
  Registries: registry.redhat.io (secure), registry.redhat.io (secure), docker.io (secure)

API service status

The OpenShift API service runs on all master instances. To see the status of the service, view the master-api pods in the kube-system project:

  $ oc get pod -n kube-system -l openshift.io/component=api
  NAME                      READY     STATUS    RESTARTS   AGE
  master-api-myserver.com   1/1       Running   0          56d

The API service exposes a health check, which can be queried externally using the API host name. Both the API service and web console share the same port, usually TCP 8443 or 443, depending on the setup. This port needs to be available within the cluster and to everyone who needs to work with the deployed environment:

  $ oc get pod -n kube-system -o wide
  NAME                      READY     STATUS    RESTARTS   AGE       IP            NODE
  master-api-myserver.com   1/1       Running   0          7h        10.240.0.16   myserver.com

  $ curl -k https://myserver.com:443/healthz (1)
  ok

(1) This must be reachable from the client’s network. The web console port in this example is 443. Specify the value set for openshift_master_console_port in the host inventory file prior to OKD deployment. If openshift_master_console_port is not included in the inventory file, port 8443 is used by default.

Controller role verification

The OKD controller service is available across all master hosts. The service runs in active/passive mode, meaning it should be running on only one master at any time.

The OKD controllers execute a procedure to choose which host runs the service. The current running value is stored in an annotation on a special configmap in the kube-system project.

Verify the master host running the controller service as a cluster-admin user:

  $ oc get -n kube-system cm openshift-master-controllers -o yaml
  apiVersion: v1
  kind: ConfigMap
  metadata:
    annotations:
      control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master-ose-master-0.example.com-10.19.115.212-dnwrtcl4","leaseDurationSeconds":15,"acquireTime":"2018-02-17T18:16:54Z","renewTime":"2018-02-19T13:50:33Z","leaderTransitions":16}'
    creationTimestamp: 2018-02-02T10:30:04Z
    name: openshift-master-controllers
    namespace: kube-system
    resourceVersion: "17349662"
    selfLink: /api/v1/namespaces/kube-system/configmaps/openshift-master-controllers
    uid: 08636843-0804-11e8-8580-fa163eb934f0

The command outputs the current master controller in the control-plane.alpha.kubernetes.io/leader annotation, within the holderIdentity property, in the following format:

  master-<hostname>-<ip>-<8_random_characters>

Find the hostname of the master host by filtering the output with the following command:

  $ oc get -n kube-system cm openshift-master-controllers -o json | jq -r '.metadata.annotations[] | fromjson.holderIdentity | match("^master-(.*)-[0-9.]*-[0-9a-z]{8}$") | .captures[0].string'
  ose-master-0.example.com
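If jq is unavailable, the hostname can be recovered with sed. A hedged sketch: the sample value reuses the holderIdentity from the configmap above, and stripping the prefix and suffix in two separate substitutions avoids ambiguity when the hostname itself contains hyphens and digits.

```shell
# holderIdentity value copied from the configmap example above; on a
# cluster, pull it out of the leader annotation shown in the YAML output.
holder="master-ose-master-0.example.com-10.19.115.212-dnwrtcl4"

# Remove the leading "master-", then the trailing "-<ip>-<8 random chars>".
hostname=$(printf '%s\n' "$holder" | sed -E 's/^master-//; s/-[0-9.]+-[0-9a-z]{8}$//')
echo "$hostname"
```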

Verifying correct Maximum Transmission Unit (MTU) size

Verifying the maximum transmission unit (MTU) prevents a possible networking misconfiguration that can masquerade as an SSL certificate issue.

When a packet larger than the MTU size is transmitted over HTTP, the physical network router can break the packet into multiple packets to transmit the data. However, when a packet larger than the MTU size is transmitted over HTTPS, the router is forced to drop the packet.

Installation produces certificates that provide secure connections to multiple components, including:

  • master hosts

  • node hosts

  • infrastructure nodes

  • registry

  • router

These certificates can be found within the /etc/origin/master directory for the master nodes and /etc/origin/node directory for the infra and app nodes.

After installation, you can verify connectivity to the REGISTRY_OPENSHIFT_SERVER_ADDR using the process outlined in the Network connectivity section.

Procedure
  1. From a master host, get the HTTPS address:

    $ oc -n default get dc docker-registry -o jsonpath='{.spec.template.spec.containers[].env[?(@.name=="REGISTRY_OPENSHIFT_SERVER_ADDR")].value}{"\n"}'
    docker-registry.default.svc:5000

    The command returns docker-registry.default.svc:5000.

  2. Append /healthz to the value obtained above and use it to check connectivity on all hosts (master, infrastructure, node):

    $ curl -v https://docker-registry.default.svc:5000/healthz
    * About to connect() to docker-registry.default.svc port 5000 (#0)
    *   Trying 172.30.11.171...
    * Connected to docker-registry.default.svc (172.30.11.171) port 5000 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    *   CAfile: /etc/pki/tls/certs/ca-bundle.crt
        CApath: none
    * SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    * Server certificate:
    *   subject: CN=172.30.11.171
    *   start date: Oct 18 05:30:10 2017 GMT
    *   expire date: Oct 18 05:30:11 2019 GMT
    *   common name: 172.30.11.171
    *   issuer: CN=openshift-signer@1508303629
    > GET /healthz HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: docker-registry.default.svc:5000
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Cache-Control: no-cache
    < Date: Tue, 24 Oct 2017 19:42:35 GMT
    < Content-Length: 0
    < Content-Type: text/plain; charset=utf-8
    <
    * Connection #0 to host docker-registry.default.svc left intact

    The example output above shows a connection with a correctly configured MTU size: the connection attempt succeeds, NSS initializes with the certpath, and the server certificate information for the docker-registry is displayed.

    An improper MTU size results in a timeout:

    $ curl -v https://docker-registry.default.svc:5000/healthz
    * About to connect() to docker-registry.default.svc port 5000 (#0)
    *   Trying 172.30.11.171...
    * Connected to docker-registry.default.svc (172.30.11.171) port 5000 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb

    The above example shows that the connection is established, but it cannot finish initializing NSS with the certpath. The likely cause is an improper MTU size set in the node configuration map.

    To fix this issue, set the MTU size in the node configuration map to 50 bytes smaller than the MTU of the Ethernet device that the OpenShift SDN uses.

  3. View the MTU size of the desired Ethernet device (for example, eth0):

    $ ip link show eth0
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
        link/ether fa:16:3e:92:6a:86 brd ff:ff:ff:ff:ff:ff

    The above output shows the MTU set to 1500.

  4. To change the MTU size, modify the appropriate node configuration map and set a value that is 50 bytes smaller than the value reported by the ip command.

    For example, if the MTU size is set to 1500, adjust it to 1450 within the node configuration map:

    networkConfig:
      mtu: 1450
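The 50-byte subtraction in the step above can be sketched as follows. The sample line reuses the `ip link show eth0` output; on a live host you would substitute the real command, as noted in the comments.

```shell
# Sample line copied from the `ip link show eth0` output above; on a
# host, use:  link_line=$(ip link show eth0 | head -n 1)
link_line='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000'

# Pick the value that follows the "mtu" keyword, then subtract 50 bytes.
device_mtu=$(printf '%s\n' "$link_line" | awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") print $(i + 1) }')
sdn_mtu=$((device_mtu - 50))
echo "device mtu: $device_mtu -> networkConfig mtu: $sdn_mtu"
```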
  5. Save the changes and reboot the node:

    You must change the MTU size on all masters and nodes that are part of the OKD SDN. Also, the MTU size of the tun0 interface must be the same across all nodes that are part of the cluster.

  6. Once the node is back online, confirm the issue no longer exists by re-running the original curl command:

    $ curl -v https://docker-registry.default.svc:5000/healthz

    If the timeout persists, continue to adjust the MTU size in increments of 50 bytes and repeat the process.