This document provides prescriptive guidance for hardening a production installation of an RKE cluster to be used with Rancher v2.5.4. It outlines the configurations and controls required to address Kubernetes benchmark controls from the Center for Internet Security (CIS).

This hardening guide describes how to secure the nodes in your cluster. Following it before installing Kubernetes is recommended.

This hardening guide is intended for RKE clusters and applies to specific versions of the CIS Kubernetes Benchmark, Kubernetes, and Rancher:

Rancher Version    CIS Benchmark Version    Kubernetes Version
Rancher v2.5.4     Benchmark 1.6            Kubernetes v1.18


Overview

This document provides prescriptive guidance for hardening an RKE cluster to be used for installing Rancher v2.5.4 with Kubernetes v1.18, or for provisioning an RKE cluster with Kubernetes v1.18 to be used within Rancher v2.5.4. It outlines the configurations required to address Kubernetes benchmark controls from the Center for Internet Security (CIS).

For more detail about evaluating a hardened cluster against the official CIS benchmark, refer to the CIS 1.6 Benchmark - Self-Assessment Guide - Rancher v2.5.4.

Known Issues

  • Rancher exec shell and view logs for pods are not functional in a CIS 1.6 hardened setup when only a public IP is provided when registering custom nodes. This functionality requires a private IP to be provided when registering the custom nodes.
  • When default_pod_security_policy_template_id is set to restricted, Rancher creates RoleBindings and ClusterRoleBindings on the default service accounts. CIS 1.6 check 5.1.5 requires that the default service accounts have no roles or cluster roles bound to them apart from the defaults. In addition, the default service accounts should be configured so that they do not provide a service account token and do not have any explicit rights assignments.
  • Migrating Rancher from 2.4 to 2.5: addons were removed in hardening guide (HG) 2.5, so namespaces may not be created on the downstream clusters during migration. Pods may fail to run because of missing namespaces such as ingress-nginx and cattle-system.

Configure Kernel Runtime Parameters

The following sysctl configuration is recommended for all node types in the cluster. Set the following parameters in /etc/sysctl.d/90-kubelet.conf:

    vm.overcommit_memory=1
    vm.panic_on_oom=0
    kernel.panic=10
    kernel.panic_on_oops=1
    kernel.keys.root_maxbytes=25000000

Run sysctl -p /etc/sysctl.d/90-kubelet.conf to enable the settings.
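After the settings are applied, the live kernel values can be compared against the file as a quick check. The following is a minimal sketch rather than part of the official procedure: the function name check_sysctl_conf is hypothetical, and it assumes the file contains only key=value lines without surrounding spaces and that the values are exposed under /proc/sys.

```shell
#!/bin/bash
# Hypothetical helper: compare the key=value lines of a sysctl conf file
# with the values currently exposed by the kernel.
# $1 = conf file, $2 = root of the sysctl tree (assumed to be /proc/sys).
check_sysctl_conf() {
  local conf="$1" root="${2:-/proc/sys}" status=0 key want have path
  while IFS='=' read -r key want || [ -n "$key" ]; do
    # Skip blank lines and comment lines.
    case "$key" in ''|'#'*) continue ;; esac
    # vm.overcommit_memory maps to <root>/vm/overcommit_memory.
    path="${root}/$(printf '%s' "$key" | tr . /)"
    have="$(cat "$path" 2>/dev/null || true)"
    if [ "$have" != "$want" ]; then
      echo "MISMATCH ${key}: expected ${want}, found ${have:-<unset>}"
      status=1
    fi
  done < "$conf"
  return "$status"
}
```

For example, check_sysctl_conf /etc/sysctl.d/90-kubelet.conf prints one line per mismatched parameter and returns a non-zero status if any is found.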

Configure etcd user and group

A user account and group for the etcd service must be set up before installing RKE. The uid and gid of the etcd user are used in the RKE config.yml to set the proper permissions for files and directories at installation time.

Create etcd user and group

To create the etcd user and group, run the following console commands.

The commands below use 52034 for the uid and gid as an example. Any valid unused uid or gid can be used in place of 52034.

    groupadd --gid 52034 etcd
    useradd --comment "etcd service account" --uid 52034 --gid 52034 etcd
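Before settling on a uid and gid, it can help to confirm that the chosen number is not already assigned. The following is a minimal sketch under stated assumptions: id_in_use is a hypothetical helper name, and it only inspects local account files such as /etc/passwd and /etc/group, so accounts served by a directory service such as LDAP are not seen.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the numeric ID in $1 appears in field 3
# of the colon-separated file $2 (/etc/passwd for uids, /etc/group for gids).
id_in_use() {
  awk -F: -v id="$1" '$3 == id { found = 1 } END { exit !found }' "$2"
}
```

For example, id_in_use 52034 /etc/passwd returns success when uid 52034 is already taken, in which case a different number should be chosen.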

Update the RKE config.yml with the uid and gid of the etcd user:

    services:
      etcd:
        gid: 52034
        uid: 52034
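To keep the values in config.yml in step with the account that actually exists on the node, the fragment can be generated from the numeric IDs instead of typed by hand. This is only an illustrative sketch; etcd_ids_yaml is a hypothetical helper name, and it prints the fragment rather than editing config.yml.

```shell
#!/bin/bash
# Hypothetical helper: print the services.etcd fragment for the
# numeric uid ($1) and gid ($2) passed in.
etcd_ids_yaml() {
  printf 'services:\n  etcd:\n    gid: %s\n    uid: %s\n' "$2" "$1"
}
```

On a node where the etcd user exists, etcd_ids_yaml "$(id -u etcd)" "$(id -g etcd)" prints the fragment with that account's IDs.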

Set automountServiceAccountToken to false for default service accounts

Kubernetes provides a default service account which is used by cluster workloads where no specific service account is assigned to the pod. Where access to the Kubernetes API from a pod is required, a specific service account should be created for that pod, and rights granted to that service account. The default service account should be configured such that it does not provide a service account token and does not have any explicit rights assignments.

For each namespace, including default and kube-system on a standard RKE install, the default service account must include this value:

    automountServiceAccountToken: false

Save the following YAML to a file called account_update.yaml:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: default
    automountServiceAccountToken: false

Create a bash script file called account_update.sh. Be sure to chmod +x account_update.sh so the script has execute permissions.

    #!/bin/bash -e
    # Patch the default service account in every namespace.
    for namespace in $(kubectl get namespaces -A -o json | jq -r '.items[].metadata.name'); do
      kubectl patch serviceaccount default -n "${namespace}" -p "$(cat account_update.yaml)"
    done
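Once the script has run, it may be worth confirming that no default service account was missed. The sketch below is an assumption-laden illustration, not an official check: both function names are hypothetical, and it relies on kubectl printing <none> in custom-columns output when automountServiceAccountToken is not set.

```shell
#!/bin/bash
# Hypothetical helper: read "<namespace> <automount value>" lines on stdin
# and print every namespace whose value is anything other than "false".
unpatched_namespaces() {
  awk '$2 != "false" { print $1 }'
}

# Assumed kubectl invocation producing that input: one line per default
# service account, with "<none>" where the field is not set.
list_default_service_accounts() {
  kubectl get serviceaccounts --all-namespaces \
    --field-selector metadata.name=default \
    --no-headers \
    -o custom-columns='NS:.metadata.namespace,AUTOMOUNT:.automountServiceAccountToken'
}
```

Running list_default_service_accounts | unpatched_namespaces should print nothing on a cluster where every default service account has been patched.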

Ensure that all Namespaces have Network Policies defined

Running different applications on the same Kubernetes cluster creates a risk of one compromised application attacking a neighboring application. Network segmentation is important to ensure that containers can communicate only with those they are supposed to. A network policy is a specification of how selections of pods are allowed to communicate with each other and other network endpoints.

Network Policies are namespace scoped. When a network policy is introduced to a given namespace, all traffic not allowed by the policy is denied. However, if there are no network policies in a namespace, all traffic is allowed into and out of the pods in that namespace. To enforce network policies, a CNI (container network interface) plugin must be enabled. This guide uses canal to provide the policy enforcement. Additional information about CNI providers can be found in the Rancher documentation.

Once a CNI provider is enabled on a cluster, a default network policy can be applied. For reference purposes, a permissive example is provided below. If you want to allow all traffic to all pods in a namespace (even if policies are added that cause some pods to be treated as “isolated”), you can create a policy that explicitly allows all traffic in that namespace. Save the following YAML as default-allow-all.yaml. Additional documentation about network policies can be found on the Kubernetes site.

This NetworkPolicy is not recommended for production use

    ---
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-allow-all
    spec:
      podSelector: {}
      ingress:
      - {}
      egress:
      - {}
      policyTypes:
      - Ingress
      - Egress

Create a bash script file called apply_networkPolicy_to_all_ns.sh. Be sure to chmod +x apply_networkPolicy_to_all_ns.sh so the script has execute permissions.

    #!/bin/bash -e
    # Apply the NetworkPolicy to every namespace.
    for namespace in $(kubectl get namespaces -A -o json | jq -r '.items[].metadata.name'); do
      kubectl apply -f default-allow-all.yaml -n "${namespace}"
    done

Execute this script to apply the permissive NetworkPolicy in default-allow-all.yaml to all namespaces.
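To confirm that every namespace ended up with at least one NetworkPolicy, the list of all namespaces can be compared with the list of namespaces that own a policy. The following is a rough sketch: the function names and the two input files are hypothetical, and the kubectl invocations are one assumed way of producing those lists.

```shell
#!/bin/bash
# Hypothetical helper: print each namespace listed in file $1 that does not
# appear in file $2 (one namespace name per line in both files).
namespaces_without_policy() {
  awk -v covered="$2" '
    BEGIN {
      # Load the namespaces that already have a policy.
      while ((getline line < covered) > 0) {
        split(line, field, " ")
        has[field[1]] = 1
      }
    }
    NF > 0 && !($1 in has) { print $1 }
  ' "$1"
}

# Assumed kubectl invocations that write the two input files.
collect_namespace_lists() {
  kubectl get namespaces --no-headers \
    -o custom-columns='NS:.metadata.name' > "$1"
  kubectl get networkpolicies --all-namespaces --no-headers \
    -o custom-columns='NS:.metadata.namespace' > "$2"
}
```

For example, collect_namespace_lists all.txt covered.txt followed by namespaces_without_policy all.txt covered.txt prints the namespaces that still have no policy. An empty result is what CIS-style checks expect.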

Reference Hardened RKE cluster.yml configuration

The reference cluster.yml is used by the RKE CLI and provides the configuration needed to achieve a hardened install of Rancher Kubernetes Engine (RKE). Install documentation is provided with additional details about the configuration items. This reference cluster.yml does not populate the required nodes directive, which will vary depending on your environment. Documentation for node configuration can be found here: https://rancher.com/docs/rke/latest/en/config-options/nodes

    # If you intend to deploy Kubernetes in an air-gapped environment,
    # please consult the documentation on how to configure custom RKE images.
    # https://rancher.com/docs/rke/latest/en/installation/
    # the nodes directive is required and will vary depending on your environment
    # documentation for node configuration can be found here:
    # https://rancher.com/docs/rke/latest/en/config-options/nodes
    nodes: []
    services:
      etcd:
        image: ""
        extra_args: {}
        extra_binds: []
        extra_env: []
        win_extra_args: {}
        win_extra_binds: []
        win_extra_env: []
        external_urls: []
        ca_cert: ""
        cert: ""
        key: ""
        path: ""
        uid: 52034
        gid: 52034
        snapshot: false
        retention: ""
        creation: ""
        backup_config: null
      kube-api:
        image: ""
        extra_args: {}
        extra_binds: []
        extra_env: []
        win_extra_args: {}
        win_extra_binds: []
        win_extra_env: []
        service_cluster_ip_range: ""
        service_node_port_range: ""
        pod_security_policy: true
        always_pull_images: false
        secrets_encryption_config:
          enabled: true
          custom_config: null
        audit_log:
          enabled: true
          configuration: null
        admission_configuration: null
        event_rate_limit:
          enabled: true
          configuration: null
      kube-controller:
        image: ""
        extra_args:
          feature-gates: RotateKubeletServerCertificate=true
        extra_binds: []
        extra_env: []
        win_extra_args: {}
        win_extra_binds: []
        win_extra_env: []
        cluster_cidr: ""
        service_cluster_ip_range: ""
      scheduler:
        image: ""
        extra_args: {}
        extra_binds: []
        extra_env: []
        win_extra_args: {}
        win_extra_binds: []
        win_extra_env: []
      kubelet:
        image: ""
        extra_args:
          feature-gates: RotateKubeletServerCertificate=true
          protect-kernel-defaults: "true"
          tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
        extra_binds: []
        extra_env: []
        win_extra_args: {}
        win_extra_binds: []
        win_extra_env: []
        cluster_domain: cluster.local
        infra_container_image: ""
        cluster_dns_server: ""
        fail_swap_on: false
        generate_serving_certificate: true
      kubeproxy:
        image: ""
        extra_args: {}
        extra_binds: []
        extra_env: []
        win_extra_args: {}
        win_extra_binds: []
        win_extra_env: []
    network:
      plugin: ""
      options: {}
      mtu: 0
      node_selector: {}
      update_strategy: null
    authentication:
      strategy: ""
      sans: []
      webhook: null
    addons: |
      apiVersion: policy/v1beta1
      kind: PodSecurityPolicy
      metadata:
        name: restricted
      spec:
        requiredDropCapabilities:
        - NET_RAW
        privileged: false
        allowPrivilegeEscalation: false
        defaultAllowPrivilegeEscalation: false
        fsGroup:
          rule: RunAsAny
        runAsUser:
          rule: MustRunAsNonRoot
        seLinux:
          rule: RunAsAny
        supplementalGroups:
          rule: RunAsAny
        volumes:
        - emptyDir
        - secret
        - persistentVolumeClaim
        - downwardAPI
        - configMap
        - projected
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: psp:restricted
      rules:
      - apiGroups:
        - extensions
        resourceNames:
        - restricted
        resources:
        - podsecuritypolicies
        verbs:
        - use
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: psp:restricted
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: psp:restricted
      subjects:
      - apiGroup: rbac.authorization.k8s.io
        kind: Group
        name: system:serviceaccounts
      - apiGroup: rbac.authorization.k8s.io
        kind: Group
        name: system:authenticated
      ---
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      metadata:
        name: default-allow-all
      spec:
        podSelector: {}
        ingress:
        - {}
        egress:
        - {}
        policyTypes:
        - Ingress
        - Egress
      ---
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: default
      automountServiceAccountToken: false
    addons_include: []
    system_images:
      etcd: ""
      alpine: ""
      nginx_proxy: ""
      cert_downloader: ""
      kubernetes_services_sidecar: ""
      kubedns: ""
      dnsmasq: ""
      kubedns_sidecar: ""
      kubedns_autoscaler: ""
      coredns: ""
      coredns_autoscaler: ""
      nodelocal: ""
      kubernetes: ""
      flannel: ""
      flannel_cni: ""
      calico_node: ""
      calico_cni: ""
      calico_controllers: ""
      calico_ctl: ""
      calico_flexvol: ""
      canal_node: ""
      canal_cni: ""
      canal_controllers: ""
      canal_flannel: ""
      canal_flexvol: ""
      weave_node: ""
      weave_cni: ""
      pod_infra_container: ""
      ingress: ""
      ingress_backend: ""
      metrics_server: ""
      windows_pod_infra_container: ""
    ssh_key_path: ""
    ssh_cert_path: ""
    ssh_agent_auth: false
    authorization:
      mode: ""
      options: {}
    ignore_docker_version: false
    kubernetes_version: v1.18.12-rancher1-1
    private_registries: []
    ingress:
      provider: ""
      options: {}
      node_selector: {}
      extra_args: {}
      dns_policy: ""
      extra_envs: []
      extra_volumes: []
      extra_volume_mounts: []
      update_strategy: null
      http_port: 0
      https_port: 0
      network_mode: ""
    cluster_name:
    cloud_provider:
      name: ""
    prefix_path: ""
    win_prefix_path: ""
    addon_job_timeout: 0
    bastion_host:
      address: ""
      port: ""
      user: ""
      ssh_key: ""
      ssh_key_path: ""
      ssh_cert: ""
      ssh_cert_path: ""
    monitoring:
      provider: ""
      options: {}
      node_selector: {}
      update_strategy: null
      replicas: null
    restore:
      restore: false
      snapshot_name: ""
    dns: null
    upgrade_strategy:
      max_unavailable_worker: ""
      max_unavailable_controlplane: ""
      drain: null
      node_drain_input: null
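Before handing a customized cluster.yml to the RKE CLI, a coarse text check can catch a hardening setting that was dropped by accident. This is a sketch only: missing_hardening_settings is a hypothetical helper, the list of settings is a small selection taken from the reference file above, and the match is line-based rather than a YAML parse, so it ignores nesting and would also match a commented-out line.

```shell
#!/bin/bash
# Hypothetical helper: report which of a few hardening-related settings
# from the reference configuration do not appear in the file given as $1.
# Plain line matching only; this does not parse YAML.
missing_hardening_settings() {
  local file="$1" status=0 pattern
  for pattern in \
    'pod_security_policy: true' \
    'generate_serving_certificate: true' \
    'feature-gates: RotateKubeletServerCertificate=true' \
    "protect-kernel-defaults: [\"']true[\"']"
  do
    if ! grep -Eq -- "$pattern" "$file"; then
      echo "missing: ${pattern}"
      status=1
    fi
  done
  return "$status"
}
```

For example, missing_hardening_settings cluster.yml prints nothing and returns success when all four settings are present. Nested settings such as secrets_encryption_config still need to be reviewed by hand.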

Reference Hardened RKE Template configuration

The reference RKE Template provides the configuration needed to achieve a hardened install of Kubernetes. RKE Templates are used to provision Kubernetes and define Rancher settings. Follow the Rancher documentation for additional installation and RKE Template details.

    #
    # Cluster Config
    #
    default_pod_security_policy_template_id: restricted
    docker_root_dir: /var/lib/docker
    enable_cluster_alerting: false
    enable_cluster_monitoring: false
    enable_network_policy: true
    #
    # Rancher Config
    #
    rancher_kubernetes_engine_config:
      addon_job_timeout: 45
      ignore_docker_version: true
      kubernetes_version: v1.18.12-rancher1-1
    #
    #   If you are using calico on AWS
    #
    #   network:
    #     plugin: calico
    #     calico_network_provider:
    #       cloud_provider: aws
    #
    # # To specify flannel interface
    #
    #   network:
    #     plugin: flannel
    #     flannel_network_provider:
    #       iface: eth1
    #
    # # To specify flannel interface for canal plugin
    #
    #   network:
    #     plugin: canal
    #     canal_network_provider:
    #       iface: eth1
    #
      network:
        mtu: 0
        plugin: canal
      rotate_encryption_key: false
    #
    #   services:
    #     kube-api:
    #       service_cluster_ip_range: 10.43.0.0/16
    #     kube-controller:
    #       cluster_cidr: 10.42.0.0/16
    #       service_cluster_ip_range: 10.43.0.0/16
    #     kubelet:
    #       cluster_domain: cluster.local
    #       cluster_dns_server: 10.43.0.10
    #
      services:
        etcd:
          backup_config:
            enabled: false
            interval_hours: 12
            retention: 6
            safe_timestamp: false
          creation: 12h
          extra_args:
            election-timeout: '5000'
            heartbeat-interval: '500'
          gid: 52034
          retention: 72h
          snapshot: false
          uid: 52034
        kube_api:
          always_pull_images: false
          audit_log:
            enabled: true
          event_rate_limit:
            enabled: true
          pod_security_policy: true
          secrets_encryption_config:
            enabled: true
          service_node_port_range: 30000-32767
        kube_controller:
          extra_args:
            feature-gates: RotateKubeletServerCertificate=true
        kubelet:
          extra_args:
            feature-gates: RotateKubeletServerCertificate=true
            protect-kernel-defaults: 'true'
            tls-cipher-suites: >-
              TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
          fail_swap_on: false
          generate_serving_certificate: true
      ssh_agent_auth: false
      upgrade_strategy:
        max_unavailable_controlplane: '1'
        max_unavailable_worker: 10%
    windows_prefered_cluster: false

Hardened Reference Ubuntu 20.04 LTS cloud-config:

The reference cloud-config is generally used in cloud infrastructure environments to allow for configuration management of compute instances. The reference config applies the Ubuntu operating-system-level settings needed before installing Kubernetes.

    #cloud-config
    apt:
      sources:
        docker.list:
          source: deb [arch=amd64] http://download.docker.com/linux/ubuntu $RELEASE stable
          keyid: 9DC858229FC7DD38854AE2D88D81803C0EBFCD88
    system_info:
      default_user:
        groups:
        - docker
    write_files:
    - path: "/etc/apt/preferences.d/docker"
      owner: root:root
      permissions: '0600'
      content: |
        Package: docker-ce
        Pin: version 5:19*
        Pin-Priority: 800
    - path: "/etc/sysctl.d/90-kubelet.conf"
      owner: root:root
      permissions: '0644'
      content: |
        vm.overcommit_memory=1
        vm.panic_on_oom=0
        kernel.panic=10
        kernel.panic_on_oops=1
        kernel.keys.root_maxbytes=25000000
    package_update: true
    packages:
    - docker-ce
    - docker-ce-cli
    - containerd.io
    runcmd:
    - sysctl -p /etc/sysctl.d/90-kubelet.conf
    - groupadd --gid 52034 etcd
    - useradd --comment "etcd service account" --uid 52034 --gid 52034 etcd