Understanding and managing pod security admission

Pod security admission is an implementation of the Kubernetes pod security standards. Use pod security admission to restrict the behavior of pods.

About pod security admission

OKD includes Kubernetes pod security admission. Pods that do not comply with the pod security admission defined globally or at the namespace level are not admitted to the cluster and cannot run.

Globally, the privileged profile is enforced, and the restricted profile is used for warnings and audits.

You can also configure the pod security admission settings at the namespace level.

Do not run workloads in or share access to default projects. Default projects are reserved for running core cluster components.

The following default projects are considered highly privileged: default, kube-public, kube-system, openshift, openshift-infra, openshift-node, and other system-created projects that have the openshift.io/run-level label set to 0 or 1. Functionality that relies on admission plugins, such as pod security admission, security context constraints, cluster resource quotas, and image reference resolution, does not work in highly privileged projects.

Pod security admission modes

You can configure the following pod security admission modes for a namespace:

Table 1. Pod security admission modes
ModeLabelDescription

enforce

pod-security.kubernetes.io/enforce

Rejects a pod from admission if it does not comply with the set profile

audit

pod-security.kubernetes.io/audit

Logs audit events if a pod does not comply with the set profile

warn

pod-security.kubernetes.io/warn

Displays warnings if a pod does not comply with the set profile

Pod security admission profiles

You can set each of the pod security admission modes to one of the following profiles:

Table 2. Pod security admission profiles
ProfileDescription

privileged

Least restrictive policy; allows for known privilege escalation

baseline

Minimally restrictive policy; prevents known privilege escalations

restricted

Most restrictive policy; follows current pod hardening best practices

Privileged namespaces

The following system namespaces are always set to the privileged pod security admission profile:

  • default

  • kube-public

  • kube-system

You cannot change the pod security profile for these privileged namespaces.

About pod security admission synchronization

In addition to the global pod security admission control configuration, a controller applies pod security admission control warn and audit labels to namespaces according to the SCC permissions of the service accounts that are in a given namespace.

The controller examines ServiceAccount object permissions to use security context constraints in each namespace. Security context constraints (SCCs) are mapped to pod security profiles based on their field values; the controller uses these translated profiles. Pod security admission warn and audit labels are set to the most privileged pod security profile in the namespace to prevent displaying warnings and logging audit events when pods are created.

Namespace labeling is based on consideration of namespace-local service account privileges.

Applying pods directly might use the SCC privileges of the user who runs the pod. However, user privileges are not considered during automatic labeling.

Pod security admission synchronization namespace exclusions

Pod security admission synchronization is permanently disabled on most system-created namespaces. Synchronization is also initially disabled on user-created openshift-* prefixed namespaces, but you can enable synchronization on them later.

If a pod security admission label (pod-security.kubernetes.io/<mode>) is manually modified from the automatically labeled value on a label-synchronized namespace, synchronization is disabled for that label.

If necessary, you can enable synchronization again by using one of the following methods:

  • By removing the modified pod security admission label from the namespace

  • By setting the security.openshift.io/scc.podSecurityLabelSync label to true

    If you force synchronization by adding this label, then any modified pod security admission labels will be overwritten.

Permanently disabled namespaces

Namespaces that are defined as part of the cluster payload have pod security admission synchronization disabled permanently. The following namespaces are permanently disabled:

  • default

  • kube-node-lease

  • kube-system

  • kube-public

  • openshift

  • All system-created namespaces that are prefixed with openshift- , except for openshift-operators

Initially disabled namespaces

By default, all namespaces that have an openshift- prefix have pod security admission synchronization disabled initially. You can enable synchronization for user-created openshift-* namespaces and for the openshift-operators namespace.

You cannot enable synchronization for any system-created openshift-* namespaces, except for openshift-operators.

If an Operator is installed in a user-created openshift-* namespace, synchronization is enabled automatically after a cluster service version (CSV) is created in the namespace. The synchronized label is derived from the permissions of the service accounts in the namespace.

Controlling pod security admission synchronization

You can enable or disable automatic pod security admission synchronization for most namespaces.

You cannot enable pod security admission synchronization on some system-created namespaces. For more information, see Pod security admission synchronization namespace exclusions.

Procedure

  • For each namespace that you want to configure, set a value for the security.openshift.io/scc.podSecurityLabelSync label:

    • To disable pod security admission label synchronization in a namespace, set the value of the security.openshift.io/scc.podSecurityLabelSync label to false.

      Run the following command:

      1. $ oc label namespace <namespace> security.openshift.io/scc.podSecurityLabelSync=false
    • To enable pod security admission label synchronization in a namespace, set the value of the security.openshift.io/scc.podSecurityLabelSync label to true.

      Run the following command:

      1. $ oc label namespace <namespace> security.openshift.io/scc.podSecurityLabelSync=true

Additional resources

Configuring pod security admission for a namespace

You can configure the pod security admission settings at the namespace level. For each of the pod security admission modes on the namespace, you can set which pod security admission profile to use.

Procedure

  • For each pod security admission mode that you want to set on a namespace, run the following command:

    1. $ oc label namespace <namespace> \ (1)
    2. pod-security.kubernetes.io/<mode>=<profile> \ (2)
    3. --overwrite
    1Set <namespace> to the namespace to configure.
    2Set <mode> to enforce, warn, or audit. Set <profile> to restricted, baseline, or privileged.

About pod security admission alerts

A PodSecurityViolation alert is triggered when the Kubernetes API server reports that there is a pod denial on the audit level of the pod security admission controller. This alert persists for one day.

View the Kubernetes API server audit logs to investigate alerts that were triggered. As an example, a workload is likely to fail admission if global enforcement is set to the restricted pod security level.

For assistance in identifying pod security admission violation audit events, see Audit annotations in the Kubernetes documentation.

Identifying pod security violations

The PodSecurityViolation alert does not provide details on which workloads are causing pod security violations. You can identify the affected workloads by reviewing the Kubernetes API server audit logs. This procedure uses the must-gather tool to gather the audit logs and then searches for the pod-security.kubernetes.io/audit-violations annotation.

Prerequisites

  • You have installed jq.

  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. To gather the audit logs, enter the following command:

    1. $ oc adm must-gather -- /usr/bin/gather_audit_logs
  2. To output the affected workload details, enter the following command:

    1. $ zgrep -h pod-security.kubernetes.io/audit-violations must-gather.local.<archive_id>/quay*/audit_logs/kube-apiserver/*log.gz \
    2. | jq -r 'select((.annotations["pod-security.kubernetes.io/audit-violations"] != null) and (.objectRef.resource=="pods")) | .objectRef.namespace + " " + .objectRef.name + " " + .objectRef.resource' \
    3. | sort | uniq -c

    Replace must-gather.local.<archive_id> with the actual directory name.

    Example output

    1. 15 ci namespace-ttl-controller deployments
    2. 1 ci-op-k5whzrsh rpm-repo-546f98d8b replicasets
    3. 1 ci-op-k5whzrsh rpm-repo deployments

Additional resources