Configuring system controls and interface attributes using the tuning plugin

In Linux, sysctl allows an administrator to modify kernel parameters at runtime. You can modify interface-level network sysctls using the tuning Container Network Interface (CNI) meta plugin. The tuning CNI meta plugin operates in a chain with a main CNI plugin as illustrated.

[Figure: CNI plugin]

The main CNI plugin assigns the interface and passes this interface to the tuning CNI meta plugin at runtime. You can change some sysctls and several interface attributes such as promiscuous mode, all-multicast mode, MTU, and MAC address in the network namespace by using the tuning CNI meta plugin.
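Two of the attributes mentioned here, promiscuous mode and all-multicast mode, are interface flags, so one way to inspect them is to test the flag bits that the kernel exposes in /sys/class/net/<interface>/flags. The following is a minimal sketch, not part of the tuning plugin: flag_is_set is a hypothetical helper, the bit values are the IFF_PROMISC and IFF_ALLMULTI constants from the Linux header linux/if.h, and the sample flags value is constructed for illustration.

```shell
#!/bin/sh
# Interface flag bits as defined in the Linux header linux/if.h.
IFF_PROMISC=0x100
IFF_ALLMULTI=0x200

# Hypothetical helper: succeed if the bit mask in $2 is set in the flags value $1.
flag_is_set() {
    [ $(( $1 & $2 )) -ne 0 ]
}

# Constructed sample: IFF_UP (0x1) + IFF_BROADCAST (0x2) + IFF_ALLMULTI (0x200)
# + IFF_MULTICAST (0x1000). On a live system you could instead read the value
# with: flags="$(cat /sys/class/net/net1/flags)"   (net1 is an assumed name)
flags=0x1203

if flag_is_set "$flags" "$IFF_ALLMULTI"; then
    echo "all-multicast mode: on"
else
    echo "all-multicast mode: off"
fi

if flag_is_set "$flags" "$IFF_PROMISC"; then
    echo "promiscuous mode: on"
else
    echo "promiscuous mode: off"
fi
```

With the constructed sample value, the sketch reports all-multicast mode as on and promiscuous mode as off.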

Configuring system controls by using the tuning CNI

The following procedure configures the tuning CNI to change the interface-level network sysctl net.ipv4.conf.IFNAME.accept_redirects. Setting this sysctl to 1 allows the interface to accept ICMP redirect messages. In the tuning CNI meta plugin configuration, the interface name is represented by the IFNAME token, which is replaced with the actual name of the interface at runtime.
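As a rough illustration of the token substitution, the following shell sketch expands IFNAME in a sysctl key and maps the result to its file path under /proc/sys. The functions resolve_sysctl_key and sysctl_key_to_path are hypothetical helpers introduced here, the interface name net1 is assumed, and none of this is the plugin's own implementation.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; the tuning plugin performs the
# substitution internally and does not use this code.

# Replace the IFNAME token in a sysctl key with a concrete interface name.
# Assumes the interface name contains no characters special to sed ("/", "&").
resolve_sysctl_key() {
    printf '%s\n' "$1" | sed "s/IFNAME/$2/g"
}

# Convert a dotted sysctl key into the matching file path under /proc/sys.
# Assumes the interface name itself contains no dots.
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

key="$(resolve_sysctl_key "net.ipv4.conf.IFNAME.accept_redirects" "net1")"
echo "$key"
sysctl_key_to_path "$key"
```

For an interface named net1, the sketch prints net.ipv4.conf.net1.accept_redirects followed by /proc/sys/net/ipv4/conf/net1/accept_redirects.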

Procedure

  1. Create a network attachment definition, such as tuning-example.yaml, with the following content:

    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: <name> (1)
      namespace: default (2)
    spec:
      config: '{
        "cniVersion": "0.4.0", (3)
        "name": "<name>", (4)
        "plugins": [{
            "type": "<main_CNI_plugin>" (5)
          },
          {
            "type": "tuning", (6)
            "sysctl": {
              "net.ipv4.conf.IFNAME.accept_redirects": "1" (7)
            }
          }
        ]
      }'
    (1) Specifies the name for the additional network attachment to create. The name must be unique within the specified namespace.
    (2) Specifies the namespace that the object is associated with.
    (3) Specifies the CNI specification version.
    (4) Specifies the name for the configuration. Match the configuration name to the name value of the network attachment definition.
    (5) Specifies the name of the main CNI plugin to configure.
    (6) Specifies the name of the CNI meta plugin.
    (7) Specifies the sysctl to set. The interface name is represented by the IFNAME token and is replaced with the actual name of the interface at runtime.

    An example YAML file is shown here:

    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: tuningnad
      namespace: default
    spec:
      config: '{
        "cniVersion": "0.4.0",
        "name": "tuningnad",
        "plugins": [{
            "type": "bridge"
          },
          {
            "type": "tuning",
            "sysctl": {
              "net.ipv4.conf.IFNAME.accept_redirects": "1"
            }
          }
        ]
      }'
  2. Apply the YAML by running the following command:

    $ oc apply -f tuning-example.yaml

    Example output

    networkattachmentdefinition.k8s.cni.cncf.io/tuningnad created
  3. Create a pod definition, such as examplepod.yaml, that references the network attachment definition, similar to the following:

    apiVersion: v1
    kind: Pod
    metadata:
      name: tunepod
      namespace: default
      annotations:
        k8s.v1.cni.cncf.io/networks: tuningnad (1)
    spec:
      containers:
      - name: podexample
        image: centos
        command: ["/bin/bash", "-c", "sleep INF"]
        securityContext:
          runAsUser: 2000 (2)
          runAsGroup: 3000 (3)
          allowPrivilegeEscalation: false (4)
          capabilities: (5)
            drop: ["ALL"]
      securityContext:
        runAsNonRoot: true (6)
        seccompProfile: (7)
          type: RuntimeDefault
    (1) Specifies the name of the configured NetworkAttachmentDefinition.
    (2) runAsUser controls which user ID the container is run with.
    (3) runAsGroup controls which primary group ID the container is run with.
    (4) allowPrivilegeEscalation determines whether a pod can request privilege escalation. If unspecified, it defaults to true. This boolean directly controls whether the no_new_privs flag is set on the container process.
    (5) capabilities permit privileged actions without giving full root access. This policy ensures that all capabilities are dropped from the pod.
    (6) runAsNonRoot: true requires that the container runs as a user with any UID other than 0.
    (7) RuntimeDefault enables the default seccomp profile for a pod or container workload.
  4. Apply the YAML by running the following command:

    $ oc apply -f examplepod.yaml
  5. Verify that the pod is created by running the following command:

    $ oc get pod

    Example output

    NAME      READY   STATUS    RESTARTS   AGE
    tunepod   1/1     Running   0          47s
  6. Log in to the pod by running the following command:

    $ oc rsh tunepod
  7. Verify the values of the configured sysctl flags. For example, find the value of net.ipv4.conf.net1.accept_redirects by running the following command:

    sh-4.4# sysctl net.ipv4.conf.net1.accept_redirects

    Expected output

    net.ipv4.conf.net1.accept_redirects = 1
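
To script the check in the last step, the value can be extracted from the sysctl output line and compared with the expected setting. The following is a minimal sketch under stated assumptions: sysctl_value is a hypothetical helper, and the sample line is copied from the expected output rather than read from a live pod.

```shell
#!/bin/sh
# Hypothetical helper: print the value part of a "key = value" line
# in the format that sysctl prints.
sysctl_value() {
    printf '%s\n' "$1" | sed 's/^[^=]*=[[:space:]]*//'
}

# Sample line taken from the expected output. Inside the pod you could
# capture it with: line="$(sysctl net.ipv4.conf.net1.accept_redirects)"
line="net.ipv4.conf.net1.accept_redirects = 1"

if [ "$(sysctl_value "$line")" = "1" ]; then
    echo "accept_redirects is enabled on net1"
else
    echo "accept_redirects is not enabled on net1"
fi
```

The helper only parses text, so the same comparison works for any other sysctl line in that format.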

Enabling all-multicast mode by using the tuning CNI

You can enable all-multicast mode by using the tuning Container Network Interface (CNI) meta plugin.

The following procedure describes how to configure the tuning CNI to enable the all-multicast mode.

Procedure

  1. Create a network attachment definition, such as tuning-example.yaml, with the following content:

    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: <name> (1)
      namespace: default (2)
    spec:
      config: '{
        "cniVersion": "0.4.0", (3)
        "name": "<name>", (4)
        "plugins": [{
            "type": "<main_CNI_plugin>" (5)
          },
          {
            "type": "tuning", (6)
            "allmulti": true (7)
          }
        ]
      }'
    (1) Specifies the name for the additional network attachment to create. The name must be unique within the specified namespace.
    (2) Specifies the namespace that the object is associated with.
    (3) Specifies the CNI specification version.
    (4) Specifies the name for the configuration. Match the configuration name to the name value of the network attachment definition.
    (5) Specifies the name of the main CNI plugin to configure.
    (6) Specifies the name of the CNI meta plugin.
    (7) Changes the all-multicast mode of the interface. If enabled, the interface receives all multicast packets on the network.

    An example YAML file is shown here:

    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: setallmulti
      namespace: default
    spec:
      config: '{
        "cniVersion": "0.4.0",
        "name": "setallmulti",
        "plugins": [
          {
            "type": "bridge"
          },
          {
            "type": "tuning",
            "allmulti": true
          }
        ]
      }'
  2. Apply the settings specified in the YAML file by running the following command:

    $ oc apply -f tuning-example.yaml

    Example output

    networkattachmentdefinition.k8s.cni.cncf.io/setallmulti created
  3. Create a pod with a network attachment definition similar to that specified in the following examplepod.yaml sample file:

    apiVersion: v1
    kind: Pod
    metadata:
      name: allmultipod
      namespace: default
      annotations:
        k8s.v1.cni.cncf.io/networks: setallmulti (1)
    spec:
      containers:
      - name: podexample
        image: centos
        command: ["/bin/bash", "-c", "sleep INF"]
        securityContext:
          runAsUser: 2000 (2)
          runAsGroup: 3000 (3)
          allowPrivilegeEscalation: false (4)
          capabilities: (5)
            drop: ["ALL"]
      securityContext:
        runAsNonRoot: true (6)
        seccompProfile: (7)
          type: RuntimeDefault
    (1) Specifies the name of the configured NetworkAttachmentDefinition.
    (2) Specifies the user ID that the container is run with.
    (3) Specifies the primary group ID that the container is run with.
    (4) Specifies whether a pod can request privilege escalation. If unspecified, it defaults to true. This boolean directly controls whether the no_new_privs flag is set on the container process.
    (5) Specifies the container capabilities. The drop: ["ALL"] statement indicates that all Linux capabilities are dropped from the pod, providing a more restrictive security profile.
    (6) Specifies that the container runs as a user with any UID other than 0.
    (7) Specifies the container's seccomp profile. In this case, the type is set to RuntimeDefault. Seccomp is a Linux kernel feature that restricts the system calls available to a process, which reduces the attack surface.
  4. Apply the settings specified in the YAML file by running the following command:

    $ oc apply -f examplepod.yaml
  5. Verify that the pod is created by running the following command:

    $ oc get pod

    Example output

    NAME          READY   STATUS    RESTARTS   AGE
    allmultipod   1/1     Running   0          23s
  6. Log in to the pod by running the following command:

    $ oc rsh allmultipod
  7. List all the interfaces associated with the pod by running the following command:

    sh-4.4# ip link

    Example output

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8901 qdisc noqueue state UP mode DEFAULT group default
        link/ether 0a:58:0a:83:00:10 brd ff:ff:ff:ff:ff:ff link-netnsid 0 (1)
    3: net1@if24: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
        link/ether ee:9b:66:a4:ec:1d brd ff:ff:ff:ff:ff:ff link-netnsid 0 (2)
    (1) eth0@if22 is the primary interface.
    (2) net1@if24 is the secondary interface, configured with the network attachment definition that enables all-multicast mode (ALLMULTI flag).
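
The ALLMULTI flag can also be checked programmatically rather than by reading the ip link output. The following is a minimal sketch: has_link_flag is a hypothetical helper that parses the interface header lines, and the sample text is copied from the example output instead of a live pod.

```shell
#!/bin/sh
# Hypothetical helper: read "ip link" output on standard input and succeed
# only if the named interface lists the given flag.
has_link_flag() {
    awk -v ifname="$1" -v flag="$2" '
        /^[0-9]+: / {
            name = $2
            sub(/:$/, "", name)      # drop the trailing colon
            sub(/@.*$/, "", name)    # drop the peer suffix such as @if24
            if (name == ifname) {
                list = $3
                gsub(/[<>]/, "", list)
                n = split(list, flags, ",")
                for (i = 1; i <= n; i++)
                    if (flags[i] == flag) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Sample header lines copied from the example output. Inside the pod you
# could run: ip link | has_link_flag net1 ALLMULTI
sample='2: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8901 qdisc noqueue state UP mode DEFAULT group default
3: net1@if24: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default'

if printf '%s\n' "$sample" | has_link_flag net1 ALLMULTI; then
    echo "net1: all-multicast mode is on"
fi
if ! printf '%s\n' "$sample" | has_link_flag eth0 ALLMULTI; then
    echo "eth0: all-multicast mode is off"
fi
```

Comparing the interface name exactly, instead of searching the whole output for ALLMULTI, avoids a false positive when a different interface has the flag set.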

Additional resources