Device Plugins

FEATURE STATE: Kubernetes v1.10 beta

This feature is currently in a beta state, meaning:

  • The version names contain beta (e.g. v2beta3).
  • Code is well tested. Enabling the feature is considered safe. Enabled by default.
  • Support for the overall feature will not be dropped, though details may change.
  • The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, we will provide instructions for migrating to the next version. This may require deleting, editing, and re-creating API objects. The editing process may require some thought. This may require downtime for applications that rely on the feature.
  • Recommended for only non-business-critical uses because of potential for incompatible changes in subsequent releases. If you have multiple clusters that can be upgraded independently, you may be able to relax this restriction.
  • Please do try our beta features and give feedback on them! After they exit beta, it may not be practical for us to make more changes.

Kubernetes provides a device plugin framework that you can use to advertise system hardware resources to the kubelet.

Instead of customizing the code for Kubernetes itself, vendors can implement a device plugin that you deploy either manually or as a DaemonSet. The targeted devices include GPUs, high-performance NICs, FPGAs, InfiniBand adapters, and other similar computing resources that may require vendor-specific initialization and setup.

Device plugin registration

The kubelet exports a Registration gRPC service:

    service Registration {
        rpc Register(RegisterRequest) returns (Empty) {}
    }

A device plugin can register itself with the kubelet through this gRPC service. During the registration, the device plugin needs to send:

  • The name of its Unix socket.
  • The Device Plugin API version against which it was built.
  • The ResourceName it wants to advertise. Here ResourceName needs to follow the extended resource naming scheme as vendor-domain/resourcetype. (For example, an NVIDIA GPU is advertised as nvidia.com/gpu.)
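The three registration fields above can be sketched in Go as follows. This is a minimal illustration, not the generated gRPC type: the `registration` struct, its `validate` method, the socket name `foo.sock`, and the version string `v1beta1` are all assumptions made for the example, and the check is far looser than whatever validation the kubelet itself applies.

```go
package main

import (
	"fmt"
	"strings"
)

// registration is a hypothetical stand-in for the information a device
// plugin sends when it registers; it is not the real RegisterRequest type.
type registration struct {
	Endpoint     string // name of the plugin's Unix socket
	Version      string // device plugin API version the plugin was built against
	ResourceName string // must look like vendor-domain/resourcetype
}

// validate performs a simplified shape check: all fields set, and the
// resource name split into exactly two non-empty parts by a single slash.
func (r registration) validate() error {
	if r.Endpoint == "" || r.Version == "" {
		return fmt.Errorf("endpoint and version must be set")
	}
	domain, resource, found := strings.Cut(r.ResourceName, "/")
	if !found || domain == "" || resource == "" {
		return fmt.Errorf("resource name %q is not of the form vendor-domain/resourcetype", r.ResourceName)
	}
	if strings.Contains(resource, "/") {
		return fmt.Errorf("resource name %q contains more than one slash", r.ResourceName)
	}
	return nil
}

func main() {
	// Illustrative values only.
	good := registration{Endpoint: "foo.sock", Version: "v1beta1", ResourceName: "hardware-vendor.example/foo"}
	fmt.Println("good:", good.validate())

	bad := registration{Endpoint: "foo.sock", Version: "v1beta1", ResourceName: "foo"}
	fmt.Println("bad:", bad.validate())
}
```

Checking the name locally before calling Register is only a convenience; the kubelet remains the authority on whether a registration is accepted.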

Following a successful registration, the device plugin sends the kubelet the list of devices it manages, and the kubelet is then in charge of advertising those resources to the API server as part of the kubelet node status update. For example, after a device plugin registers hardware-vendor.example/foo with the kubelet and reports two healthy devices on a node, the node status is updated to advertise that the node has 2 “Foo” devices installed and available.

Then, users can request devices in a Container specification as they request other types of resources, with the following limitations:

  • Extended resources are only supported as integer resources and cannot be overcommitted.
  • Devices cannot be shared among Containers.
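The two limitations above amount to whole-device, exclusive bookkeeping. The Go sketch below illustrates that idea with a hypothetical `pool` type; it is not how the kubelet is implemented, and the device IDs and container names are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// pool is a hypothetical tracker of whole devices on one node.
type pool struct {
	free []string          // device IDs not assigned to any container
	used map[string]string // device ID -> container that owns it
}

func newPool(ids ...string) *pool {
	return &pool{free: append([]string(nil), ids...), used: map[string]string{}}
}

// allocate gives n whole devices to a single container. If fewer than n
// devices are free it fails and leaves the pool unchanged, so devices are
// never overcommitted and never shared between containers.
func (p *pool) allocate(container string, n int) ([]string, error) {
	if n < 0 || n > len(p.free) {
		return nil, errors.New("request is invalid or cannot be satisfied")
	}
	got := append([]string(nil), p.free[:n]...)
	p.free = p.free[n:]
	for _, id := range got {
		p.used[id] = container
	}
	return got, nil
}

func main() {
	p := newPool("dev-0", "dev-1", "dev-2")

	got, err := p.allocate("demo-container-1", 2)
	fmt.Println(got, err)

	// Only one device is left, so a second request for two must fail.
	got, err = p.allocate("demo-container-2", 2)
	fmt.Println(got, err)
}
```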

Suppose a Kubernetes cluster is running a device plugin that advertises resource hardware-vendor.example/foo on certain nodes. Here is an example of a pod requesting this resource to run a demo workload:

    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: demo-pod
    spec:
      containers:
        - name: demo-container-1
          image: k8s.gcr.io/pause:2.0
          resources:
            limits:
              hardware-vendor.example/foo: 2
    #
    # This Pod needs 2 of the hardware-vendor.example/foo devices
    # and can only schedule onto a Node that's able to satisfy
    # that need.
    #
    # If the Node has more than 2 of those devices available, the
    # remainder would be available for other Pods to use.

Device plugin implementation

The general workflow of a device plugin includes the following steps:

  • Initialization. During this phase, the device plugin performs vendor-specific initialization and setup to make sure the devices are in a ready state.

  • The plugin starts a gRPC service, with a Unix socket under host path /var/lib/kubelet/device-plugins/, that implements the following interfaces:

        service DevicePlugin {
            // ListAndWatch returns a stream of lists of Devices.
            // Whenever a Device's state changes or a Device disappears,
            // ListAndWatch returns the new list.
            rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}

            // Allocate is called during container creation so that the Device
            // Plugin can run device-specific operations and instruct the kubelet
            // on the steps to make the Device available in the container.
            rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
        }

  • The plugin registers itself with the kubelet through the Unix socket at host path /var/lib/kubelet/device-plugins/kubelet.sock.

  • After successfully registering itself, the device plugin runs in serving mode, during which it keeps monitoring device health and reports back to the kubelet upon any device state changes. It is also responsible for serving Allocate gRPC requests. During Allocate, the device plugin may do device-specific preparation; for example, GPU cleanup or QRNG initialization. If the operations succeed, the device plugin returns an AllocateResponse that contains container runtime configurations for accessing the allocated devices. The kubelet passes this information to the container runtime.
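The "send the whole list again on every state change" behaviour of ListAndWatch in serving mode can be sketched with plain Go channels standing in for the gRPC stream. Everything here is an assumption made for illustration: the `device` and `healthEvent` types, the `listAndWatch` function, and the idea that vendor-specific monitoring delivers events on a channel. A real plugin would send a ListAndWatchResponse on the stream instead.

```go
package main

import "fmt"

// device is a hypothetical, simplified view of one managed device.
type device struct {
	ID      string
	Healthy bool
}

// healthEvent is a hypothetical notification from vendor-specific monitoring.
type healthEvent struct {
	ID      string
	Healthy bool
}

// listAndWatch sends the full device list once, then sends it again after
// every event that actually changes a device's health. It updates devs in
// place and closes out when the events channel is closed.
func listAndWatch(devs []device, events <-chan healthEvent, out chan<- []device) {
	defer close(out)
	snapshot := func() []device { return append([]device(nil), devs...) }

	out <- snapshot()
	for ev := range events {
		for i := range devs {
			if devs[i].ID == ev.ID && devs[i].Healthy != ev.Healthy {
				devs[i].Healthy = ev.Healthy
				out <- snapshot()
			}
		}
	}
}

func main() {
	events := make(chan healthEvent)
	out := make(chan []device)
	go listAndWatch([]device{{ID: "dev-0", Healthy: true}, {ID: "dev-1", Healthy: true}}, events, out)

	fmt.Println("initial:", <-out)
	events <- healthEvent{ID: "dev-1", Healthy: false}
	fmt.Println("updated:", <-out)
	close(events)
}
```

Sending a copy of the list each time keeps the receiver from observing later in-place updates.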

Handling kubelet restarts

A device plugin is expected to detect kubelet restarts and re-register itself with the new kubelet instance. In the current implementation, a new kubelet instance deletes all the existing Unix sockets under /var/lib/kubelet/device-plugins when it starts. A device plugin can monitor the deletion of its Unix socket and re-register itself upon such an event.
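One simple way to notice that the socket has been deleted is to poll for it. The Go sketch below uses only the standard library; `waitForRemoval` is a hypothetical helper, a temporary directory stands in for /var/lib/kubelet/device-plugins, and a regular file stands in for the Unix socket. A real plugin might prefer a filesystem notification mechanism over polling.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// waitForRemoval polls until path no longer exists or the timeout expires.
// It reports whether the path was observed to be gone.
func waitForRemoval(path string, interval, timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if _, err := os.Stat(path); os.IsNotExist(err) {
			return true
		}
		time.Sleep(interval)
	}
	return false
}

func main() {
	// Stand-in for the device plugin directory (assumption for the demo).
	dir, err := os.MkdirTemp("", "device-plugins")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Stand-in for the plugin's socket file.
	sock := filepath.Join(dir, "foo.sock")
	if err := os.WriteFile(sock, nil, 0o600); err != nil {
		panic(err)
	}

	// Simulate a restarting kubelet deleting the socket.
	go func() {
		time.Sleep(50 * time.Millisecond)
		os.Remove(sock)
	}()

	if waitForRemoval(sock, 10*time.Millisecond, 2*time.Second) {
		fmt.Println("socket removed: recreate it, restart serving, and register again")
	}
}
```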

Device plugin deployment

You can deploy a device plugin as a DaemonSet, as a package for your node’s operating system, or manually.

The canonical directory /var/lib/kubelet/device-plugins requires privileged access, so a device plugin must run in a privileged security context. If you’re deploying a device plugin as a DaemonSet, /var/lib/kubelet/device-plugins must be mounted as a Volume in the plugin’s PodSpec.

If you choose the DaemonSet approach, you can rely on Kubernetes to place the device plugin’s Pod onto Nodes, to restart the daemon Pod after failure, and to help automate upgrades.

API compatibility

Kubernetes device plugin support is in beta. The API may change in incompatible ways before stabilization. As a project, Kubernetes recommends that device plugin developers:

  • Watch for changes in future releases.
  • Support multiple versions of the device plugin API for backward/forward compatibility.

If you enable the DevicePlugins feature and run device plugins on nodes that need to be upgraded to a Kubernetes release with a newer device plugin API version, upgrade your device plugins to support both versions before upgrading these nodes. Taking that approach ensures that device allocations continue to function during the upgrade.
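One possible way for a plugin to support two API versions is to try them in order of preference and keep the first one the kubelet accepts. The Go sketch below shows only that control flow. The `registerFunc` hook, the `fakeKubelet` helper, and the version strings are assumptions for illustration; a real plugin would also need to serve the matching DevicePlugin service for whichever version is accepted.

```go
package main

import (
	"errors"
	"fmt"
)

// registerFunc is a hypothetical hook that attempts registration using one
// API version and returns an error if the kubelet rejects it.
type registerFunc func(version string) error

// registerPreferred tries each version in order and returns the first one
// that is accepted, or an error describing the last rejection.
func registerPreferred(versions []string, register registerFunc) (string, error) {
	err := errors.New("no versions configured")
	for _, v := range versions {
		if e := register(v); e != nil {
			err = fmt.Errorf("version %s rejected: %w", v, e)
			continue
		}
		return v, nil
	}
	return "", err
}

// fakeKubelet returns a registerFunc that accepts only the given versions.
// It exists purely so the sketch can run without a real kubelet.
func fakeKubelet(supported ...string) registerFunc {
	return func(version string) error {
		for _, s := range supported {
			if s == version {
				return nil
			}
		}
		return errors.New("unsupported API version")
	}
}

func main() {
	// Illustrative version names, newest first.
	versions := []string{"v1beta1", "v1alpha"}

	v, err := registerPreferred(versions, fakeKubelet("v1alpha"))
	fmt.Println("older node:", v, err)

	v, err = registerPreferred(versions, fakeKubelet("v1beta1"))
	fmt.Println("upgraded node:", v, err)
}
```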

Monitoring Device Plugin Resources

FEATURE STATE: Kubernetes v1.15 beta

In order to monitor resources provided by device plugins, monitoring agents need to be able to discover the set of devices that are in use on the node and obtain metadata to describe which container the metric should be associated with. Prometheus metrics exposed by device monitoring agents should follow the Kubernetes Instrumentation Guidelines, identifying containers using pod, namespace, and container Prometheus labels.

The kubelet provides a gRPC service to enable discovery of in-use devices, and to provide metadata for these devices:

    // PodResourcesLister is a service provided by the kubelet that provides information about the
    // node resources consumed by pods and containers on the node
    service PodResourcesLister {
        rpc List(ListPodResourcesRequest) returns (ListPodResourcesResponse) {}
    }

The gRPC service is served over a Unix socket at /var/lib/kubelet/pod-resources/kubelet.sock. Monitoring agents for device plugin resources can be deployed as a daemon, or as a DaemonSet. The canonical directory /var/lib/kubelet/pod-resources requires privileged access, so monitoring agents must run in a privileged security context. If a device monitoring agent is running as a DaemonSet, /var/lib/kubelet/pod-resources must be mounted as a Volume in the agent’s PodSpec.
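Once a monitoring agent has a List response, it needs to map each in-use device to the pod, namespace, and container labels for its metrics. The Go sketch below shows that mapping over simplified, hypothetical structs; they only approximate the shape of the response and are not the generated API types, and the pod and device names are made up.

```go
package main

import "fmt"

// The types below are simplified stand-ins (assumptions) for the data a
// monitoring agent receives from the PodResourcesLister service.
type containerDevices struct {
	ResourceName string
	DeviceIDs    []string
}

type containerResources struct {
	Name    string
	Devices []containerDevices
}

type podResources struct {
	Name       string
	Namespace  string
	Containers []containerResources
}

// deviceKey identifies one device of one resource type on the node.
type deviceKey struct {
	ResourceName string
	DeviceID     string
}

// labelsByDevice maps every in-use device to the pod, namespace, and
// container labels that a metric for that device should carry.
func labelsByDevice(pods []podResources) map[deviceKey]map[string]string {
	out := map[deviceKey]map[string]string{}
	for _, p := range pods {
		for _, c := range p.Containers {
			for _, d := range c.Devices {
				for _, id := range d.DeviceIDs {
					out[deviceKey{d.ResourceName, id}] = map[string]string{
						"pod":       p.Name,
						"namespace": p.Namespace,
						"container": c.Name,
					}
				}
			}
		}
	}
	return out
}

func main() {
	pods := []podResources{{
		Name:      "demo-pod",
		Namespace: "default",
		Containers: []containerResources{{
			Name: "demo-container-1",
			Devices: []containerDevices{{
				ResourceName: "hardware-vendor.example/foo",
				DeviceIDs:    []string{"dev-0", "dev-1"},
			}},
		}},
	}}
	for key, labels := range labelsByDevice(pods) {
		fmt.Println(key.ResourceName, key.DeviceID, labels)
	}
}
```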

Support for the “PodResources service” requires the KubeletPodResources feature gate to be enabled. It is enabled by default starting with Kubernetes 1.15.

Device plugin examples

Here are some examples of device plugin implementations:

What's next
