Using Device Plug-ins

What Device Plug-ins Do

Device plug-ins let you use a particular device type (GPU, InfiniBand, or other similar computing resources that require vendor-specific initialization and setup) in your OKD pod without writing custom code. A device plug-in provides a consistent and portable way to consume hardware devices across clusters. It supports these devices through an extension mechanism that makes them available to containers, provides health checks for them, and shares them securely.

OKD supports the device plug-in API, but the device plug-in containers are supported by individual vendors.

A device plug-in is a gRPC service that runs on the nodes (external to atomic-openshift-node.service) and is responsible for managing specific hardware resources. Any device plug-in must support the following remote procedure calls (RPCs):

  service DevicePlugin {
    // GetDevicePluginOptions returns options to be communicated with Device
    // Manager
    rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}

    // ListAndWatch returns a stream of List of Devices
    // Whenever a Device state changes or a Device disappears, ListAndWatch
    // returns the new list
    rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}

    // Allocate is called during container creation so that the Device
    // Plug-in can run device specific operations and instruct Kubelet
    // of the steps to make the Device available in the container
    rpc Allocate(AllocateRequest) returns (AllocateResponse) {}

    // PreStartContainer is called, if indicated by Device Plug-in during
    // registration phase, before each container start. Device plug-in
    // can run device specific operations such as resetting the device
    // before making devices available to the container
    rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
  }
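
To make the ListAndWatch contract more concrete, the following Go sketch models its behavior without gRPC: the full device list is sent once at startup and again whenever a device changes health or disappears. This is only an illustration under stated assumptions. The Device and Inventory types and their methods are hypothetical and are not part of the device plug-in API, and a buffered channel stands in for the response stream.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Device is a hypothetical, simplified stand-in for a device entry
// in a ListAndWatch response.
type Device struct {
	ID      string
	Healthy bool
}

// Inventory tracks devices and publishes a full snapshot of the list
// on every change, the way ListAndWatch resends the whole list.
type Inventory struct {
	mu      sync.Mutex
	devices map[string]bool // device ID -> healthy
	updates chan []Device   // stands in for the gRPC response stream
}

// NewInventory registers the given devices as healthy and queues the
// initial list. The buffer size of 16 is arbitrary; a send blocks if
// the consumer falls that far behind.
func NewInventory(ids ...string) *Inventory {
	inv := &Inventory{
		devices: make(map[string]bool),
		updates: make(chan []Device, 16),
	}
	for _, id := range ids {
		inv.devices[id] = true
	}
	inv.updates <- inv.snapshot()
	return inv
}

// snapshot returns the current device list sorted by ID.
// Callers must hold inv.mu, except during construction.
func (inv *Inventory) snapshot() []Device {
	list := make([]Device, 0, len(inv.devices))
	for id, healthy := range inv.devices {
		list = append(list, Device{ID: id, Healthy: healthy})
	}
	sort.Slice(list, func(i, j int) bool { return list[i].ID < list[j].ID })
	return list
}

// SetHealth records a health change and publishes the new list.
// Unknown devices and no-op changes publish nothing.
func (inv *Inventory) SetHealth(id string, healthy bool) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	if cur, ok := inv.devices[id]; !ok || cur == healthy {
		return
	}
	inv.devices[id] = healthy
	inv.updates <- inv.snapshot()
}

// Remove drops a device that has disappeared and publishes the new list.
func (inv *Inventory) Remove(id string) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	if _, ok := inv.devices[id]; !ok {
		return
	}
	delete(inv.devices, id)
	inv.updates <- inv.snapshot()
}

// Updates exposes the stream of device lists to a consumer.
func (inv *Inventory) Updates() <-chan []Device { return inv.updates }

func main() {
	inv := NewInventory("gpu-0", "gpu-1")
	inv.SetHealth("gpu-1", false) // a device becomes unhealthy
	inv.Remove("gpu-0")           // a device disappears
	close(inv.updates)            // end the stream so the loop below terminates
	for list := range inv.Updates() {
		fmt.Println(list)
	}
}
```

Sending the complete list each time, rather than a delta, keeps the consumer simple: it can replace its view of the devices wholesale on every message.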

Example Device Plug-ins

For a reference implementation, see the stub device plug-in in the Device Manager code: vendor/k8s.io/kubernetes/pkg/kubelet/cm/deviceplugin/device_plugin_stub.go.

Methods for Deploying a Device Plug-in

  • DaemonSets are the recommended approach for device plug-in deployments.

  • On startup, the device plug-in tries to create a UNIX domain socket in /var/lib/kubelet/device-plugins/ on the node to serve RPCs from the Device Manager.

  • Because device plug-ins must manage hardware resources, access the host file system, and create sockets, they must run in a privileged security context.

  • For specific deployment steps, see the documentation for each device plug-in implementation.