Manipulate Kubernetes Resources as Part of a Pipeline

Overview of using the SDK to manipulate Kubernetes resources dynamically as steps of the pipeline

This page describes how to manipulate Kubernetes resources through individual Kubeflow Pipelines components during a pipeline. Users may handle any Kubernetes resource, while creating Persistent Volume Claims and Volume Snapshots is made easy in the common case.

Kubernetes Resources

ResourceOp

This class represents a step of the pipeline which manipulates Kubernetes resources. It implements Argo’s resource template.

This feature allows users to perform some action (get, create, apply, delete, replace, patch) on Kubernetes resources. Users are able to set conditions that denote the success or failure of the step undertaking that action.

Link to the corresponding Python library.

Arguments

Only the most significant arguments are presented in this section. For more information, refer to the aforementioned link to the library.

  • k8s_resource: Definition of the Kubernetes resource. (required)
  • action: Action to be performed (defaults to create).
  • merge_strategy: Merge strategy when action is patch. (optional)
  • success_condition: Condition to denote success of the step once it is true. (optional)
  • failure_condition: Condition to denote failure of the step once it is true. (optional)
  • attribute_outputs: Similar to file_outputs of kfp.dsl.ContainerOp. Maps output parameter names to JSON paths in the Kubernetes object. More on that in the following section. (optional)
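
As a rough illustration of how these arguments fit together, the sketch below creates a Job and decides the step's success or failure from the Job's status. It assumes the KFP v1 SDK (kfp<2), that k8s_resource accepts a plain dict manifest, and that the condition strings follow the syntax of Argo's resource template. The helper build_job_manifest, the image, and the command are hypothetical.

```python
def build_job_manifest(name_prefix: str = "pi-") -> dict:
    """Return a plain-dict Job manifest (hypothetical example workload)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"generateName": name_prefix},
        "spec": {
            "backoffLimit": 2,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "pi",
                        "image": "perl",
                        "command": ["perl", "-Mbignum=bpi", "-wle", "print bpi(100)"],
                    }],
                },
            },
        },
    }


def job_pipeline():
    # Imported here so the manifest helper can be inspected without the SDK.
    # Assumption: KFP v1 SDK (kfp<2), which provides dsl.ResourceOp.
    from kfp import dsl

    dsl.ResourceOp(
        name="create-job",
        k8s_resource=build_job_manifest(),
        action="create",
        # The step succeeds once the Job reports a succeeded pod ...
        success_condition="status.succeeded > 0",
        # ... and fails once the Job reports a failed pod.
        failure_condition="status.failed > 0",
        attribute_outputs={"name": "{.metadata.name}"},
    )
```

In a real project the function would typically be decorated with dsl.pipeline and compiled with the SDK's compiler.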

Outputs

ResourceOps can produce output parameters. They can output field values of the resource which is being manipulated. For example:

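    import kfp
    from kubernetes import client as kubernetes_client

    job = kubernetes_client.V1Job(...)
    # Create the Job and expose its name as an output parameter.
    rop = kfp.dsl.ResourceOp(
        name="create-job",
        k8s_resource=job,
        action="create",
        attribute_outputs={"name": "{.metadata.name}"}
    )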

By default, ResourceOps output the resource’s name as well as the whole resource specification.
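
To make the mapping from output parameter names to JSON paths more concrete, here is a toy sketch in plain Python. It is not how the SDK or the workflow engine evaluates paths: the real lookup happens in the cluster against the manipulated object. The helper resolve_dotted_path is hypothetical and handles only simple dotted paths, and the sample object is made up.

```python
def resolve_dotted_path(obj: dict, path: str):
    """Resolve a simple '{.a.b.c}' path against nested dicts (illustration only)."""
    keys = path.strip("{}").lstrip(".").split(".")
    value = obj
    for key in keys:
        value = value[key]
    return value


# A made-up excerpt of a Job object as it might look after creation.
created_job = {
    "metadata": {"name": "create-job-x7k2p", "namespace": "kubeflow"},
    "status": {"active": 1},
}

# Same shape as the attribute_outputs argument: parameter name -> JSON path.
attribute_outputs = {"name": "{.metadata.name}", "active": "{.status.active}"}

outputs = {
    param: resolve_dotted_path(created_job, path)
    for param, path in attribute_outputs.items()
}
print(outputs)  # {'name': 'create-job-x7k2p', 'active': 1}
```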

Samples

For a better understanding, refer to the following samples: 1


Persistent Volume Claims (PVCs)

Request the creation of PVC instances simply and quickly.

VolumeOp

A ResourceOp specialized in PVC creation.

Link to the corresponding Python library.

Arguments

The following arguments are an extension to the ResourceOp arguments. If a k8s_resource is passed, then none of the following should be provided.

  • resource_name: The name of the resource which will be created. This string will be prepended with the workflow name. This may contain PipelineParams. (required)
  • size: The requested size for the PVC. This may contain PipelineParams. (required)
  • storage_class: The storage class to be used. This may contain PipelineParams. (optional)
  • modes: The accessModes of the PVC (defaults to RWM). Check this documentation for further information. The user may find the following modes built-in:
    • VOLUME_MODE_RWO: ["ReadWriteOnce"]
    • VOLUME_MODE_RWM: ["ReadWriteMany"]
    • VOLUME_MODE_ROM: ["ReadOnlyMany"]
  • annotations: Annotations to be patched in the PVC. These may contain PipelineParams. (optional)
  • data_source: It is used to create a PVC from a VolumeSnapshot. It can be either a string or a V1TypedLocalObjectReference, and may contain PipelineParams. (Alpha feature, optional)
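
The optional arguments above might be combined as in the following sketch. It assumes the KFP v1 SDK (kfp<2) and that the built-in mode constants are exposed on kfp.dsl. The storage class name, the annotation key, and the resource name are hypothetical and depend on the cluster.

```python
def volume_pipeline():
    # Imported here so the module loads even where the SDK is not installed.
    # Assumption: KFP v1 SDK (kfp<2), which provides dsl.VolumeOp.
    from kfp import dsl

    dsl.VolumeOp(
        name="create-volume",
        resource_name="training-data",      # prefixed with the workflow name
        size="10Gi",
        storage_class="standard",           # hypothetical storage class
        modes=dsl.VOLUME_MODE_RWO,          # ["ReadWriteOnce"] instead of the RWM default
        annotations={"example.com/owner": "data-team"},  # hypothetical annotation
    )
```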

Outputs

In addition to the whole specification of the resource and its name (ResourceOp defaults), a VolumeOp also outputs the storage size of the bound Persistent Volume (as step.outputs["size"]). However, this may be empty if the storage provisioner has a WaitForFirstConsumer binding mode. This value, if not empty, is always greater than or equal to the requested size.

Useful information

  • VolumeOp steps have a .volume attribute which is a PipelineVolume referencing the created PVC. More information on Pipeline Volumes in the following section.
  • A ContainerOp has a pvolumes argument in its constructor. This is a dictionary with mount paths as keys and volumes as values and functions similarly to file_outputs (which can then be used as op.outputs["key"] or op.output). For example:
      from kfp import dsl

      vop = dsl.VolumeOp(
          name="volume_creation",
          resource_name="mypvc",
          size="1Gi"
      )
      step1 = dsl.ContainerOp(
          name="step1",
          # ... (other arguments elided in the original)
          pvolumes={"/mnt": vop.volume}  # Implies execution after vop
      )
      step2 = dsl.ContainerOp(
          name="step2",
          # ... (other arguments elided in the original)
          pvolumes={"/data": step1.pvolume,  # Implies execution after step1
                    "/mnt": dsl.PipelineVolume(pvc="existing-pvc")}
      )
      step3 = dsl.ContainerOp(
          name="step3",
          # ... (other arguments elided in the original)
          pvolumes={"/common": step2.pvolumes["/mnt"]}  # Implies execution after step2
      )

PipelineVolume

Reference Kubernetes volumes easily, mount them, and express dependencies through them.

A PipelineVolume is essentially a Kubernetes Volume (*) carrying dependencies, supplemented with an .after() method extending them. Those dependencies can then be parsed properly by a ContainerOp, when consumed in the pvolumes argument or the add_pvolumes() method, to extend the dependencies of that step.

Link to the corresponding Python library.

(*) Inherits from the V1Volume class of the Kubernetes Python client.

Arguments

The PipelineVolume constructor accepts all arguments the V1Volume constructor does. However, name can be omitted and a pseudo-random name for that volume is generated instead.

Extra arguments:

  • pvc: Name of an existing PVC to be referenced by this PipelineVolume. This value can be a PipelineParam.
  • volume: Initialize a new PipelineVolume instance from an existing V1Volume, or its inherited types (e.g. PipelineVolume).
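
A minimal sketch of referencing an existing PVC and adding an explicit dependency through .after() could look as follows. It assumes the KFP v1 SDK (kfp<2) and that .after() yields a volume carrying the extra dependency. The PVC name, images, and commands are hypothetical.

```python
def existing_pvc_pipeline():
    # Imported here so the module loads even where the SDK is not installed.
    # Assumption: KFP v1 SDK (kfp<2), which provides dsl.PipelineVolume.
    from kfp import dsl

    prepare = dsl.ContainerOp(
        name="prepare",
        image="busybox",
        command=["sh", "-c", "echo preparing"],
    )

    # Reference a PVC that already exists in the cluster (hypothetical name).
    shared = dsl.PipelineVolume(pvc="existing-pvc")

    # Mounting shared.after(prepare) should make this step run after
    # `prepare`, even though `prepare` never mounted the volume itself.
    dsl.ContainerOp(
        name="consume",
        image="busybox",
        command=["sh", "-c", "ls /data"],
        pvolumes={"/data": shared.after(prepare)},
    )
```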

Samples

For a better understanding, refer to the following samples: 1, 2, 3, 4


Volume Snapshots

Request the creation of Volume Snapshot instances simply and quickly.

VolumeSnapshotOp

A ResourceOp specialized in Volume Snapshot creation.

Link to the corresponding Python library.

NOTE: You should check if your Kubernetes cluster admin has Volume Snapshots enabled in your cluster.

Arguments

The following arguments are an extension to the ResourceOp arguments. If a k8s_resource is passed, then none of the following may be provided.

  • resource_name: The name of the resource which will be created. This string will be prepended with the workflow name. This may contain PipelineParams. (required)
  • pvc: The name of the PVC to be snapshotted. This may contain PipelineParams. (optional)
  • snapshot_class: The snapshot storage class to be used. This may contain PipelineParams. (optional)
  • volume: An instance of a V1Volume, or its inherited type (e.g. PipelineVolume). This may contain PipelineParams. (optional)
  • annotations: Annotations to be patched in the VolumeSnapshot. These may contain PipelineParams. (optional)

NOTE: One of the pvc or volume needs to be provided.

Outputs

In addition to the whole specification of the resource and its name (ResourceOp defaults), a VolumeSnapshotOp also outputs the restoreSize of the bound VolumeSnapshot (as step.outputs["size"]). This is the minimum size for a PVC clone of that snapshot.

Useful information

VolumeSnapshotOp steps have a .snapshot attribute which is a V1TypedLocalObjectReference. This can be passed as a data_source to create a PVC out of that VolumeSnapshot. The user may otherwise use the step.outputs["name"] as data_source.
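
The snapshot-then-clone flow described above can be sketched as follows. It assumes the KFP v1 SDK (kfp<2) and a cluster with Volume Snapshots enabled. The resource names, image, and command are hypothetical.

```python
def snapshot_and_clone_pipeline():
    # Imported here so the module loads even where the SDK is not installed.
    # Assumption: KFP v1 SDK (kfp<2), which provides dsl.VolumeSnapshotOp.
    from kfp import dsl

    vop = dsl.VolumeOp(
        name="create-volume",
        resource_name="vol1",
        size="1Gi",
    )

    write = dsl.ContainerOp(
        name="write",
        image="busybox",
        command=["sh", "-c", "echo hello > /data/file"],
        pvolumes={"/data": vop.volume},
    )

    # Snapshot the volume as it is after `write` has finished.
    snap = dsl.VolumeSnapshotOp(
        name="snapshot",
        resource_name="snap1",
        volume=write.pvolume,
    )

    # Create a new PVC from the snapshot; its restoreSize is reused as the
    # requested size so that the clone is large enough.
    dsl.VolumeOp(
        name="clone-volume",
        resource_name="vol2",
        data_source=snap.snapshot,
        size=snap.outputs["size"],
    )
```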

Samples

For a better understanding, refer to the following samples: 1, 2

Next steps