Advanced DaemonSet

This controller enhances the rolling update workflow of the Kubernetes DaemonSet controller in several aspects, such as partition, selector, and pause strategies.

If you are not familiar with the Kubernetes DaemonSet, we strongly recommend reading its documentation before learning about Advanced DaemonSet.

Note that the Advanced DaemonSet CRD extends the schema of the default DaemonSet with newly added fields. The CRD kind name is still DaemonSet. This is intentional, so that users can easily migrate workloads from the default DaemonSet to the Advanced DaemonSet. For example, after installing Kruise manager, one may simply change the value of apiVersion in the DaemonSet YAML file from apps/v1 to apps.kruise.io/v1alpha1.

    - apiVersion: apps/v1
    + apiVersion: apps.kruise.io/v1alpha1
      kind: DaemonSet
      metadata:
        name: sample-ds
      spec:
        #...

Enhanced strategies

These new fields have been added into RollingUpdateDaemonSet:

      const (
    +     // StandardRollingUpdateType replace the old daemons by new ones using rolling update i.e replace them on each node one after the other.
    +     // this is the default type for RollingUpdate.
    +     StandardRollingUpdateType RollingUpdateType = "Standard"
    +     // InplaceRollingUpdateType update container image without killing the pod if possible.
    +     InplaceRollingUpdateType RollingUpdateType = "InPlaceIfPossible"
      )
      // Spec to control the desired behavior of daemon set rolling update.
      type RollingUpdateDaemonSet struct {
    +     // Type is to specify which kind of rollingUpdate.
    +     Type RollingUpdateType `json:"rollingUpdateType,omitempty" protobuf:"bytes,1,opt,name=rollingUpdateType"`
          // ...
          MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty" protobuf:"bytes,2,opt,name=maxUnavailable"`
          // ...
          MaxSurge *intstr.IntOrString `json:"maxSurge,omitempty" protobuf:"bytes,7,opt,name=maxSurge"`
    +     // A label query over nodes that are managed by the daemon set RollingUpdate.
    +     // Must match in order to be controlled.
    +     // It must match the node's labels.
    +     Selector *metav1.LabelSelector `json:"selector,omitempty" protobuf:"bytes,3,opt,name=selector"`
    +     // The number of DaemonSet pods remained to be old version.
    +     // Default value is 0.
    +     // Maximum value is status.DesiredNumberScheduled, which means no pod will be updated.
    +     // +optional
    +     Partition *int32 `json:"partition,omitempty" protobuf:"varint,4,opt,name=partition"`
    +     // Indicates that the daemon set is paused and will not be processed by the
    +     // daemon set controller.
    +     // +optional
    +     Paused *bool `json:"paused,omitempty" protobuf:"varint,5,opt,name=paused"`
      }

Type for rolling update

Advanced DaemonSet has a rollingUpdateType field in spec.updateStrategy.rollingUpdate, which controls how the rolling update is performed.

  • Standard (default): the controller updates daemon Pods by recreating them, which is the same behavior as the upstream DaemonSet. You can use maxUnavailable or maxSurge to control the order in which old Pods are deleted and new Pods are created.
  • InPlaceIfPossible: the controller tries to update Pods in place instead of recreating them whenever possible. See the concept documentation for more details on in-place update. Note that with this type you can only use maxUnavailable; maxSurge is not available.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # ...
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          rollingUpdateType: Standard
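The restriction on maxSurge can be checked before a manifest is applied. The following is a minimal sketch of such a client-side check. The `rollingUpdate` struct and the `checkStrategy` helper are hypothetical and not part of Kruise, the surge value is modelled as a plain integer rather than `*intstr.IntOrString`, and how Kruise itself reacts to maxSurge in InPlaceIfPossible mode is not stated here.

```go
package main

import (
	"errors"
	"fmt"
)

// RollingUpdateType mirrors the string type from the API excerpt above.
type RollingUpdateType string

const (
	StandardRollingUpdateType RollingUpdateType = "Standard"
	InplaceRollingUpdateType  RollingUpdateType = "InPlaceIfPossible"
)

// rollingUpdate is a simplified, hypothetical stand-in for
// RollingUpdateDaemonSet. MaxSurge is a plain int here; 0 means "not set".
type rollingUpdate struct {
	Type           RollingUpdateType
	MaxUnavailable int
	MaxSurge       int
}

// checkStrategy is a hypothetical helper that rejects combinations the
// text above describes as unsupported.
func checkStrategy(ru rollingUpdate) error {
	switch ru.Type {
	case "", StandardRollingUpdateType:
		// An empty type falls back to the default, Standard.
		return nil
	case InplaceRollingUpdateType:
		if ru.MaxSurge != 0 {
			return errors.New("maxSurge cannot be used with InPlaceIfPossible")
		}
		return nil
	default:
		return fmt.Errorf("unknown rollingUpdateType %q", ru.Type)
	}
}

func main() {
	fmt.Println(checkStrategy(rollingUpdate{Type: InplaceRollingUpdateType, MaxSurge: 1}))
	fmt.Println(checkStrategy(rollingUpdate{Type: StandardRollingUpdateType, MaxSurge: 1}))
}
```

A check like this could run in a CI step or an admission-style linter, so that an unsupported combination is caught before it reaches the cluster.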

Selector for rolling update

The selector restricts the rolling update to Pods on nodes whose labels match it, which lets users update specific nodes first.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # ...
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          selector:
            matchLabels:
              nodeType: canary
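To illustrate which nodes a matchLabels selector picks, here is a minimal sketch of the matching rule: a node matches when every key/value pair in matchLabels is present in its labels. The helper names and the node names are hypothetical, and only matchLabels is handled (matchExpressions is ignored). A real controller would more likely rely on the label selector helpers in the Kubernetes apimachinery packages.

```go
package main

import (
	"fmt"
	"sort"
)

// matchesSelector reports whether every key/value pair in matchLabels is
// present in nodeLabels. In this sketch an empty matchLabels matches every node.
func matchesSelector(matchLabels, nodeLabels map[string]string) bool {
	for key, want := range matchLabels {
		if got, ok := nodeLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// nodesToUpdate returns the sorted names of the nodes whose labels match.
// nodes maps a node name to that node's labels.
func nodesToUpdate(matchLabels map[string]string, nodes map[string]map[string]string) []string {
	var names []string
	for name, labels := range nodes {
		if matchesSelector(matchLabels, labels) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Hypothetical nodes, used only for illustration.
	nodes := map[string]map[string]string{
		"node-a": {"nodeType": "canary"},
		"node-b": {"nodeType": "stable"},
		"node-c": {},
	}
	selector := map[string]string{"nodeType": "canary"}
	fmt.Println(nodesToUpdate(selector, nodes)) // [node-a]
}
```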

Partition for rolling update and scaling up

This strategy controls how many Pods are updated. partition is the number of DaemonSet Pods that should remain at the old version. It defaults to 0, and its maximum value is status.DesiredNumberScheduled, which means no Pod will be updated.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # ...
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          partition: 10

If you add the annotation daemonset.kruise.io/progressive-create-pod: "true" to an Advanced DaemonSet, partition also controls the number of Pods created when scaling up.
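The arithmetic behind partition during a rolling update can be sketched as follows: with N desired Pods and a partition of P, at most N - P Pods move to the new version. The helper name is hypothetical, clamping a negative partition to 0 is an assumption of this sketch, and the scale-up behavior enabled by the annotation is not modelled.

```go
package main

import "fmt"

// podsAllowedToUpdate returns how many Pods may be moved to the new version,
// given status.DesiredNumberScheduled and the optional partition field.
// Hypothetical helper for illustration; not part of Kruise.
func podsAllowedToUpdate(desiredNumberScheduled int32, partition *int32) int32 {
	var keepOld int32 // partition defaults to 0 when unset
	if partition != nil {
		keepOld = *partition
	}
	if keepOld < 0 {
		keepOld = 0 // assumption: treat negative values as 0
	}
	if keepOld >= desiredNumberScheduled {
		return 0 // partition at or above the maximum: no Pod is updated
	}
	return desiredNumberScheduled - keepOld
}

func main() {
	partition := int32(10)
	fmt.Println(podsAllowedToUpdate(12, &partition)) // 2
	fmt.Println(podsAllowedToUpdate(12, nil))        // 12
	fmt.Println(podsAllowedToUpdate(5, &partition))  // 0
}
```

Lowering partition step by step (for example 10, then 5, then 0) therefore widens the rollout gradually until every Pod runs the new version.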

Paused for rolling update

paused indicates that Pod updating is paused: the controller will not update Pods, but will still maintain the number of replicas.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # ...
      updateStrategy:
        rollingUpdate:
          paused: true

Lifecycle hook

FEATURE STATE: Kruise v1.1.0

This is similar to the lifecycle hook of CloneSet.

Currently, Advanced DaemonSet supports only the PreDelete hook, which lets users do something (for example, check node resources) before a Pod is deleted.

    type LifecycleStateType string

    // Lifecycle contains the hooks for Pod lifecycle.
    type Lifecycle struct {
        // PreDelete is the hook before Pod to be deleted.
        PreDelete *LifecycleHook `json:"preDelete,omitempty"`
    }

    type LifecycleHook struct {
        LabelsHandler     map[string]string `json:"labelsHandler,omitempty"`
        FinalizersHandler []string          `json:"finalizersHandler,omitempty"`
    }

Examples:

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # define with label
      lifecycle:
        preDelete:
          labelsHandler:
            example.io/block-deleting: "true"

  • When Advanced DaemonSet deletes a Pod (including scale-in and recreate update):
    • It deletes the Pod directly if there is no lifecycle hook definition or the Pod does not match the preDelete hook.
    • Otherwise, it first updates the Pod to the PreparingDelete state and waits for the user controller to remove the label/finalizer, so that the Pod no longer matches the preDelete hook.

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        example.io/block-deleting: "true" # the pod is hooked by PreDelete hook label
        lifecycle.apps.kruise.io/state: PreparingDelete # so we update it to `PreparingDelete` state and wait for user controller to do something and remove the label

Example for user controller logic

As in the YAML example above, first define the example.io/block-deleting label in both the template and the lifecycle of the Advanced DaemonSet.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      template:
        metadata:
          labels:
            example.io/block-deleting: "true"
      # ...
      lifecycle:
        preDelete:
          labelsHandler:
            example.io/block-deleting: "true"

User controller logic:

  • For a Pod in the PreparingDelete state, check whether its Node exists, do something (for example, reserve resources), and then remove the label.
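The user controller step above could be sketched as follows. This is an illustration under stated assumptions, not a real controller: the `pod` struct is a simplified stand-in for the Kubernetes Pod type, `handlePod`, `nodeExists`, and `reserve` are hypothetical names, and removing the label is done in memory, whereas a real controller would patch the Pod through the API server. The sketch also assumes that when the Node no longer exists there is nothing to reserve, so the label is removed right away.

```go
package main

import "fmt"

const (
	stateLabel      = "lifecycle.apps.kruise.io/state"
	preparingDelete = "PreparingDelete"
	hookLabel       = "example.io/block-deleting" // label used by the preDelete hook
)

// pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
type pod struct {
	Name     string
	NodeName string
	Labels   map[string]string
}

// handlePod runs one reconcile step for a Pod. It returns true when the hook
// label was removed, which lets Advanced DaemonSet go on to delete the Pod.
func handlePod(p *pod, nodeExists func(name string) bool, reserve func(node string) error) (bool, error) {
	if p.Labels[stateLabel] != preparingDelete {
		return false, nil // the Pod is not waiting on the PreDelete hook
	}
	if _, hooked := p.Labels[hookLabel]; !hooked {
		return false, nil // the label is already gone
	}
	if nodeExists(p.NodeName) {
		if err := reserve(p.NodeName); err != nil {
			return false, err // keep the label so the step can be retried
		}
	}
	delete(p.Labels, hookLabel) // unblock deletion
	return true, nil
}

func main() {
	p := &pod{
		Name:     "sample-ds-x",
		NodeName: "node-a",
		Labels:   map[string]string{stateLabel: preparingDelete, hookLabel: "true"},
	}
	nodeExists := func(name string) bool { return name == "node-a" }
	reserve := func(node string) error {
		fmt.Println("reserving resources on", node)
		return nil
	}
	released, err := handlePod(p, nodeExists, reserve)
	fmt.Println(released, err, p.Labels)
}
```

Returning an error without removing the label keeps the Pod in PreparingDelete, so a failed reservation blocks deletion until a later retry succeeds.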