Creating Go-based Operators

Operator developers can take advantage of Go programming language support in the Operator SDK to build an example Go-based Operator for Memcached, a distributed key-value store, and manage its lifecycle.

Kubebuilder is embedded into the Operator SDK as the scaffolding solution for Go-based Operators.

Creating a Go-based Operator using the Operator SDK

The Operator SDK makes it easier to build Kubernetes-native applications, a process that can require deep, application-specific operational knowledge. The SDK not only lowers that barrier but also helps reduce the amount of boilerplate code needed for many common management capabilities, such as metering or monitoring.

This procedure walks through an example of creating a simple Memcached Operator using tools and libraries provided by the SDK.

Prerequisites

  • Operator SDK v0.19.4 CLI installed on the development workstation

  • Operator Lifecycle Manager (OLM) installed on a Kubernetes-based cluster (v1.8 or above to support the apps/v1beta2 API group), for example OKD 4.6

  • Access to the cluster using an account with cluster-admin permissions

  • OpenShift CLI (oc) v4.6+ installed

Procedure

  1. Create an Operator project:

    1. Create a directory for the project:

      $ mkdir -p $HOME/projects/memcached-operator
    2. Change to the directory:

      $ cd $HOME/projects/memcached-operator
    3. Activate support for Go modules:

      $ export GO111MODULE=on
    4. Run the operator-sdk init command to initialize the project:

      $ operator-sdk init \
          --domain=example.com \
          --repo=github.com/example-inc/memcached-operator

      The operator-sdk init command uses the go.kubebuilder.io/v2 plug-in by default.

  2. Update your Operator to use supported images:

    1. In the project root-level Dockerfile, change the default runner image reference from:

      FROM gcr.io/distroless/static:nonroot

      to:

      FROM registry.access.redhat.com/ubi8/ubi-minimal:latest
    2. Depending on the Go project version, your Dockerfile might contain a USER 65532:65532 or USER nonroot:nonroot directive. In either case, remove the line, as it is not required by the supported runner image.

    3. In the config/default/manager_auth_proxy_patch.yaml file, change the image value from:

      gcr.io/kubebuilder/kube-rbac-proxy:<tag>

      to use the supported image:

      registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.6
  3. Update the test target in your Makefile to install dependencies required during later builds by replacing the following lines:

    Existing test target

    test: generate fmt vet manifests
    	go test ./... -coverprofile cover.out

    With the following lines:

    Updated test target

    ENVTEST_ASSETS_DIR=$(shell pwd)/testbin
    test: manifests generate fmt vet ## Run tests.
    	mkdir -p ${ENVTEST_ASSETS_DIR}
    	test -f ${ENVTEST_ASSETS_DIR}/setup-envtest.sh || curl -sSLo ${ENVTEST_ASSETS_DIR}/setup-envtest.sh https://raw.githubusercontent.com/kubernetes-sigs/controller-runtime/v0.7.2/hack/setup-envtest.sh
    	source ${ENVTEST_ASSETS_DIR}/setup-envtest.sh; fetch_envtest_tools $(ENVTEST_ASSETS_DIR); setup_envtest_env $(ENVTEST_ASSETS_DIR); go test ./... -coverprofile cover.out
  4. Create a custom resource definition (CRD) API and controller:

    1. Run the following command to create an API with group cache, version v1, and kind Memcached:

      $ operator-sdk create api \
          --group=cache \
          --version=v1 \
          --kind=Memcached
    2. When prompted, enter y for creating both the resource and controller:

      Create Resource [y/n]
      y
      Create Controller [y/n]
      y

      Example output

      Writing scaffold for you to edit...
      api/v1/memcached_types.go
      controllers/memcached_controller.go
      ...

      This process generates the Memcached resource API at api/v1/memcached_types.go and the controller at controllers/memcached_controller.go.

    3. Modify the Go type definitions at api/v1/memcached_types.go to have the following spec and status:

      // MemcachedSpec defines the desired state of Memcached
      type MemcachedSpec struct {
      	// +kubebuilder:validation:Minimum=0
      	// Size is the size of the memcached deployment
      	Size int32 `json:"size"`
      }

      // MemcachedStatus defines the observed state of Memcached
      type MemcachedStatus struct {
      	// Nodes are the names of the memcached pods
      	Nodes []string `json:"nodes"`
      }
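The json struct tags on these types decide the field names that appear in a Memcached CR manifest. As a minimal sketch using only the standard library, the following program copies the two types and prints their JSON form. The encode helper and the pod names are hypothetical; in a real project the Kubernetes API machinery performs the serialization.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MemcachedSpec mirrors the spec type defined in api/v1/memcached_types.go.
type MemcachedSpec struct {
	Size int32 `json:"size"`
}

// MemcachedStatus mirrors the status type defined in api/v1/memcached_types.go.
type MemcachedStatus struct {
	Nodes []string `json:"nodes"`
}

// encode is a hypothetical helper that marshals a value to a JSON string.
// It panics on failure, which cannot happen for these simple types.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// The Size field is written under the lower-case key "size".
	fmt.Println(encode(MemcachedSpec{Size: 3}))
	// The Nodes field is written as a JSON array under "nodes".
	fmt.Println(encode(MemcachedStatus{Nodes: []string{"pod-a", "pod-b"}}))
	// Without omitempty, a nil slice is encoded as null rather than [].
	fmt.Println(encode(MemcachedStatus{}))
}
```

Because the nodes tag has no omitempty option, an unset status serializes as "nodes": null.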
    4. Add the +kubebuilder:subresource:status marker to add a status subresource to the CRD manifest:

      // Memcached is the Schema for the memcacheds API
      // +kubebuilder:subresource:status (1)
      type Memcached struct {
      	metav1.TypeMeta   `json:",inline"`
      	metav1.ObjectMeta `json:"metadata,omitempty"`

      	Spec   MemcachedSpec   `json:"spec,omitempty"`
      	Status MemcachedStatus `json:"status,omitempty"`
      }

      (1) Add this line.

      This enables the controller to update the CR status without changing the rest of the CR object.

    5. Update the generated code for the resource type:

      $ make generate

      After you modify a *_types.go file, you must run the make generate command to update the generated code for that resource type.

      The above Makefile target invokes the controller-gen utility to update the api/v1/zz_generated.deepcopy.go file. This ensures your API Go type definitions implement the runtime.Object interface that all Kind types must implement.

  5. Generate and update CRD manifests:

    $ make manifests

    This Makefile target invokes the controller-gen utility to generate the CRD manifests in the config/crd/bases/cache.example.com_memcacheds.yaml file.
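The manifest path combines the --group and --domain values passed earlier with the plural form of the kind. The following sketch rebuilds the path to show that relationship. The crdManifestPath helper is hypothetical, and the pluralization rule (lower-cased kind plus "s") is an assumption that holds for Memcached but not for every kind.

```go
package main

import (
	"fmt"
	"strings"
)

// crdManifestPath is a hypothetical helper that rebuilds the CRD manifest
// path from the values given to operator-sdk init and create api.
// Assumption: the plural resource name is the lower-cased kind plus "s".
func crdManifestPath(group, domain, kind string) string {
	// The full API group is the short group name followed by the domain.
	fullGroup := group + "." + domain
	plural := strings.ToLower(kind) + "s"
	return fmt.Sprintf("config/crd/bases/%s_%s.yaml", fullGroup, plural)
}

func main() {
	fmt.Println(crdManifestPath("cache", "example.com", "Memcached"))
	// prints "config/crd/bases/cache.example.com_memcacheds.yaml"
}
```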

    1. Optional: Add custom validation to your CRD.

      OpenAPI v3.0 schemas are added to CRD manifests in the spec.validation block when the manifests are generated. This validation block allows Kubernetes to validate the properties in a Memcached custom resource (CR) when it is created or updated.

      As an Operator author, you can use annotation-like, single-line comments called Kubebuilder markers to configure custom validations for your API. These markers must always have a +kubebuilder:validation prefix. For example, adding an enum-type specification can be done by adding the following marker:

      // +kubebuilder:validation:Enum=Lion;Wolf;Dragon
      type Alias string

      Usage of markers in API code is discussed in the Kubebuilder Generating CRDs and Markers for Config/Code Generation documentation. A full list of OpenAPI v3 validation markers is also available in the Kubebuilder CRD Validation documentation.

      If you add any custom validations, run the following command to update the OpenAPI validation section for the CRD:

      $ make manifests
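To make the effect of the validation markers concrete, the following sketch emulates two of them in plain Go: the Minimum=0 marker on the size field and the Enum marker on the Alias type. The validateSize and validateAlias functions are hypothetical. On a cluster, the API server performs these checks from the generated OpenAPI schema, not Operator code.

```go
package main

import (
	"errors"
	"fmt"
)

// Alias mirrors the enum-typed string from the marker example.
type Alias string

// allowedAliases lists the values named in the Enum marker.
var allowedAliases = map[Alias]bool{"Lion": true, "Wolf": true, "Dragon": true}

// validateAlias emulates +kubebuilder:validation:Enum=Lion;Wolf;Dragon.
func validateAlias(a Alias) error {
	if !allowedAliases[a] {
		return fmt.Errorf("unsupported value %q: must be Lion, Wolf, or Dragon", a)
	}
	return nil
}

// validateSize emulates +kubebuilder:validation:Minimum=0 on spec.size.
func validateSize(size int32) error {
	if size < 0 {
		return errors.New("spec.size must be greater than or equal to 0")
	}
	return nil
}

func main() {
	fmt.Println(validateAlias("Wolf"))  // accepted, no error
	fmt.Println(validateAlias("Tiger")) // rejected, not in the enum
	fmt.Println(validateSize(3))        // accepted, no error
	fmt.Println(validateSize(-1))       // rejected, below the minimum
}
```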
  6. After creating a new API and controller, you can implement the controller logic. For this example, replace the generated controller file controllers/memcached_controller.go with the following example implementation:

    Example memcached_controller.go

    /*
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
    */

    package controllers

    import (
    	"context"
    	"reflect"

    	"github.com/go-logr/logr"
    	appsv1 "k8s.io/api/apps/v1"
    	corev1 "k8s.io/api/core/v1"
    	"k8s.io/apimachinery/pkg/api/errors"
    	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/apimachinery/pkg/runtime"
    	"k8s.io/apimachinery/pkg/types"
    	ctrl "sigs.k8s.io/controller-runtime"
    	"sigs.k8s.io/controller-runtime/pkg/client"

    	cachev1 "github.com/example-inc/memcached-operator/api/v1"
    )

    // MemcachedReconciler reconciles a Memcached object
    type MemcachedReconciler struct {
    	client.Client
    	Log    logr.Logger
    	Scheme *runtime.Scheme
    }

    // +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds,verbs=get;list;watch;create;update;patch;delete
    // +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/status,verbs=get;update;patch
    // +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
    // +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;

    func (r *MemcachedReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
    	ctx := context.Background()
    	log := r.Log.WithValues("memcached", req.NamespacedName)

    	// Fetch the Memcached instance
    	memcached := &cachev1.Memcached{}
    	err := r.Get(ctx, req.NamespacedName, memcached)
    	if err != nil {
    		if errors.IsNotFound(err) {
    			// Request object not found, could have been deleted after reconcile request.
    			// Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
    			// Return and don't requeue
    			log.Info("Memcached resource not found. Ignoring since object must be deleted")
    			return ctrl.Result{}, nil
    		}
    		// Error reading the object - requeue the request.
    		log.Error(err, "Failed to get Memcached")
    		return ctrl.Result{}, err
    	}

    	// Check if the deployment already exists, if not create a new one
    	found := &appsv1.Deployment{}
    	err = r.Get(ctx, types.NamespacedName{Name: memcached.Name, Namespace: memcached.Namespace}, found)
    	if err != nil && errors.IsNotFound(err) {
    		// Define a new deployment
    		dep := r.deploymentForMemcached(memcached)
    		log.Info("Creating a new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
    		err = r.Create(ctx, dep)
    		if err != nil {
    			log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
    			return ctrl.Result{}, err
    		}
    		// Deployment created successfully - return and requeue
    		return ctrl.Result{Requeue: true}, nil
    	} else if err != nil {
    		log.Error(err, "Failed to get Deployment")
    		return ctrl.Result{}, err
    	}

    	// Ensure the deployment size is the same as the spec
    	size := memcached.Spec.Size
    	if *found.Spec.Replicas != size {
    		found.Spec.Replicas = &size
    		err = r.Update(ctx, found)
    		if err != nil {
    			log.Error(err, "Failed to update Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
    			return ctrl.Result{}, err
    		}
    		// Spec updated - return and requeue
    		return ctrl.Result{Requeue: true}, nil
    	}

    	// Update the Memcached status with the pod names
    	// List the pods for this memcached's deployment
    	podList := &corev1.PodList{}
    	listOpts := []client.ListOption{
    		client.InNamespace(memcached.Namespace),
    		client.MatchingLabels(labelsForMemcached(memcached.Name)),
    	}
    	if err = r.List(ctx, podList, listOpts...); err != nil {
    		log.Error(err, "Failed to list pods", "Memcached.Namespace", memcached.Namespace, "Memcached.Name", memcached.Name)
    		return ctrl.Result{}, err
    	}
    	podNames := getPodNames(podList.Items)

    	// Update status.Nodes if needed
    	if !reflect.DeepEqual(podNames, memcached.Status.Nodes) {
    		memcached.Status.Nodes = podNames
    		err := r.Status().Update(ctx, memcached)
    		if err != nil {
    			log.Error(err, "Failed to update Memcached status")
    			return ctrl.Result{}, err
    		}
    	}

    	return ctrl.Result{}, nil
    }

    // deploymentForMemcached returns a memcached Deployment object
    func (r *MemcachedReconciler) deploymentForMemcached(m *cachev1.Memcached) *appsv1.Deployment {
    	ls := labelsForMemcached(m.Name)
    	replicas := m.Spec.Size

    	dep := &appsv1.Deployment{
    		ObjectMeta: metav1.ObjectMeta{
    			Name:      m.Name,
    			Namespace: m.Namespace,
    		},
    		Spec: appsv1.DeploymentSpec{
    			Replicas: &replicas,
    			Selector: &metav1.LabelSelector{
    				MatchLabels: ls,
    			},
    			Template: corev1.PodTemplateSpec{
    				ObjectMeta: metav1.ObjectMeta{
    					Labels: ls,
    				},
    				Spec: corev1.PodSpec{
    					Containers: []corev1.Container{{
    						Image:   "memcached:1.4.36-alpine",
    						Name:    "memcached",
    						Command: []string{"memcached", "-m=64", "-o", "modern", "-v"},
    						Ports: []corev1.ContainerPort{{
    							ContainerPort: 11211,
    							Name:          "memcached",
    						}},
    					}},
    				},
    			},
    		},
    	}
    	// Set Memcached instance as the owner and controller
    	ctrl.SetControllerReference(m, dep, r.Scheme)
    	return dep
    }

    // labelsForMemcached returns the labels for selecting the resources
    // belonging to the given memcached CR name.
    func labelsForMemcached(name string) map[string]string {
    	return map[string]string{"app": "memcached", "memcached_cr": name}
    }

    // getPodNames returns the pod names of the array of pods passed in
    func getPodNames(pods []corev1.Pod) []string {
    	var podNames []string
    	for _, pod := range pods {
    		podNames = append(podNames, pod.Name)
    	}
    	return podNames
    }

    func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
    	return ctrl.NewControllerManagedBy(mgr).
    		For(&cachev1.Memcached{}).
    		Owns(&appsv1.Deployment{}).
    		Complete(r)
    }

    The example controller runs the following reconciliation logic for each Memcached CR:

    • Create a Memcached deployment if it does not exist.

    • Ensure that the deployment size is the same as specified by the Memcached CR spec.

    • Update the Memcached CR status with the names of the memcached pods.
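The three reconciliation steps listed above can be read as a decision made on each pass through the loop. The following standard-library sketch models that decision as a pure function so the ordering is easy to see. The observed type and nextAction function are hypothetical stand-ins and make no Kubernetes API calls.

```go
package main

import (
	"fmt"
	"reflect"
)

// observed is a hypothetical snapshot of cluster state for one Memcached CR.
type observed struct {
	deploymentExists bool
	replicas         int32
	podNames         []string // names of pods currently listed
	statusNodes      []string // names stored in the CR status
}

// nextAction returns the single step the example controller would take
// for the given desired size and observed state.
func nextAction(desiredSize int32, o observed) string {
	if !o.deploymentExists {
		return "create deployment"
	}
	if o.replicas != desiredSize {
		return "update replicas"
	}
	// Like the example controller, compare the lists with reflect.DeepEqual.
	if !reflect.DeepEqual(o.podNames, o.statusNodes) {
		return "update status"
	}
	return "nothing to do"
}

func main() {
	pods := []string{"a", "b", "c"}
	states := []observed{
		{},
		{deploymentExists: true, replicas: 1},
		{deploymentExists: true, replicas: 3, podNames: pods},
		{deploymentExists: true, replicas: 3, podNames: pods, statusNodes: pods},
	}
	for _, s := range states {
		fmt.Println(nextAction(3, s))
	}
}
```

Each pass performs at most one corrective step. In the example controller, the create and update steps return with Requeue set so that the next pass can continue from the new state.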

    The next two sub-steps inspect how the controller watches resources and how the reconcile loop is triggered. You can skip these steps to go directly to building and running the Operator.

    1. Inspect the controller implementation at the controllers/memcached_controller.go file to see how the controller watches resources.

      The SetupWithManager() function specifies how the controller is built to watch a CR and other resources that are owned and managed by that controller:

      SetupWithManager() function

      import (
      	...
      	appsv1 "k8s.io/api/apps/v1"
      	...
      )

      func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
      	return ctrl.NewControllerManagedBy(mgr).
      		For(&cachev1.Memcached{}).
      		Owns(&appsv1.Deployment{}).
      		Complete(r)
      }

      NewControllerManagedBy() provides a controller builder that allows various controller configurations.

      For(&cachev1.Memcached{}) specifies the Memcached type as the primary resource to watch. For each Add, Update, or Delete event for a Memcached type, the reconcile loop is sent a reconcile Request argument, which consists of a namespace and name key, for that Memcached object.

      Owns(&appsv1.Deployment{}) specifies the Deployment type as the secondary resource to watch. For each Deployment type Add, Update, or Delete event, the event handler maps each event to a reconcile request for the owner of the deployment. In this case, the owner is the Memcached object for which the deployment was created.
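The mapping from a secondary resource event to a reconcile request for its owner can be sketched with plain Go types. The ownerRef, object, and request types below are simplified, hypothetical stand-ins for the Kubernetes types, and requestForOwner only approximates what the controller-runtime event handler does.

```go
package main

import "fmt"

// ownerRef is a simplified stand-in for an owner reference on an object.
type ownerRef struct {
	Kind       string
	Name       string
	Controller bool // true when this owner is the managing controller
}

// object is a simplified stand-in for a namespaced Kubernetes object.
type object struct {
	Namespace string
	Name      string
	Owners    []ownerRef
}

// request is a simplified stand-in for the reconcile Request key.
type request struct {
	Namespace string
	Name      string
}

// requestForOwner returns the reconcile request for the controlling owner
// of obj with the given kind, or false if obj has no such owner.
func requestForOwner(obj object, ownerKind string) (request, bool) {
	for _, ref := range obj.Owners {
		if ref.Controller && ref.Kind == ownerKind {
			// An owner lives in the same namespace as the object it owns.
			return request{Namespace: obj.Namespace, Name: ref.Name}, true
		}
	}
	return request{}, false
}

func main() {
	dep := object{
		Namespace: "memcached-operator-system",
		Name:      "memcached-sample",
		Owners:    []ownerRef{{Kind: "Memcached", Name: "memcached-sample", Controller: true}},
	}
	req, ok := requestForOwner(dep, "Memcached")
	fmt.Println(req.Namespace, req.Name, ok)

	// A deployment without a Memcached owner triggers no reconcile request.
	_, ok = requestForOwner(object{Namespace: "default", Name: "other"}, "Memcached")
	fmt.Println(ok)
}
```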

    2. Every controller has a reconciler object with a Reconcile() method that implements the reconcile loop. The reconcile loop is passed the Request argument, which is a namespace and name key used to find the primary resource object, Memcached, from the cache:

      Reconcile loop

      import (
      	"context"

      	ctrl "sigs.k8s.io/controller-runtime"

      	cachev1 "github.com/example-inc/memcached-operator/api/v1"
      	...
      )

      func (r *MemcachedReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
      	ctx := context.Background()
      	// Look up the Memcached instance for this reconcile request
      	memcached := &cachev1.Memcached{}
      	err := r.Get(ctx, req.NamespacedName, memcached)
      	...
      }

      Based on the return value of the Reconcile() function, the reconcile Request might be requeued, and the loop might be triggered again:

      Requeue logic

      // Reconcile successful - don't requeue
      return ctrl.Result{}, nil
      // Reconcile failed due to error - requeue
      return ctrl.Result{}, err
      // Requeue for any reason other than error
      return ctrl.Result{Requeue: true}, nil

      You can set Result.RequeueAfter to requeue the request after a grace period:

      Requeue after grace period

      import "time"

      // Requeue for any reason other than an error after 5 seconds
      return ctrl.Result{RequeueAfter: time.Second * 5}, nil

      You can return Result with RequeueAfter set to periodically reconcile a CR.

      For more on reconcilers, clients, and interacting with resource events, see the Controller Runtime Client API documentation.

Running the Operator

There are two ways you can use the Operator SDK CLI to build and run your Operator:

  • Run locally outside the cluster as a Go program.

  • Run as a deployment on the cluster.

Running locally outside the cluster

You can run your Operator project as a Go program outside of the cluster. This method is useful for development purposes to speed up deployment and testing.

Procedure

  • Run the following command to install the custom resource definitions (CRDs) in the cluster configured in your ~/.kube/config file and run the Operator as a Go program locally:

    $ make install run

    Example output

    ...
    2021-01-10T21:09:29.016-0700 INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
    2021-01-10T21:09:29.017-0700 INFO setup starting manager
    2021-01-10T21:09:29.017-0700 INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
    2021-01-10T21:09:29.018-0700 INFO controller-runtime.manager.controller.memcached Starting EventSource {"reconciler group": "cache.example.com", "reconciler kind": "Memcached", "source": "kind source: /, Kind="}
    2021-01-10T21:09:29.218-0700 INFO controller-runtime.manager.controller.memcached Starting Controller {"reconciler group": "cache.example.com", "reconciler kind": "Memcached"}
    2021-01-10T21:09:29.218-0700 INFO controller-runtime.manager.controller.memcached Starting workers {"reconciler group": "cache.example.com", "reconciler kind": "Memcached", "worker count": 1}

Running as a deployment

After creating your Go-based Operator project, you can build and run your Operator as a deployment inside a cluster.

Procedure

  1. Run the following make commands to build and push the Operator image. Modify the IMG argument in the following steps to reference a repository that you have access to. You can obtain an account for storing containers at repository sites such as Quay.io.

    1. Build the image:

      $ make docker-build IMG=<registry>/<user>/<image_name>:<tag>
    2. Push the image to a repository:

      $ make docker-push IMG=<registry>/<user>/<image_name>:<tag>

      You can also set the image name and tag used by both commands, for example IMG=<registry>/<user>/<image_name>:<tag>, in your Makefile. Modify the IMG ?= controller:latest value to set your default image name.

  2. Run the following command to deploy the Operator:

    $ make deploy IMG=<registry>/<user>/<image_name>:<tag>

    By default, this command creates a namespace named <project_name>-system and uses it for the deployment. The command also installs the RBAC manifests from config/rbac.

  3. Verify that the Operator is running:

    $ oc get deployment -n <project_name>-system

    Example output

    NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
    <project_name>-controller-manager   1/1     1            1           8m

Creating a custom resource

After your Operator is installed, you can test it by creating a custom resource (CR) that is now provided on the cluster by the Operator.

Prerequisites

  • Example Memcached Operator, which provides the Memcached CR, installed on a cluster

Procedure

  1. Change to the namespace where your Operator is installed. For example, if you deployed the Operator using the make deploy command:

    $ oc project memcached-operator-system
  2. Edit the sample Memcached CR manifest at config/samples/cache_v1_memcached.yaml to contain the following specification:

    apiVersion: cache.example.com/v1
    kind: Memcached
    metadata:
      name: memcached-sample
    ...
    spec:
    ...
      size: 3
  3. Create the CR:

    $ oc apply -f config/samples/cache_v1_memcached.yaml
  4. Ensure that the Memcached Operator creates the deployment for the sample CR with the correct size:

    $ oc get deployments

    Example output

    NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
    memcached-operator-controller-manager   1/1     1            1           8m
    memcached-sample                        3/3     3            3           1m
  5. Check the pods and CR status to confirm the status is updated with the Memcached pod names.

    1. Check the pods:

      $ oc get pods

      Example output

      NAME                               READY   STATUS    RESTARTS   AGE
      memcached-sample-6fd7c98d8-7dqdr   1/1     Running   0          1m
      memcached-sample-6fd7c98d8-g5k7v   1/1     Running   0          1m
      memcached-sample-6fd7c98d8-m7vn7   1/1     Running   0          1m
    2. Check the CR status:

      $ oc get memcached/memcached-sample -o yaml

      Example output

      apiVersion: cache.example.com/v1
      kind: Memcached
      metadata:
      ...
        name: memcached-sample
      ...
      spec:
        size: 3
      status:
        nodes:
        - memcached-sample-6fd7c98d8-7dqdr
        - memcached-sample-6fd7c98d8-g5k7v
        - memcached-sample-6fd7c98d8-m7vn7
  6. Update the deployment size.

    1. Change the spec.size field in the Memcached CR from 3 to 5 by patching the CR:

      $ oc patch memcached memcached-sample \
          -p '{"spec":{"size": 5}}' \
          --type=merge
    2. Confirm that the Operator changes the deployment size:

      $ oc get deployments

      Example output

      NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
      memcached-operator-controller-manager   1/1     1            1           10m
      memcached-sample                        5/5     5            5           3m

Getting involved

This guide demonstrates the value of the Operator Framework for building and managing Operators, but much more has been left out in the interest of brevity. The Operator Framework and its components are open source, so visit each project individually and learn what else you can do:

github.com/operator-framework

If you want to discuss your experience, have questions, or want to get involved, join the Operator Framework mailing list.