Management cluster setup

Setting up a management cluster on Google Cloud

This guide describes how to set up a management cluster, which you will use to deploy one or more instances of Kubeflow.

The management cluster is used to run Cloud Config Connector. Cloud Config Connector is a Kubernetes add-on that allows you to manage Google Cloud resources through Kubernetes.
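
For example, once Cloud Config Connector is installed, a Google Cloud resource can be declared as an ordinary Kubernetes object and applied with kubectl. The manifest below is only an illustrative sketch; the bucket name and project ID are hypothetical placeholders, not part of the blueprint.

    # Hypothetical Config Connector manifest: applying it asks Config Connector
    # to create a Cloud Storage bucket in the annotated project.
    apiVersion: storage.cnrm.cloud.google.com/v1beta1
    kind: StorageBucket
    metadata:
      name: my-example-bucket                                  # placeholder bucket name
      annotations:
        cnrm.cloud.google.com/project-id: my-managed-project   # placeholder project ID
    spec:
      location: US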

While the management cluster can be deployed in the same project as your Kubeflow cluster, typically you will want to deploy it in a separate project used for administering one or more Kubeflow instances.

Optionally, the cluster can be configured with Anthos Config Management to manage Google Cloud infrastructure using GitOps.

FAQs

  • Where is kfctl?

    kfctl is no longer used to apply resources for Google Cloud, because the required functionality is now provided by generic tools including Make, Kustomize, kpt, and Cloud Config Connector.

  • Why do we use an extra management cluster to manage Google Cloud resources?

    The management cluster is a very lightweight cluster that runs Cloud Config Connector. Cloud Config Connector makes it easier to configure Google Cloud resources using YAML and Kustomize.

For a more detailed explanation of the changes affecting Kubeflow 1.1 on Google Cloud, read kubeflow/gcp-blueprints #123.

Install the required tools

  1. Install gcloud components

    gcloud components install kpt anthoscli beta
    gcloud components update
  2. Install Kustomize v3.2.1; one way to fetch a pinned release is sketched after this list.

    Note: Kubeflow is not compatible with Kustomize versions above 3.2.1. Read this GitHub issue for the latest status.

  3. Install yq.
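
Kustomize releases are published as prebuilt binaries on GitHub. The snippet below is one hedged way to pin v3.2.1 on a Linux amd64 workstation; the exact release asset name is an assumption, so check the kubernetes-sigs/kustomize releases page if the download fails. Install yq with your package manager or from its releases page.

    # Download the Kustomize v3.2.1 binary (asset name assumed; verify on the releases page).
    curl -L -o kustomize \
      https://github.com/kubernetes-sigs/kustomize/releases/download/v3.2.1/kustomize_3.2.1_linux_amd64
    chmod +x kustomize
    sudo mv kustomize /usr/local/bin/

    # Confirm the expected tools are on your PATH.
    kustomize version
    yq --version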

Setting up the management cluster

  1. Fetch the management blueprint

    kpt pkg get https://github.com/kubeflow/gcp-blueprints.git/management@v1.1.0 ./
  2. Fetch the upstream manifests

    cd ./management
    make get-pkg
  3. Open the Makefile at ./management/Makefile and edit the set-values rule to set the name, location, and project of your management cluster; when you are done, the section should look like this:

    set-values:
        kpt cfg set ./instance name NAME
        kpt cfg set ./instance location LOCATION
        kpt cfg set ./instance gcloud.core.project PROJECT
        kpt cfg set ./upstream/management name NAME
        kpt cfg set ./upstream/management location LOCATION
        kpt cfg set ./upstream/management gcloud.core.project PROJECT
    • Here NAME, LOCATION, and PROJECT should be replaced with the actual values for your deployment
  4. Set the values

    make set-values
  5. Hydrate and apply the manifests to create the cluster

    make apply
  6. Create a kubeconfig context for the cluster

    make create-ctxt
  7. Install CNRM

    make apply-kcc
    • This will install CNRM in your cluster; a quick way to verify the installation is sketched after this list
    • It will create the Google Cloud service account ${NAME}-cnrm-system@${PROJECT}.iam.gserviceaccount.com
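
After step 7 you can sanity-check the result from your workstation. The commands below are a minimal verification sketch; they assume the kubeconfig context created by make create-ctxt is your current context, and that Config Connector's components run in the cnrm-system namespace (the default for CNRM installs).

    # Confirm kubectl is pointing at the new management cluster.
    kubectl config current-context

    # Config Connector (CNRM) components run in the cnrm-system namespace;
    # the pods there should reach the Running state once make apply-kcc finishes.
    kubectl get pods -n cnrm-system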

Authorize CNRM for each project

In the last step we created the Google Cloud service account ${NAME}-cnrm-system@${PROJECT}.iam.gserviceaccount.com. This is the service account that CNRM will use to create any Google Cloud resources. Consequently, you need to grant this service account sufficient privileges to create the desired resources in one or more projects (called managed projects, read more).

The easiest way to do this is to grant the Google Cloud service account owner permissions on one or more projects:

  1. Set the managed project

    kpt cfg set ./instance managed-project ${MANAGED_PROJECT}
  2. Update the policy

    anthoscli apply -f ./instance/managed-project/iam.yaml
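
Applying iam.yaml updates the IAM policy of the managed project so that the CNRM service account is granted owner there, as described above. As a hedged illustration (not an extra step in this guide), that grant is equivalent to adding the binding manually with gcloud, substituting your actual NAME, PROJECT, and MANAGED_PROJECT values:

    # Equivalent manual grant: give the CNRM service account owner on the managed project.
    gcloud projects add-iam-policy-binding ${MANAGED_PROJECT} \
      --member="serviceAccount:${NAME}-cnrm-system@${PROJECT}.iam.gserviceaccount.com" \
      --role="roles/owner"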

References

CNRM Reference Documentation
