Deploy using CLI

Instructions for using the CLI to deploy Kubeflow on Google Cloud Platform (GCP)

This guide describes how to use the kfctl command line interface (CLI) to deploy Kubeflow on GCP. The command line deployment gives you more control over the deployment process and configuration than you get if you use the deployment UI. If you’re looking for a simpler deployment procedure, see how to deploy Kubeflow using the deployment UI.

Before you start

Before installing Kubeflow on the command line:

  • Ensure you have installed the following tools:

    • kubectl.
    • gcloud. If you already have gcloud installed, run gcloud components update to get the latest version of all your installed Cloud SDK components.
  • If you’re using Cloud Shell, enable boost mode.

  • Make sure that your GCP project meets the minimum requirements described in the project setup guide.

  • If you want to use Cloud Identity-Aware Proxy (Cloud IAP) for access control, follow the guide to setting up OAuth credentials. Cloud IAP is recommended for production deployments or deployments with access to sensitive data. Alternatively, you can use basic authentication with a username and password.
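As a quick way to confirm the tool prerequisites above, the following sketch reports which command-line tools are not on your PATH. The function name missing_tools is hypothetical and not part of kfctl or the Cloud SDK; it relies only on the standard command -v builtin.

```shell
# Print the name of each listed tool that cannot be found on the PATH.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example usage:
#   missing_tools kubectl gcloud
```

If the function prints nothing, both tools were found. It does not check tool versions, so you may still want to run gcloud components update.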

Prepare your environment

Follow these steps to download the kfctl binary for the Kubeflow CLI and set some handy environment variables:

  • Download the kfctl v0.7.1 release for your platform, then unpack the tarball:
    tar -xvf kfctl_v0.7.1_<platform>.tar.gz
  • Log in. You only need to run this command once:
    gcloud auth login
  • Create user credentials. You only need to run this command once:
    gcloud auth application-default login
  • Create environment variables to make the deployment process easier:
    # Set your GCP project ID and the zone where you want to create
    # the Kubeflow deployment:
    export PROJECT=<your GCP project ID>
    gcloud config set project ${PROJECT}
    export ZONE=<your GCP zone>
    gcloud config set compute/zone ${ZONE}

    # Use the following kfctl configuration file for authentication with
    # Cloud IAP (recommended):
    export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.1.yaml"

    # If using Cloud IAP for authentication, create environment variables
    # from the OAuth client ID and secret that you obtained earlier:
    export CLIENT_ID=<CLIENT_ID from OAuth page>
    export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>

    # Alternatively, use the following kfctl configuration if you want to use
    # basic authentication:
    export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_basic_auth.0.7.1.yaml"

    # If using basic authentication, create environment variables
    # for username and password:
    export KUBEFLOW_USERNAME=<your username>
    export KUBEFLOW_PASSWORD=<your password>

    # Set KF_NAME to the name of your Kubeflow deployment. You also use this
    # value as directory name when creating your configuration directory.
    # See the detailed description in the text below this code snippet.
    # For example, your deployment name can be 'my-kubeflow' or 'kf-test'.
    export KF_NAME=<your choice of name for the Kubeflow deployment>

    # Set the path to the base directory where you want to store one or more
    # Kubeflow deployments. For example, /opt/.
    # Then set the Kubeflow application directory for this deployment.
    export BASE_DIR=<path to a base directory>
    export KF_DIR=${BASE_DIR}/${KF_NAME}

    # The following command is optional. It adds the kfctl binary to your path.
    # If you don't add kfctl to your path, you must use the full path
    # each time you run kfctl.
    export PATH=$PATH:<path to your kfctl file>

Notes:

  • ${PROJECT} - The project ID of the GCP project where you want Kubeflow deployed.
  • ${ZONE} - The GCP zone where you want to create the Kubeflow deployment. You can see a list of zones in the Compute Engine documentation. If you plan to use accelerators, you must choose a zone that supports the type you want. See the guide to customizing your Kubeflow deployment.
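  • ${CONFIG_URI} - The GitHub address of the configuration YAML file that you want to use to deploy Kubeflow. For GCP deployments, the following configurations are available:

    • https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.1.yaml (authentication with Cloud IAP)
    • https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_basic_auth.0.7.1.yaml (basic authentication)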

  • ${KF_NAME} - The name of your Kubeflow deployment. If you want a custom deployment name, specify that name here. For example, my-kubeflow or kf-test. The value of KF_NAME must consist of lower case alphanumeric characters or ‘-’, and must start and end with an alphanumeric character. The value cannot be longer than 25 characters. It must contain just a name, not a directory path. You also use this value as the directory name when creating the directory where your Kubeflow configurations are stored, that is, the Kubeflow application directory.

  • ${KF_DIR} - The full path to your Kubeflow application directory.
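The naming rules for KF_NAME described above can be checked before you deploy. The following is a minimal sketch, not part of kfctl: the function name valid_kf_name is hypothetical, and the check covers only the rules stated in this guide (lower case alphanumeric characters or '-', alphanumeric first and last character, at most 25 characters).

```shell
# Return 0 if the argument satisfies the KF_NAME rules listed in this guide,
# and 1 otherwise.
valid_kf_name() {
  name=$1
  # Reject empty names and names longer than 25 characters.
  [ -n "$name" ] || return 1
  [ "${#name}" -le 25 ] || return 1
  # Allow only lower case letters, digits and '-', and require an
  # alphanumeric first and last character. A '/' fails this test, so
  # directory paths are rejected as well.
  printf '%s' "$name" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

# Example usage:
#   valid_kf_name "${KF_NAME}" || echo "KF_NAME is not a valid deployment name"
```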

Set up and deploy Kubeflow

To set up and deploy Kubeflow using the default settings, run the kfctl apply command:

    mkdir -p ${KF_DIR}
    cd ${KF_DIR}
    kfctl apply -V -f ${CONFIG_URI}
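Before you run the commands above, it can help to confirm that the environment variables from the previous section are set. The sketch below is an illustration only: the function names unset_vars and check_kf_env are hypothetical, and choosing the authentication variables by looking for "iap" or "basic_auth" in the CONFIG_URI file name is an assumption based on the two configuration file names used in this guide.

```shell
# Print the name of each listed variable that is unset or empty.
unset_vars() {
  for var in "$@"; do
    eval "val=\${$var:-}"
    [ -n "$val" ] || printf '%s\n' "$var"
  done
}

# Print the deployment variables from this guide that are still missing.
# The authentication variables depend on which configuration file
# CONFIG_URI points to (assumption: matched by file name).
check_kf_env() {
  required="PROJECT ZONE CONFIG_URI KF_NAME KF_DIR"
  case "${CONFIG_URI:-}" in
    *iap*) required="$required CLIENT_ID CLIENT_SECRET" ;;
    *basic_auth*) required="$required KUBEFLOW_USERNAME KUBEFLOW_PASSWORD" ;;
  esac
  # Word splitting of $required is intentional here.
  unset_vars $required
}

# Example usage:
#   check_kf_env
```

If check_kf_env prints nothing, all of the variables it knows about are set.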

Alternatively, set up your configuration for later deployment

If you want to customize your configuration before deploying Kubeflow, you can set up your configuration files first, then edit the configuration, then deploy Kubeflow:

  • Run the kfctl build command to set up your configuration:
    mkdir -p ${KF_DIR}
    cd ${KF_DIR}
    kfctl build -V -f ${CONFIG_URI}
  • Edit the configuration files as needed. See the guide to customizing your Kubeflow deployment.
  • Set an environment variable pointing to your local configuration file:
    export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.0.7.1.yaml

Or:

    export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.0.7.1.yaml
  • Run the kfctl apply command to deploy Kubeflow:
    kfctl apply -V -f ${CONFIG_FILE}

Check your deployment

Follow these steps to verify the deployment:

  • The deployment process creates a separate deployment for your data storage. After running kfctl apply you should notice two new deployments:

    • {KF_NAME}-storage: This deployment has persistent volumes for your pipelines.
    • {KF_NAME}: This deployment has all the components of Kubeflow, including a GKE cluster named ${KF_NAME} with Kubeflow installed.
  • When the deployment finishes, check the resources installed in the namespace kubeflow in your new cluster. To do this from the command line, first set your kubectl credentials to point to the new cluster:
    gcloud container clusters get-credentials ${KF_NAME} --zone ${ZONE} --project ${PROJECT}

Then see what’s installed in the kubeflow namespace of your GKE cluster:

    kubectl -n kubeflow get all
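The output of kubectl get all can be long. As a rough health summary, the following sketch counts pods by status. The helper name count_pod_statuses is hypothetical; it assumes the default column layout of kubectl get pods --no-headers, where the third column is STATUS.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print one line
# per pod status with the number of pods in that status, sorted by status.
count_pod_statuses() {
  awk '{ n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

# Example usage:
#   kubectl -n kubeflow get pods --no-headers | count_pod_statuses
```

Pods that stay in a status other than Running or Completed for a long time are usually worth a closer look with kubectl describe.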

Access the Kubeflow user interface (UI)

Follow these steps to access the Kubeflow central dashboard:

  • Enter the following URI into your browser address bar. It can take 20 minutes for the URI to become available:
    https://<KF_NAME>.endpoints.<project-id>.cloud.goog/

You can run the following command to get the URI for your deployment:

    kubectl -n istio-system get ingress
    NAME            HOSTS                                                      ADDRESS         PORTS   AGE
    envoy-ingress   your-kubeflow-name.endpoints.your-gcp-project.cloud.goog   34.102.232.34   80      5d13h

The following command sets an environment variable named HOST to the URI:

    export HOST=$(kubectl -n istio-system get ingress envoy-ingress -o=jsonpath='{.spec.rules[0].host}')

Notes:

  • It can take 20 minutes for the URI to become available. Kubeflow needs to provision a signed SSL certificate and register a DNS name.
  • If you own or manage the domain or a subdomain with Cloud DNS then you can configure this process to be much faster. See kubeflow/kubeflow#731.
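Because the URI can take a while to become available, you may prefer to poll it instead of refreshing the browser. The following sketch builds the URI from the pattern shown above and retries with curl. The function names kubeflow_uri and wait_for_uri are hypothetical. Treating any HTTP response as a sign that the DNS name and certificate are in place is an assumption, and with Cloud IAP the response is likely to be a redirect to a sign-in page rather than the dashboard itself.

```shell
# Build the dashboard URI from a deployment name and a GCP project ID.
kubeflow_uri() {
  printf 'https://%s.endpoints.%s.cloud.goog/\n' "$1" "$2"
}

# Poll a URI until it returns any HTTP response.
# $1 = URI, $2 = number of attempts (default 40, about 20 minutes
# at one attempt every 30 seconds).
wait_for_uri() {
  uri=$1
  attempts=${2:-40}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # curl reports 000 when it cannot connect or the TLS handshake fails.
    code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$uri") || code=000
    if [ "$code" != "000" ]; then
      echo "Got HTTP $code from $uri"
      return 0
    fi
    i=$((i + 1))
    sleep 30
  done
  echo "No response from $uri after $attempts attempts" >&2
  return 1
}

# Example usage:
#   wait_for_uri "$(kubeflow_uri "${KF_NAME}" "${PROJECT}")"
```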

Understanding the deployment process

This section gives you more details about the kfctl configuration and deployment process, so that you can customize your Kubeflow deployment if necessary.

kfctl process and configuration

The kfctl deployment process includes the following commands:

  • kfctl build - (Optional) Creates configuration files defining the various resources in your deployment. You only need to run kfctl build if you want to edit the resources before running kfctl apply. See the guide to customizing your Kubeflow deployment.
  • kfctl apply - Creates or updates the resources.
  • kfctl delete - Deletes the resources.

The kfctl deployment process applies default values to certain properties as follows:

  • Email address: kfctl attempts to fetch your email address from your Cloud SDK configuration. You can run gcloud config list to see the default email address, which the command output lists as the account. If kfctl can’t find a valid email address, you must use the flag --email <your email address> to pass a valid email address. This email address becomes an administrator in the configuration of your Kubeflow deployment.

  • GCP project ID: kfctl attempts to fetch your project ID from your Cloud SDK configuration. You can run gcloud config list to see your active project ID.

  • GCP zone: kfctl attempts to fetch the zone from your Cloud SDK configuration. You can run gcloud config list to see your active zone.

  • Kubeflow deployment name: kfctl defaults to the name of the directory where you run the kfctl build or kfctl apply command.
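To preview the defaults described in the list above in one place, you can read the same Cloud SDK properties with gcloud config get-value. This is a minimal sketch: the function name show_kfctl_defaults is hypothetical, and it only reports what the Cloud SDK and the current directory would supply. It does not call kfctl.

```shell
# Print the values that the defaults above would be taken from:
# the Cloud SDK account, project and zone, and the current directory name.
show_kfctl_defaults() {
  for prop in account project compute/zone; do
    value=$(gcloud config get-value "$prop" 2>/dev/null)
    printf '%s: %s\n' "$prop" "${value:-(not set)}"
  done
  printf 'deployment name: %s\n' "$(basename "$PWD")"
}

# Example usage, run from inside ${KF_DIR}:
#   show_kfctl_defaults
```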

You can also explicitly set the following values in your ${CONFIG_FILE} configuration file:

  • Kubeflow deployment name
  • GCP project
  • GCP zone
  • Email address

The following snippet shows you how to set values in the configuration file using yq. The path expressions are quoted so that the shell does not interpret the square brackets:

    yq w -i ${CONFIG_FILE} 'spec.plugins[0].spec.project' ${PROJECT}
    yq w -i ${CONFIG_FILE} 'spec.plugins[0].spec.zone' ${ZONE}
    yq w -i ${CONFIG_FILE} metadata.name ${KF_NAME}

Application layout

Your Kubeflow application directory ${KF_DIR} contains the following files and directories:

We recommend that you check in the contents of your ${KF_DIR} directory into source control.

GCP service accounts

The kfctl deployment process creates three service accounts in your GCP project. These service accounts follow the principle of least privilege. The service accounts are:

  • ${KF_NAME}-admin is used for some admin tasks like configuring the load balancers. The principle is that this account is needed to deploy Kubeflow but not needed to actually run jobs.
  • ${KF_NAME}-user is intended to be used by training jobs and models to access GCP resources (Cloud Storage, BigQuery, etc.). This account has a much smaller set of privileges compared to admin.
  • ${KF_NAME}-vm is used only for the virtual machine (VM) service account. This account has the minimal permissions needed to send metrics and logs to Stackdriver.
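If you want to look up these accounts after deployment, the following sketch prints the email addresses you would expect for them. It assumes that each account ID is exactly the name listed above and uses the standard GCP service account email format, <account ID>@<project ID>.iam.gserviceaccount.com. The function name kf_service_accounts is hypothetical.

```shell
# Print the expected email address of each service account.
# $1 = Kubeflow deployment name, $2 = GCP project ID.
kf_service_accounts() {
  for role in admin user vm; do
    printf '%s-%s@%s.iam.gserviceaccount.com\n' "$1" "$role" "$2"
  done
}

# Example usage: describe each account with the Cloud SDK.
#   for email in $(kf_service_accounts "${KF_NAME}" "${PROJECT}"); do
#     gcloud iam service-accounts describe "$email" --project "${PROJECT}"
#   done
```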

Next steps