Kubeflow Overview

How Kubeflow helps you organize your ML workflow

This guide introduces Kubeflow as a platform for developing and deploying a machine learning (ML) system.

Kubeflow is a platform for data scientists who want to build and experiment with ML pipelines. Kubeflow is also for ML engineers and operational teams who want to deploy ML systems to various environments for development, testing, and production-level serving.

Conceptual overview

Kubeflow is the ML toolkit for Kubernetes. The following diagram shows Kubeflow as a platform for arranging the components of your ML system on top of Kubernetes:

An architectural overview of Kubeflow on Kubernetes

Kubeflow builds on Kubernetes as a system for deploying, scaling, and managing complex systems.

Using the Kubeflow configuration interfaces (see below), you can specify the ML tools required for your workflow. Then you can deploy the workflow to various cloud, local, and on-premises platforms for experimentation and for production use.

Introducing the ML workflow

When you develop and deploy an ML system, the ML workflow typically consists of several stages. Developing an ML system is an iterative process. You need to evaluate the output of various stages of the ML workflow, and apply changes to the model and parameters when necessary to ensure the model keeps producing the results you need.
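The iterate, evaluate, and adjust loop described above can be illustrated with a toy example. This is a minimal sketch in plain Python, not Kubeflow code: the one-parameter model, the data, and the function names (train, evaluate, tune) are all hypothetical and chosen only to show the shape of the loop.

```python
def train(xs, ys, learning_rate, steps=200):
    """Fit y = w * x by gradient descent and return the learned weight."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


def evaluate(w, xs, ys):
    """Return the mean squared error of the model on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def tune(train_data, val_data, candidate_rates):
    """Train once per candidate learning rate and keep the best result."""
    best = None
    for rate in candidate_rates:
        w = train(*train_data, learning_rate=rate)
        error = evaluate(w, *val_data)
        if best is None or error < best["error"]:
            best = {"learning_rate": rate, "weight": w, "error": error}
    return best


# Hypothetical data following y = 2 * x.
train_data = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
val_data = ([4.0, 5.0], [8.0, 10.0])
best = tune(train_data, val_data, [0.001, 0.01, 0.1])
print(best)
```

Each pass through the loop in tune is one iteration of the workflow: train with a given setting, evaluate the output, and keep or discard the change.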

For the sake of simplicity, the following diagram shows the workflow stages in sequence. The arrow at the end of the workflow points back into the flow to indicate the iterative nature of the process:

A typical machine learning workflow

Looking at the stages in more detail:

  • In the experimental phase, you develop your model based on initial assumptions, and test and update the model iteratively to produce the results you’re looking for:

    • Identify the problem you want the ML system to solve.
    • Collect and analyze the data you need to train your ML model.
    • Choose an ML framework and algorithm, and code the initial version of your model.
    • Experiment with the data and with training your model.
    • Tune the model hyperparameters to ensure the most efficient processing and the most accurate results possible.
  • In the production phase, you deploy a system that performs the following processes:

    • Transform the data into the format that your training system needs. To ensure that your model behaves consistently during training and prediction, the transformation process must be the same in the experimental and production phases.
    • Train the ML model.
    • Serve the model for online prediction or for running in batch mode.
    • Monitor the model’s performance, and feed the results into your processes for tuning or retraining the model.
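One way to keep the transformation identical in the experimental and production phases is to route both through the same function and the same fitted parameters. The following is a minimal sketch under that assumption. The min-max scaling and the names fit_scaler, transform, and handle_prediction_request are hypothetical and are not part of Kubeflow.

```python
def fit_scaler(values):
    """Compute scaling parameters from the training data only."""
    lo, hi = min(values), max(values)
    # Fall back to 1.0 to avoid dividing by zero when all values are equal.
    return {"min": lo, "range": (hi - lo) or 1.0}


def transform(value, scaler):
    """Scale a raw value using previously fitted parameters."""
    return (value - scaler["min"]) / scaler["range"]


# Experimental phase: fit the parameters and transform the training data.
training_values = [10.0, 20.0, 30.0]
scaler = fit_scaler(training_values)
training_features = [transform(v, scaler) for v in training_values]


# Production phase: reuse the same function and the same fitted parameters
# instead of reimplementing the transformation in the serving code.
def handle_prediction_request(raw_value, scaler):
    return transform(raw_value, scaler)


print(training_features)
print(handle_prediction_request(20.0, scaler))
```

Because the serving path calls transform with the parameters fitted during training, a value seen at prediction time is scaled exactly as it was during training.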

Kubeflow components in the ML workflow

The next diagram adds Kubeflow to the workflow, showing which Kubeflow components are useful at each stage:

Where Kubeflow fits into a typical machine learning workflow

To learn more, read the following guides to the Kubeflow components:

  • Kubeflow includes services for spawning and managing Jupyter notebooks. Use notebooks for interactive data science and experimenting with ML workflows.

  • Kubeflow Pipelines is a platform for building, deploying, and managing multi-step ML workflows based on Docker containers.

  • Kubeflow offers several components that you can use to build your ML training, hyperparameter tuning, and serving workloads across multiple platforms.
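To make the idea of a multi-step workflow based on containers more concrete, the sketch below models a workflow as a set of steps, each tied to a container image, with dependencies that determine the run order. This is a conceptual illustration in plain Python only. It does not use the Kubeflow Pipelines SDK, and the Step class, step names, and image names are hypothetical.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Step:
    name: str
    image: str  # hypothetical container image that runs this step
    depends_on: list = field(default_factory=list)


# A hypothetical three-step workflow: each step runs in its own container.
steps = [
    Step("preprocess", "example.com/preprocess:latest"),
    Step("train", "example.com/train:latest", depends_on=["preprocess"]),
    Step("serve", "example.com/serve:latest", depends_on=["train"]),
]


def execution_order(steps):
    """Return step names ordered so that every dependency runs first."""
    graph = {step.name: set(step.depends_on) for step in steps}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
print(order)
```

A real pipeline platform adds scheduling, artifact passing, and retries on top of this dependency graph. The sketch only shows how declaring dependencies between containerized steps yields an execution order.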

Example of a specific ML workflow

The following diagram shows a simple example of a specific ML workflow that you can use to train and serve a model trained on the MNIST dataset:

ML workflow for training and serving an MNIST model

For details of the workflow and to run the system yourself, see the end-to-end tutorial for Kubeflow on GCP.

Kubeflow interfaces

This section introduces the interfaces that you can use to interact with Kubeflow and to build and run your ML workflows on Kubeflow.

Kubeflow user interface (UI)

The Kubeflow UI looks like this:

The Kubeflow UI

The UI offers a central dashboard that you can use to access the components of your Kubeflow deployment. Read how to access the central dashboard.

Kubeflow command line interface (CLI)

Kfctl is the Kubeflow CLI that you can use to install and configure Kubeflow. Read about kfctl in the guide to configuring Kubeflow.

The Kubernetes CLI, kubectl, is useful for running commands against your Kubeflow cluster. You can use kubectl to deploy applications, inspect and manage cluster resources, and view logs. Read about kubectl in the Kubernetes documentation.
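The three kubectl tasks mentioned above (deploying applications, inspecting resources, and viewing logs) might look like the following when scripted from Python. This is a minimal sketch with several assumptions: that your Kubeflow components run in a namespace named kubeflow, and that my-app.yaml and my-pod are placeholders for your own manifest and pod. The helper names are hypothetical.

```python
import shutil
import subprocess

# Assumption: change this to the namespace of your own deployment.
NAMESPACE = "kubeflow"


def kubectl_command(*args, namespace=NAMESPACE):
    """Build a kubectl argument list scoped to a namespace."""
    return ["kubectl", *args, "--namespace", namespace]


# Inspect cluster resources.
list_pods = kubectl_command("get", "pods")
# Deploy an application from a manifest (placeholder file name).
apply_manifest = kubectl_command("apply", "-f", "my-app.yaml")
# View the logs of a pod (placeholder pod name).
view_logs = kubectl_command("logs", "my-pod")


def run(command):
    """Run the command if kubectl is installed; otherwise return None."""
    if shutil.which("kubectl") is None:
        return None
    return subprocess.run(command, capture_output=True, text=True, check=False)


for command in (list_pods, apply_manifest, view_logs):
    print(" ".join(command))
```

The script only prints the commands. Call run(list_pods), for example, to execute one against the cluster that your current kubectl context points to.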

Kubeflow APIs and SDKs

Various components of Kubeflow offer APIs and Python SDKs. See the reference documentation for the individual components.

Next steps

See how to install Kubeflow depending on your chosen environment (local, cloud, or on-premises).

Last modified 30.01.2020: Merged content of accessing-uis page with new central dash page (#1569) (840e5d6d)