End-to-end Kubeflow on IBM Cloud

Running Kubeflow using IBM Cloud Kubernetes Service (IKS)

This guide walks through an end-to-end example of Kubeflow on IBM Cloud Kubernetes Service (IKS). The core steps are to take a base TensorFlow model, modify it for distributed training, serve the resulting model with TensorFlow Serving, and deploy a web application that uses the trained model.

Introduction

Overview of IKS

IBM Cloud Kubernetes Service (IKS) lets you deploy containerized applications in Kubernetes clusters and provides specialized tools for managing them.

The IBM Cloud CLI can be used for creating, developing, and deploying cloud applications.

Here’s a list of IBM Cloud services you will use:

The model and the data

This tutorial trains a TensorFlow model on the MNIST dataset, which is often called the "hello world" of machine learning.

The MNIST dataset contains a large number of images of hand-written digits in the range 0 to 9, as well as the labels identifying the digit in each image.

After training, the model can classify incoming images into 10 categories (0 to 9) based on what it’s learned about handwritten images. In other words, you send an image to the model, and the model does its best to identify the digit shown in the image.

Prediction UI

In the above screenshot, the image shows a hand-written 7. This image was the input to the model. The table below the image shows a bar graph for each classification label from 0 to 9, as output by the model. Each bar represents the probability that the image matches the respective label. Judging by this screenshot, the model seems pretty confident that this image is a 7.
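To make the bar graph concrete, here is a minimal Python sketch of how ten per-class scores can be turned into probabilities and a predicted digit. The score values and the function names `softmax` and `predict_digit` are illustrative assumptions; they are not taken from the tutorial's model code.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the largest score first for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_digit(scores):
    """Return (digit, probability) for the most likely class."""
    probs = softmax(scores)
    digit = max(range(len(probs)), key=probs.__getitem__)
    return digit, probs[digit]


# Hypothetical scores for the digits 0-9; index 7 has the largest score.
example_scores = [0.1, 0.3, 0.2, 0.5, 0.1, 0.2, 0.1, 6.0, 0.4, 0.9]
digit, confidence = predict_digit(example_scores)
print(f"predicted digit: {digit}, probability: {confidence:.2f}")
```

Each entry of `softmax(example_scores)` corresponds to one bar in the graph, and the tallest bar is the label the model reports.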

The overall workflow

The following diagram shows what you accomplish by following this guide:

ML workflow for training and serving an MNIST model

In summary:

  • Setting up Kubeflow on IKS.
  • Training the model:
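    • Packaging a TensorFlow program in a container.
    • Submitting a TensorFlow training (tf.train) job.
  • Using the model for prediction (inference):
    • Serving the trained model with TensorFlow Serving.
    • Deploying a web application that uses the trained model.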
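For the distributed training step, TensorFlow replicas typically learn their role from a `TF_CONFIG` environment variable, which the Kubeflow training operator sets for each replica of a training job. The sketch below shows one way a training program might read it. The helper name `describe_replica` and the host names are hypothetical, and the tutorial's own training code may handle this differently.

```python
import json
import os


def describe_replica(environ=os.environ):
    """Read TF_CONFIG and report this replica's role in the cluster."""
    raw = environ.get("TF_CONFIG")
    if not raw:
        # No TF_CONFIG: treat the process as a single-node training run.
        return {"type": "worker", "index": 0, "is_chief": True, "num_workers": 1}

    config = json.loads(raw)
    cluster = config.get("cluster", {})
    task = config.get("task", {})
    task_type = task.get("type", "worker")
    index = int(task.get("index", 0))

    # If the cluster has no dedicated chief/master, worker 0 acts as chief.
    has_chief = bool(cluster.get("chief") or cluster.get("master"))
    is_chief = task_type in ("chief", "master") or (
        not has_chief and task_type == "worker" and index == 0
    )
    return {
        "type": task_type,
        "index": index,
        "is_chief": is_chief,
        "num_workers": len(cluster.get("worker", [])),
    }


# Hypothetical cluster layout with one parameter server and two workers.
example_env = {
    "TF_CONFIG": json.dumps({
        "cluster": {
            "ps": ["mnist-ps-0:2222"],
            "worker": ["mnist-worker-0:2222", "mnist-worker-1:2222"],
        },
        "task": {"type": "worker", "index": 1},
    })
}
print(describe_replica(example_env))
```

A common use of the `is_chief` flag is to let only one replica export the trained model, so that replicas do not overwrite each other's output.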

It’s time to get started!

Run the MNIST Tutorial on IKS

  1. Follow the IKS instructions to deploy Kubeflow.
  2. Launch a Jupyter notebook.
    • On IBM Cloud, the default NFS storage does not support installing some Python packages. Therefore, create the notebook with the setting Don't use Persistent Storage for User's home enabled.
    • Because of a notebook user permission issue, you need to use a custom image that worked in the previous version.
      • The tutorial has been tested on image: gcr.io/kubeflow-images-public/tensorflow-1.13.1-notebook-cpu:v0.5.0
  3. Launch a terminal in Jupyter and clone the Kubeflow examples repo.

    git clone https://github.com/kubeflow/examples.git git_kubeflow-examples
    • Tip: When you start a terminal in Jupyter, run the command bash to start a Bash shell, which is much friendlier than the default shell.
    • Tip: To switch to JupyterLab, change the URL of your notebook from ‘/tree’ to ‘/lab’.
  4. Open the notebook mnist/mnist_ibm.ipynb.
  5. Follow the notebook to train and deploy MNIST on Kubeflow.
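Once a model is served with TensorFlow Serving, clients can query it through the REST predict endpoint. The sketch below only builds the request and does not send it. The host `mnist-service:8501`, the model name `mnist`, and the 28x28 input shape are assumptions for illustration; the real values depend on how the notebook deploys and exports the model.

```python
import json


def build_predict_request(host, model_name, image):
    """Return (url, body) for a TensorFlow Serving REST predict call."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    # The "instances" format sends one entry per input example.
    body = json.dumps({"instances": [image]})
    return url, body


# Hypothetical input: a blank 28x28 grayscale image with values in [0, 1].
blank_image = [[0.0] * 28 for _ in range(28)]
url, body = build_predict_request("mnist-service:8501", "mnist", blank_image)
print(url)
print(len(body), "bytes of JSON")
```

The returned `url` and `body` can be passed to any HTTP client as a POST request. The shape of the response depends on the signature of the exported model.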

Last modified 06.10.2020: [IBM] update out-dated links in the docs (#2260) (1216419d)