Visualize Results in the Pipelines UI

Visualizing the results of your pipeline's components

This page shows you how to use the Kubeflow Pipelines UI to visualize output from a Kubeflow Pipelines component. For details about how to build a component, see the guide to building your own component.

Kubeflow Pipelines provides a new method of generating visualizations. See the guide to Python Based Visualizations.

Introduction

The Kubeflow Pipelines UI offers built-in support for several types of visualizations, which you can use to provide rich performance evaluation and comparison data. To make use of this programmable UI, your pipeline component must write a JSON file to the component's local filesystem. You can do this at any point during the pipeline execution.

You can view the output visualizations in the following places on the Kubeflow Pipelines UI:

  • The Run output tab shows the visualizations for all pipeline steps in the selected run. To open the tab in the Kubeflow Pipelines UI:

    • Click Experiments to see your current pipeline experiments.
    • Click the experiment name of the experiment that you want to view.
    • Click the run name of the run that you want to view.
    • Click the Run output tab.

      Output visualization from a pipeline run
  • The Artifacts tab shows the visualization for the selected pipeline step. To open the tab in the Kubeflow Pipelines UI:

    • Click Experiments to see your current pipeline experiments.
    • Click the experiment name of the experiment that you want to view.
    • Click the run name of the run that you want to view.
    • On the Graph tab, click the step representing the pipeline component that you want to view. The step details slide into view, showing the Artifacts tab.

      Table-based visualization from a pipeline component

All screenshots and code snippets on this page come from a sample pipeline that you can run directly from the Kubeflow Pipelines UI. See the sample description and links below.

Writing out metadata for the output viewers

The pipeline component must write a JSON file specifying metadata for the output viewer(s) that you want to use for visualizing the results. The file name must be /mlpipeline-ui-metadata.json, and the component must write the file to the root level of the container filesystem.

The JSON specifies an array of outputs. Each outputs entry describes the metadata for an output viewer. The JSON structure looks like this:

  {
    "version": 1,
    "outputs": [
      {
        "type": "confusion_matrix",
        "format": "csv",
        "source": "my-dir/my-matrix.csv",
        "schema": [
          {"name": "target", "type": "CATEGORY"},
          {"name": "predicted", "type": "CATEGORY"},
          {"name": "count", "type": "NUMBER"}
        ],
        "labels": "vocab"
      },
      {
        ...
      }
    ]
  }

If the component writes such a file to its container filesystem, the Kubeflow Pipelines system extracts the file, and the Kubeflow Pipelines UI uses the file to generate the specified viewer(s). The metadata specifies where to load the artifact data from. The Kubeflow Pipelines UI loads the data into memory and renders it. Note: You should keep this data at a volume that's manageable by the UI, for example by running a sampling step before exporting the file as an artifact.
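
For example, a component might downsample its results before exporting them. The snippet below is a minimal sketch and not part of the sample pipeline: the DataFrame df_predictions and the GCS path are hypothetical placeholders.

  import json

  from tensorflow.python.lib.io import file_io  # assumes TensorFlow with GCS support

  # Hypothetical: df_predictions is a large pandas DataFrame produced earlier in
  # the component. Keep only a sample that the UI can comfortably render.
  sampled = df_predictions.sample(n=min(10000, len(df_predictions)), random_state=0)

  sampled_file = 'gs://your_bucket/your_dir/sampled-predictions.csv'  # placeholder path
  with file_io.FileIO(sampled_file, 'w') as f:
    sampled.to_csv(f, header=False, index=False)

  metadata = {
    'outputs': [{
      'type': 'table',
      'format': 'csv',
      'header': list(sampled.columns),
      'source': sampled_file,
    }]
  }
  with file_io.FileIO('/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)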

The list below shows the available metadata fields that you can specify in the outputs array. Each outputs entry must have a type. Depending on the value of type, other fields may also be required, as described in the list of output viewers later on this page.

  • format: The format of the artifact data. The default is csv. Note: The only format currently available is csv.
  • header: A list of strings to be used as headers for the artifact data. For example, in a table these strings are used in the first row.
  • labels: A list of strings to be used as labels for artifact columns or rows.
  • predicted_col: Name of the predicted column.
  • schema: A list of {type, name} objects that specify the schema of the artifact data.
  • source: The full path to the data. The available locations include http, https, Amazon S3, Minio, and Google Cloud Storage. The path can contain wildcards '*', in which case the Kubeflow Pipelines UI concatenates the data from the matching source files (see the sketch after this list). For some viewers, this field can contain inlined string data instead of a path.
  • storage: Applies only to outputs of type markdown. See below.
  • target_col: Name of the target column.
  • type: Name of the viewer to be used to visualize the data. The list below shows the available types.
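
For example, a hedged sketch of a wildcard source (the bucket, path, and column names below are placeholders): the Kubeflow Pipelines UI concatenates the data from every file matching the pattern before rendering the viewer.

  # Illustrative only: the UI reads and concatenates all matching CSV shards.
  metadata = {
    'outputs': [{
      'type': 'table',
      'format': 'csv',
      'header': ['trip_id', 'fare', 'tip'],  # placeholder column names
      'source': 'gs://your_bucket/predictions/part-*.csv',  # wildcard source
    }]
  }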

Available output viewers

The sections below describe the available viewer types and the required metadata fields for each type.

Confusion matrix

Type: confusion_matrix

Required metadata fields:

  • format
  • labels
  • schema
  • source

The confusion_matrix viewer plots a confusion matrix visualization of the data from the given source path, using the schema to parse the data. The labels provide the names of the classes to be plotted on the x and y axes.

Example:

  import json

  from tensorflow.python.lib.io import file_io

  metadata = {
    'outputs' : [{
      'type': 'confusion_matrix',
      'format': 'csv',
      'schema': [
        {'name': 'target', 'type': 'CATEGORY'},
        {'name': 'predicted', 'type': 'CATEGORY'},
        {'name': 'count', 'type': 'NUMBER'},
      ],
      'source': cm_file,
      # Convert vocab to strings because, for boolean values, we want "True|False"
      # to match the CSV data.
      'labels': list(map(str, vocab)),
    }]
  }
  with file_io.FileIO('/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)
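
In this example, cm_file and vocab come from earlier steps of the sample. As a hedged illustration of how such a CSV could be produced, the sketch below builds (target, predicted, count) rows with scikit-learn; the DataFrame df_results and the output_dir variable are hypothetical.

  import os

  import pandas as pd
  from sklearn.metrics import confusion_matrix
  from tensorflow.python.lib.io import file_io

  # Hypothetical input: df_results holds the true and predicted labels.
  vocab = sorted(df_results['target'].unique())
  cm = confusion_matrix(df_results['target'], df_results['predicted'], labels=vocab)

  # Flatten the matrix into (target, predicted, count) rows to match the schema.
  data = []
  for target_index, target in enumerate(vocab):
    for predicted_index, predicted in enumerate(vocab):
      data.append((target, predicted, cm[target_index][predicted_index]))

  df_cm = pd.DataFrame(data, columns=['target', 'predicted', 'count'])
  cm_file = os.path.join(output_dir, 'confusion_matrix.csv')  # output_dir is a placeholder
  with file_io.FileIO(cm_file, 'w') as f:
    df_cm.to_csv(f, columns=['target', 'predicted', 'count'], header=False, index=False)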

Visualization on the Kubeflow Pipelines UI:

Confusion matrix visualization from a pipeline component

Markdown

Type: markdown

Required metadata fields:

  • source
  • storage

The markdown viewer renders Markdown strings on the Kubeflow Pipelines UI. The viewer can read the Markdown data from the following locations:

  • A Markdown-formatted string embedded in the source field. The value of the storage field must be inline.
  • Markdown code in a remote file, at a path specified in the source field. The storage field can contain any value except inline.

Example:

  import json

  from tensorflow.python.lib.io import file_io

  metadata = {
    'outputs' : [
      # Markdown that is hardcoded inline
      {
        'storage': 'inline',
        'source': '# Inline Markdown\n[A link](https://www.kubeflow.org/)',
        'type': 'markdown',
      },
      # Markdown that is read from a file
      {
        'source': 'gs://your_project/your_bucket/your_markdown_file',
        'type': 'markdown',
      },
    ]
  }
  with file_io.FileIO('/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)

Visualization on the Kubeflow Pipelines UI:

Markdown visualization from a pipeline component

ROC curve

Type: roc

Required metadata fields:

  • format
  • schema
  • source

The roc viewer plots a receiver operating characteristic (ROC) curve using the data from the given source path. The Kubeflow Pipelines UI assumes that the schema includes three columns with the following names:

  • fpr (false positive rate)
  • tpr (true positive rate)
  • thresholds

When viewing the ROC curve, you can hover your cursor over the ROC curve to see the threshold value used for the cursor's closest fpr and tpr values.

Example:

  import json
  import os

  import pandas as pd
  from tensorflow.python.lib.io import file_io

  df_roc = pd.DataFrame({'fpr': fpr, 'tpr': tpr, 'thresholds': thresholds})
  roc_file = os.path.join(args.output, 'roc.csv')
  with file_io.FileIO(roc_file, 'w') as f:
    df_roc.to_csv(f, columns=['fpr', 'tpr', 'thresholds'], header=False, index=False)

  metadata = {
    'outputs': [{
      'type': 'roc',
      'format': 'csv',
      'schema': [
        {'name': 'fpr', 'type': 'NUMBER'},
        {'name': 'tpr', 'type': 'NUMBER'},
        {'name': 'thresholds', 'type': 'NUMBER'},
      ],
      'source': roc_file
    }]
  }
  with file_io.FileIO('/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)
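
The fpr, tpr, and thresholds arrays in this example come from the sample's analysis step. If you need to compute them yourself, scikit-learn's roc_curve is one option; the DataFrame and column names below are placeholders.

  from sklearn.metrics import roc_curve

  # Hypothetical inputs: true labels and predicted scores for the positive class.
  fpr, tpr, thresholds = roc_curve(df_results['target'], df_results['score'], pos_label=True)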

Visualization on the Kubeflow Pipelines UI:

ROC curve visualization from a pipeline component

Table

Type: table

Required metadata fields:

  • format
  • header
  • source

The table viewer builds an HTML table out of the data at the given source path, where the header field specifies the values to be shown in the first row of the table. The table supports pagination.

Example:

  import json

  metadata = {
    'outputs' : [{
      'type': 'table',
      'storage': 'gcs',
      'format': 'csv',
      'header': [x['name'] for x in schema],
      'source': prediction_results
    }]
  }
  with open('/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)
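
Here, schema and prediction_results are produced by earlier steps of the sample. Because the header values are supplied in the metadata, the CSV itself typically contains only data rows. A minimal sketch, assuming a hypothetical DataFrame and GCS path:

  from tensorflow.python.lib.io import file_io

  # df_predictions is a hypothetical pandas DataFrame of prediction rows.
  # The UI supplies the header from the metadata, so the CSV is written without one.
  prediction_results = 'gs://your_bucket/your_dir/prediction_results.csv'  # placeholder
  with file_io.FileIO(prediction_results, 'w') as f:
    df_predictions.to_csv(f, header=False, index=False)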

Visualization on the Kubeflow Pipelines UI:

Table-based visualization from a pipeline component

TensorBoard

Type: tensorboard

Required metadata fields:

  • source

The tensorboard viewer adds a Start Tensorboard button to the output page.

When viewing the output page, you can:

  • Click Start Tensorboard to start a TensorBoard Pod in your Kubeflow cluster. The button text switches to Open Tensorboard.
  • Click Open Tensorboard to open the TensorBoard interface in a new tab, pointing to the logdir data specified in the source field.

Note: The Kubeflow Pipelines UI doesn't fully manage your TensorBoard instances. The "Start Tensorboard" button is a convenience feature so that you don't have to interrupt your workflow when looking at pipeline runs. You're responsible for recycling or deleting the TensorBoard Pods using your Kubernetes management tools.

Example:

  import json

  metadata = {
    'outputs' : [{
      'type': 'tensorboard',
      'source': args.job_dir,
    }]
  }
  with open('/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)
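
The source field simply points at a TensorBoard log directory (args.job_dir in the example above). As an illustration, the TF2-style sketch below writes a few scalar summaries into a placeholder logdir that such a viewer could display; the path and values are assumptions, not part of the sample.

  import tensorflow as tf

  # Placeholder logdir; use whatever directory your component passes as 'source'.
  writer = tf.summary.create_file_writer('gs://your_bucket/your_job_dir/logs')
  with writer.as_default():
    for step, loss in enumerate([0.9, 0.5, 0.2]):  # illustrative values
      tf.summary.scalar('loss', loss, step=step)
  writer.flush()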

Visualization on the Kubeflow Pipelines UI:

TensorBoard option output from a pipeline component

Web app

Type: web-app

Required metadata fields:

  • source

The web-app viewer provides flexibility for rendering custom output. You can specify an HTML file that your component creates, and the Kubeflow Pipelines UI renders that HTML in the output page. The HTML file must be self-contained, with no references to other files in the filesystem. The HTML file can contain absolute references to files on the web. Content running inside the web app is isolated in an iframe and cannot communicate with the Kubeflow Pipelines UI.

Example:

  import json
  import os

  from tensorflow.python.lib.io import file_io

  static_html_path = os.path.join(output_dir, _OUTPUT_HTML_FILE)
  file_io.write_string_to_file(static_html_path, rendered_template)

  metadata = {
    'outputs' : [{
      'type': 'web-app',
      'storage': 'gcs',
      'source': static_html_path,
    }]
  }
  with file_io.FileIO('/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)
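
In this example, output_dir, _OUTPUT_HTML_FILE, and rendered_template come from the sample's own template-rendering step. Any self-contained HTML string works; a minimal illustrative sketch:

  # Illustrative only: a self-contained page with no references to local files.
  rendered_template = """
  <html>
    <head><style>body { font-family: sans-serif; }</style></head>
    <body>
      <h1>Model report</h1>
      <p>All content is inline: styles, text, and any images as data URIs.</p>
    </body>
  </html>
  """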

Visualization on the Kubeflow Pipelines UI:

Web app output from a pipeline component

Source of examples on this page

The above examples come from the taxi tip prediction sample that is pre-installed when you deploy Kubeflow.

You can run the sample by selecting [Sample] ML - TFX - Taxi Tip Prediction Model Trainer from the Kubeflow Pipelines UI. For help getting started with the UI, follow the Kubeflow Pipelines quickstart. The pipeline uses a number of prebuilt, reusable components.

Next step

See how to export metrics from your pipeline.