OpenMLDB Node

Overview

OpenMLDB is an open-source machine learning database that provides a full-stack FeatureOps solution for production.

The OpenMLDB task plugin is used to execute tasks on an OpenMLDB cluster.

Create Task

  • Click Project Management -> Project Name -> Workflow Definition, and click the Create Workflow button to enter the DAG editing page.
  • Drag the OpenMLDB task node from the toolbar onto the canvas.

Task Parameters

Parameter | Description
--- | ---
zookeeper | OpenMLDB cluster ZooKeeper address, e.g. 127.0.0.1:2181.
zookeeper path | OpenMLDB cluster ZooKeeper path, e.g. /openmldb.
Execute Mode | The initial execution mode, offline or online. You can switch it within the SQL statement.
SQL statement | The SQL statement to execute.
Custom parameters | User-defined parameters of Python; they replace the ${variable} placeholders in the script.
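
The following is a minimal sketch of how custom parameters and the execute mode could shape the SQL that gets run. It is an illustration, not the plugin's actual implementation: the helper names render_sql and with_execute_mode and the table name demo_db.t1 are hypothetical, and Python's string.Template is used here only because it happens to share the ${variable} placeholder syntax. The SET @@execute_mode statement is the OpenMLDB SQL for switching between the offline and online modes.

```python
from string import Template


def render_sql(sql_template: str, custom_params: dict) -> str:
    """Replace ${name} placeholders with user-defined parameter values."""
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(sql_template).safe_substitute(custom_params)


def with_execute_mode(sql: str, mode: str) -> list:
    """Prefix the statement with the SQL that selects the execution mode."""
    if mode not in ("offline", "online"):
        raise ValueError(f"unsupported execute mode: {mode!r}")
    return [f"SET @@execute_mode='{mode}';", sql]


# Hypothetical task definition: one custom parameter named "table".
statements = with_execute_mode(
    render_sql("SELECT * FROM ${table} LIMIT 10;", {"table": "demo_db.t1"}),
    "offline",
)
for statement in statements:
    print(statement)
```

Because the mode is set with an ordinary SQL statement, a later SET @@execute_mode inside the task's own SQL overrides the initial choice.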

Task Examples

Load data

We use LOAD DATA to load data into the OpenMLDB cluster. Offline mode is selected here, so the data is loaded into offline storage.
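
As a rough illustration, the SQL statement field of such a task could hold a LOAD DATA INFILE statement like the one built below. The helper build_load_data, the file path, and the table name are assumptions made for this sketch; the header and mode options follow OpenMLDB's LOAD DATA syntax, but check the OpenMLDB SQL reference for the options your version supports.

```python
def build_load_data(file_path: str, table: str,
                    header: bool = True, mode: str = "append") -> str:
    """Build a LOAD DATA INFILE statement for an OpenMLDB table."""
    header_flag = "true" if header else "false"
    return (
        f"LOAD DATA INFILE '{file_path}' INTO TABLE {table} "
        f"OPTIONS(header={header_flag}, mode='{mode}');"
    )


# Hypothetical source file and target table.
load_sql = build_load_data("file:///tmp/train_sample.csv", "demo_db.t1")
print(load_sql)
```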

Feature extraction

We use SELECT INTO to do feature extraction. Offline mode is selected here, so the SQL runs on the offline engine.
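
A feature extraction statement of this kind might look like the sketch below, which appends an INTO OUTFILE clause to a windowed query. The table, columns, window definition, and output path are all hypothetical, and build_select_into is a helper invented for this example rather than part of any SDK.

```python
# Hypothetical feature query: a one-day sliding sum per user.
FEATURE_QUERY = """
SELECT
    id,
    sum(amount) OVER w AS amount_sum_1d
FROM demo_db.t1
WINDOW w AS (
    PARTITION BY user_id
    ORDER BY event_time
    ROWS_RANGE BETWEEN 1d PRECEDING AND CURRENT ROW
);
"""


def build_select_into(query: str, out_path: str, mode: str = "overwrite") -> str:
    """Append an INTO OUTFILE clause so the query result is written to a file."""
    # Drop surrounding whitespace and a trailing semicolon before appending.
    body = query.strip().rstrip(";").rstrip()
    return f"{body}\nINTO OUTFILE '{out_path}' OPTIONS(mode='{mode}');"


feature_sql = build_select_into(FEATURE_QUERY, "/tmp/feature_data")
print(feature_sql)
```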

Environment to Prepare

Start the OpenMLDB Cluster

You should create an OpenMLDB cluster first. For a production environment, see the "deploy OpenMLDB" guide.

For a quick start, follow the "run OpenMLDB in docker" guide.

Python Environment

The OpenMLDB task uses the OpenMLDB Python SDK to connect to the OpenMLDB cluster, so a Python environment is required.
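
To make the connection step concrete, here is a minimal sketch of running statements through the SDK's DB-API module, openmldb.dbapi. It assumes a reachable cluster and an installed SDK; the helper names connection_options and run_statements are hypothetical, and the script the plugin actually generates may differ.

```python
def connection_options(zookeeper: str, zookeeper_path: str) -> dict:
    """Map the task's ZooKeeper parameters to SDK connection keyword arguments."""
    return {"zk": zookeeper, "zkPath": zookeeper_path}


def run_statements(zookeeper: str, zookeeper_path: str, statements: list) -> None:
    """Execute SQL statements in order on the cluster (needs a live cluster)."""
    # Imported lazily so this module can be loaded without the SDK installed.
    import openmldb.dbapi

    conn = openmldb.dbapi.connect(**connection_options(zookeeper, zookeeper_path))
    try:
        cursor = conn.cursor()
        for statement in statements:
            cursor.execute(statement)
    finally:
        conn.close()


# Example values taken from the parameter descriptions in this document.
options = connection_options("127.0.0.1:2181", "/openmldb")
print(options)
```

Calling run_statements("127.0.0.1:2181", "/openmldb", ["SET @@execute_mode='offline';"]) would then issue the statement against the cluster registered under that ZooKeeper path.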

The task uses python3 by default. Set PYTHON_HOME to use a custom Python environment.

Make sure the OpenMLDB Python SDK is installed on the host where the worker server runs, using pip install openmldb.
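
One quick way to verify the installation is to ask the interpreter used by the worker whether the openmldb package can be found. The sketch below uses only the standard library; the function name is_installed is hypothetical.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the top-level package can be located by this interpreter."""
    return importlib.util.find_spec(package) is not None


if is_installed("openmldb"):
    print("OpenMLDB Python SDK found")
else:
    print("OpenMLDB Python SDK missing; run: pip install openmldb")
```

Run it with the same interpreter the worker uses, since a package installed for a different Python environment will not be visible to the task.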