Submitter
A submitter is a pluggable module in SQLFlow that is used to submit an ML job to a third party computation service.
Workflow
When a user types in an extended SQL statement, SQLFlow first parses and semantically verifies the statement. Then SQLFlow either runs the ML job locally or submits the ML job to a third party computation service.

In the latter case, SQLFlow produces a job description (TrainDescription or PredictDescription) and hands it over to the submitter. For a training SQL, SQLFlow produces TrainDescription; for prediction SQL, SQLFlow produces PredDescription. The concrete definition of the description looks like the following
type ColumnType struct {Name string // e.g. sepal_lengthDatabaseTypeName string // e.g. FLOAT}// SELECT *// FROM iris.train// TRAIN DNNClassifier// WITH// n_classes = 3,// hidden_units = [10, 20]// COLUMN sepal_length, sepal_width, petal_length, petal_width// LABEL class// INTO sqlflow_models.my_dnn_model;type TrainDescription struct {StandardSelect string // e.g. SELECT * FROM iris.trainEstimator string // e.g. DNNClassifierAttrs map[string]string // e.g. "n_classes": "3", "hidden_units": "[10, 20]"X []ColumnType // e.g. "sepal_length": "FLOAT", ...Y ColumnType // e.g. "class": "INT"ModelName string // e.g. my_dnn_model}// SELECT *// FROM iris.test// PREDICT iris.predict.class// USING sqlflow_models.my_dnn_model;type PredDescription struct {StandardSelect string // e.g. SELECT * FROM iris.testTableName string // e.g. iris.predictModelName string // e.g. my_dnn_model}
Submitter Interface
The submitter interface should provide two functions Train and Predict. The detailed definition can be the following
type Submitter interface {// Train executes a ML training job and streams job's response through writer.// A typical Train function should include// - Loading the training data// - Initializing the model// - model.train// - Saving the trained model to a persistent storageTrain(desc TrainDescription, writer PipeWriter) error// Predict executes a ML predicting job and streams job's response through writer// A typical Predict function should include// - Loading the model from a persistent storage// - Loading the prediction data// - model.predict// - Writing the prediction result to a tablePredict(desc PredictDescription, writer PipeWriter) error}
Register a submitter
A new submitter can be added as
import (".../my_submitter"".../sqlflow/sql")func main() {// ...sql.Register(my_submitter.NewSubmitter())// ...for {sql := recv()sql.Run(sql)}}
where sql.Register will put my_submitter instance to package level registry. During sql.Run, it will check whether there is a submitter registered. If there is, sql.Run will run either submitter.Train or submitter.Predict.