Description

Reduces multiclass classification to binary classification using the one-vs-rest (one-against-all) strategy. For a problem with k classes, k binary models are trained, one per class, each separating that class from all the others. At prediction time every example is scored by all k models, and the class whose model gives the highest score becomes the label.
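
The strategy described above can be sketched in plain Python. This is only an illustration of the reduction, not Alink's implementation: the binary learner here is a stand-in nearest-centroid scorer, and the names `train_binary`, `score`, `fit_one_vs_rest`, and `predict_one_vs_rest`, as well as the toy data, are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def train_binary(X, y):
    """Stand-in binary learner: remember the centroids of the 1s and the 0s."""
    pos = centroid([x for x, t in zip(X, y) if t == 1])
    neg = centroid([x for x, t in zip(X, y) if t == 0])
    return pos, neg


def score(model, x):
    """Higher when x is closer to the positive centroid than to the negative one."""
    pos, neg = model
    return sq_dist(x, neg) - sq_dist(x, pos)


def fit_one_vs_rest(X, labels):
    """Train one binary model per class: that class (1) against all others (0)."""
    models = {}
    for c in sorted(set(labels)):
        binary = [1 if label == c else 0 for label in labels]
        models[c] = train_binary(X, binary)
    return models


def predict_one_vs_rest(models, x):
    """Score x with every per-class model and return the best-scoring class."""
    return max(models, key=lambda c: score(models[c], x))


# Toy data (assumed for illustration): three well-separated classes in 2D.
X = [(0, 0), (1, 0), (5, 0), (6, 0), (0, 5), (0, 6)]
labels = ["a", "a", "b", "b", "c", "c"]

models = fit_one_vs_rest(X, labels)
for point in [(0.5, 0.5), (5.5, 0.5), (0.5, 5.5)]:
    print(point, predict_one_vs_rest(models, point))
```

Any binary classifier that produces a comparable score can take the place of `train_binary` and `score`; the reduction itself is only the relabeling loop and the final argmax.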

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| numClass | Number of classes in the multiclass training task. | Integer | | |
| predictionCol | Name of the prediction column. | String | | |
| predictionDetailCol | Name of the column that holds the detailed prediction result. | String | | |
| reservedCols | Names of the columns to retain in the output table. | String[] | | null |

Script Example

Code

    from pyalink.alink import *

    # Iris data set: four numeric features and a string class label.
    URL = "http://alink-dataset.cn-hangzhou.oss.aliyun-inc.com/csv/iris.csv"
    SCHEMA_STR = "sepal_length double, sepal_width double, petal_length double, petal_width double, category string"
    data = CsvSourceBatchOp().setFilePath(URL).setSchemaStr(SCHEMA_STR)

    # Binary base classifier that OneVsRest trains once per class.
    lr = LogisticRegression() \
        .setFeatureCols(["sepal_length", "sepal_width", "petal_length", "petal_width"]) \
        .setLabelCol("category") \
        .setMaxIter(100)

    # Iris has three classes, so numClass is set to 3.
    oneVsRest = OneVsRest().setClassifier(lr).setNumClass(3)
    model = oneVsRest.fit(data)
    model.setPredictionCol("pred_result").setPredictionDetailCol("pred_detail")
    model.transform(data).print()