Description

Calculate the evaluation metrics for binary classification.

Provide either the label column together with the predResult column, or the label column together with the predDetail column. If the predDetail column is given, the predResult column is ignored.

PositiveValue is optional. If given, it is placed first in the output label array. If not given, the labels are sorted in descending order.
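The ordering rule above can be illustrated with a small plain-Python sketch. The helper order_labels is hypothetical and is not part of Alink. It assumes "descending order" means an ordinary descending sort of the label strings, and raising an error for a positive value that does not appear among the labels is likewise an assumption.

```python
def order_labels(labels, positive_value=None):
    """Hypothetical helper: distinct labels with the positive label first."""
    # Assumption: "descending order" is a plain descending sort of the labels.
    distinct = sorted(set(labels), reverse=True)
    if positive_value is None:
        return distinct
    if positive_value not in distinct:
        # Assumption: an unknown positive value is treated as an error here.
        raise ValueError(f"positive value {positive_value!r} not found in labels")
    # Put the positive label first and keep the remaining labels in order.
    return [positive_value] + [label for label in distinct if label != positive_value]


labels = ["prefix1", "prefix1", "prefix1", "prefix0", "prefix0"]
default_order = order_labels(labels)              # no positive value given
explicit_order = order_labels(labels, "prefix0")  # positive value given
print(default_order)
print(explicit_order)
```

With the labels used in the script example below, the default ordering makes "prefix1" the first (positive) label, while passing "prefix0" moves it to the front instead.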

Parameters

Name | Description | Type | Required? | Default Value
labelCol | Name of the label column in the input table | String | |
predictionDetailCol | Name of the prediction detail column, which holds detailed prediction info | String | |
positiveLabelValueString | Positive label value, in string format | String | | null

Script Example

Code

  import numpy as np
  import pandas as pd
  # Assumes PyAlink is installed; this import provides BatchOperator and EvalBinaryClassBatchOp.
  from pyalink.alink import *

  # Each row: true label, then the prediction detail as a JSON string of label -> probability.
  data = np.array([
      ["prefix1", "{\"prefix1\": 0.9, \"prefix0\": 0.1}"],
      ["prefix1", "{\"prefix1\": 0.8, \"prefix0\": 0.2}"],
      ["prefix1", "{\"prefix1\": 0.7, \"prefix0\": 0.3}"],
      ["prefix0", "{\"prefix1\": 0.75, \"prefix0\": 0.25}"],
      ["prefix0", "{\"prefix1\": 0.6, \"prefix0\": 0.4}"]])
  df = pd.DataFrame({"label": data[:, 0], "detailInput": data[:, 1]})
  inOp = BatchOperator.fromDataframe(df, schemaStr='label string, detailInput string')

  # Evaluate using the label column and the prediction detail column.
  metrics = (
      EvalBinaryClassBatchOp()
      .setLabelCol("label")
      .setPredictionDetailCol("detailInput")
      .linkFrom(inOp)
      .collectMetrics()
  )
  print("AUC:", metrics.getAuc())
  print("KS:", metrics.getKs())
  print("PRC:", metrics.getPrc())
  print("Accuracy:", metrics.getAccuracy())
  print("Macro Precision:", metrics.getMacroPrecision())
  print("Micro Recall:", metrics.getMicroRecall())
  print("Weighted Sensitivity:", metrics.getWeightedSensitivity())

Results

  AUC: 0.8333333333333333
  KS: 0.6666666666666666
  PRC: 0.9027777777777777
  Accuracy: 0.6
  Macro Precision: 0.3
  Micro Recall: 0.6
  Weighted Sensitivity: 0.6
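As a rough cross-check of the AUC, KS, and Accuracy values above, the same five rows can be evaluated in plain Python without Alink. This is a minimal sketch using textbook definitions (pairwise AUC, maximum TPR minus FPR for KS), not a description of how Alink computes them internally. It assumes "prefix1" is the positive label, as the descending sort implies, and that accuracy uses a 0.5 probability threshold.

```python
import json

# Same rows as the script example: true label and prediction detail JSON.
rows = [
    ("prefix1", '{"prefix1": 0.9, "prefix0": 0.1}'),
    ("prefix1", '{"prefix1": 0.8, "prefix0": 0.2}'),
    ("prefix1", '{"prefix1": 0.7, "prefix0": 0.3}'),
    ("prefix0", '{"prefix1": 0.75, "prefix0": 0.25}'),
    ("prefix0", '{"prefix1": 0.6, "prefix0": 0.4}'),
]
positive = "prefix1"  # assumption: first label after a descending sort

# Pair each positive-class probability with whether the row is truly positive.
scored = [(json.loads(detail)[positive], label == positive) for label, detail in rows]
pos = [score for score, is_pos in scored if is_pos]
neg = [score for score, is_pos in scored if not is_pos]

# AUC: fraction of (positive, negative) pairs ranked correctly; ties count half.
auc = sum(
    1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
) / (len(pos) * len(neg))

# KS: largest gap between true positive rate and false positive rate
# over all candidate thresholds.
ks = max(
    sum(p >= t for p in pos) / len(pos) - sum(n >= t for n in neg) / len(neg)
    for t in sorted({score for score, _ in scored})
)

# Accuracy: predict positive when the probability is at least 0.5 (assumed threshold).
accuracy = sum((score >= 0.5) == is_pos for score, is_pos in scored) / len(scored)

print("AUC:", auc)
print("KS:", ks)
print("Accuracy:", accuracy)
```

Under these assumptions the three values agree with the reported results up to floating-point rounding: every row is predicted positive at the 0.5 threshold, so three of the five rows are classified correctly.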