Description

Isotonic regression, implemented with a parallelized pool adjacent violators algorithm (PAVA).
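To illustrate the idea behind pool adjacent violators, here is a minimal sequential sketch in plain Python. It is not the operator's parallelized implementation; the function name `pava` and the sample labels are hypothetical, and the sketch assumes the labels are already sorted by feature and that all weights are positive.

```python
def pava(labels, weights=None):
    """Non-decreasing least-squares fit to labels that are sorted by feature."""
    if weights is None:
        weights = [1.0] * len(labels)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for y, w in zip(labels, weights):
        blocks.append([float(y), float(w), 1])
        # Pool the last two blocks while they violate the monotone order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand every block back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical labels, ordered by increasing feature value.
fit = pava([1, 0, 1, 1, 0])
print(fit)
```

In this small example the first two points are pooled to 0.5 and the last three to 2/3, so the fitted sequence never decreases.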

Parameters

Name           Description                 Type    Required?  Default Value
predictionCol  Column name of prediction.  String

Script Example

Code

  import numpy as np
  import pandas as pd
  from pyalink.alink import *  # provides dataframeToOperator and the batch operators

  # Each row is [feature, label].
  data = np.array([[0.35, 1],
                   [0.6, 1],
                   [0.55, 1],
                   [0.5, 1],
                   [0.18, 0],
                   [0.1, 1],
                   [0.8, 1],
                   [0.45, 0],
                   [0.4, 1],
                   [0.7, 0],
                   [0.02, 1],
                   [0.3, 0],
                   [0.27, 1],
                   [0.2, 0],
                   [0.9, 1]])
  df = pd.DataFrame({"feature": data[:, 0], "label": data[:, 1]})
  # The schema lists the columns in the same order as the DataFrame (feature, then label).
  data = dataframeToOperator(df, schemaStr="feature double, label double", op_type="batch")

  # Train the isotonic regression model.
  trainOp = IsotonicRegTrainBatchOp()\
      .setFeatureCol("feature")\
      .setLabelCol("label")
  model = trainOp.linkFrom(data)

  # Apply the model and write the predictions to the "result" column.
  predictOp = IsotonicRegPredictBatchOp().setPredictionCol("result")
  predictOp.linkFrom(model, data).collectToDataframe()

Results

Model
model_id  model_info
0         {"vectorCol":"\"col2\"","featureIndex":"0","featureCol":null}
1048576   [0.02,0.3,0.35,0.45,0.5,0.7]
2097152   [0.5,0.5,0.6666666865348816,0.6666666865348816,0.75,0.75]
Prediction
col1 col2 col3 pred
1.0 0.9 1.0 0.75
0.0 0.7 1.0 0.75
1.0 0.35 1.0 0.6666666865348816
1.0 0.02 1.0 0.5
1.0 0.27 1.0 0.5
1.0 0.5 1.0 0.75
0.0 0.18 1.0 0.5
0.0 0.45 1.0 0.6666666865348816
1.0 0.8 1.0 0.75
1.0 0.6 1.0 0.75
1.0 0.4 1.0 0.6666666865348816
0.0 0.3 1.0 0.5
1.0 0.55 1.0 0.75
0.0 0.2 1.0 0.5
1.0 0.1 1.0 0.5
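
The two tables above are consistent with one reading of the model: the second model row holds feature boundaries, the third holds the fitted value at each boundary, and a prediction is obtained by linear interpolation between boundaries, with values outside the range clamped to the nearest end. This is an interpretation of the output, not documented behavior of the operator. The sketch below uses NumPy rather than Alink, and the helper name `predict` is hypothetical.

```python
import numpy as np

# Copied from the Model table: boundaries (row 1048576) and fitted values (row 2097152).
boundaries = np.array([0.02, 0.3, 0.35, 0.45, 0.5, 0.7])
values = np.array([0.5, 0.5, 0.6666666865348816, 0.6666666865348816, 0.75, 0.75])


def predict(feature):
    # np.interp interpolates linearly between the boundaries and
    # returns the first or last value for inputs outside their range.
    return float(np.interp(feature, boundaries, values))


# Feature values taken from the col2 column of the Prediction table.
for x in (0.1, 0.4, 0.55, 0.9):
    print(x, predict(x))
```

Under this reading, the features 0.8 and 0.9 lie beyond the last boundary (0.7) and therefore receive its value, 0.75, which agrees with the pred column.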