Description

Isotonic regression, implemented with a parallelized pool adjacent violators (PAV) algorithm. The input can be either a single feature column or a vector column, in which case one element of the vector, selected by index, is used as the feature.
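
As an illustration of the idea behind pool adjacent violators, the following is a minimal sequential sketch in plain Python. It is not Alink's parallelized implementation, and its output is not meant to reproduce the Results section below. The function name `pav_fit`, its arguments, and the toy data are hypothetical, and the sketch assumes strictly positive weights and distinct feature values. It sorts the points by feature, then repeatedly merges neighbouring blocks whose weighted means violate the required order, which minimizes the weighted squared error under the monotonicity constraint.

```python
def pav_fit(xs, ys, weights=None, increasing=True):
    """Sequential pool-adjacent-violators fit (illustrative sketch only).

    Returns (sorted_xs, fitted), where fitted[i] is the fitted value for sorted_xs[i].
    Assumes strictly positive weights and distinct x values.
    """
    if weights is None:
        weights = [1.0] * len(xs)
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    # A decreasing fit is an increasing fit on negated labels.
    sign = 1.0 if increasing else -1.0

    # Each block holds [weighted label sum, total weight, number of points].
    blocks = []
    for i in order:
        blocks.append([sign * ys[i] * weights[i], weights[i], 1])
        # Pool while the previous block's mean exceeds the newest block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, w, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += w
            blocks[-1][2] += n

    fitted = []
    for s, w, n in blocks:
        # Every point in a block receives the block's weighted mean.
        fitted.extend([sign * s / w] * n)
    return [xs[i] for i in order], fitted


# Toy data (hypothetical): the pairs (3, 2) and (4, 3.5) violate the order and get pooled.
sorted_xs, fitted = pav_fit([1, 2, 3, 4, 5], [1, 3, 2, 4, 3.5])
print(fitted)  # [1.0, 2.5, 2.5, 3.75, 3.75]
```

Passing `increasing=False` gives a non-increasing fit, which mirrors the role of the `isotonic` parameter described below.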

Parameters

Name           Description                                                        Type     Required?  Default Value
featureCol     Name of the feature column.                                        String   no         null
isotonic       If true, the output sequence is increasing; if false, decreasing.  Boolean  no         true
featureIndex   Index of the feature within the vector.                            Integer  no         0
labelCol       Name of the label column in the input table.                       String   yes
weightCol      Name of the weight column.                                         String   no         null
vectorCol      Name of the vector column.                                         String   no         null
predictionCol  Name of the prediction column.                                     String   yes

Script Example

Code

  # Assumes PyAlink is installed; the wildcard import provides
  # dataframeToOperator and IsotonicRegression.
  from pyalink.alink import *
  import numpy as np
  import pandas as pd

  # Each row is [feature, label].
  data = np.array([
      [0.35, 1], [0.6, 1], [0.55, 1],
      [0.5, 1], [0.18, 0], [0.1, 1],
      [0.8, 1], [0.45, 0], [0.4, 1],
      [0.7, 0], [0.02, 1], [0.3, 0],
      [0.27, 1], [0.2, 0], [0.9, 1],
  ])
  df = pd.DataFrame({"feature": data[:, 0], "label": data[:, 1]})
  # The schema lists the columns in the same order as the DataFrame.
  source = dataframeToOperator(df, schemaStr="feature double, label double", op_type="batch")

  # Fit on the feature and label columns; predictions go to "result".
  regressor = IsotonicRegression() \
      .setFeatureCol("feature") \
      .setLabelCol("label") \
      .setPredictionCol("result")
  regressor.fit(source).transform(source).collectToDataframe()

Results

Model
model_id  model_info
0         {"vectorCol":"\"col2\"","featureIndex":"0","featureCol":null}
1048576   [0.02,0.3,0.35,0.45,0.5,0.7]
2097152   [0.5,0.5,0.6666666865348816,0.6666666865348816,0.75,0.75]
Prediction
col1  col2  col3  pred
1.0   0.9   1.0   0.75
0.0   0.7   1.0   0.75
1.0   0.35  1.0   0.6666666865348816
1.0   0.02  1.0   0.5
1.0   0.27  1.0   0.5
1.0   0.5   1.0   0.75
0.0   0.18  1.0   0.5
0.0   0.45  1.0   0.6666666865348816
1.0   0.8   1.0   0.75
1.0   0.6   1.0   0.75
1.0   0.4   1.0   0.6666666865348816
0.0   0.3   1.0   0.5
1.0   0.55  1.0   0.75
0.0   0.2   1.0   0.5
1.0   0.1   1.0   0.5
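
To show how the two arrays in the Model table relate to the pred column, here is a small plain-Python sketch. It assumes the first array holds feature boundaries, the second holds the fitted value at each boundary, and predictions are obtained by linear interpolation between boundaries with clamping outside the range. That rule is an assumption, not something the page states: none of the col2 values above falls strictly inside a sloped segment, so the table is equally consistent with other interpolation rules. The helper name `predict` is hypothetical.

```python
from bisect import bisect_right

# Copied from the Model table above.
BOUNDARIES = [0.02, 0.3, 0.35, 0.45, 0.5, 0.7]
VALUES = [0.5, 0.5, 0.6666666865348816, 0.6666666865348816, 0.75, 0.75]


def predict(x, boundaries=BOUNDARIES, values=VALUES):
    """Piecewise-linear lookup (assumed rule), clamped at both ends."""
    if x <= boundaries[0]:
        return values[0]
    if x >= boundaries[-1]:
        return values[-1]
    hi = bisect_right(boundaries, x)  # index of the first boundary greater than x
    lo = hi - 1
    x0, x1 = boundaries[lo], boundaries[hi]
    y0, y1 = values[lo], values[hi]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# (col2, pred) pairs copied from the Prediction table above.
rows = [
    (0.9, 0.75), (0.7, 0.75), (0.35, 0.6666666865348816), (0.02, 0.5),
    (0.27, 0.5), (0.5, 0.75), (0.18, 0.5), (0.45, 0.6666666865348816),
    (0.8, 0.75), (0.6, 0.75), (0.4, 0.6666666865348816), (0.3, 0.5),
    (0.55, 0.75), (0.2, 0.5), (0.1, 0.5),
]
for feature, pred in rows:
    print(feature, predict(feature), pred)
```

Under this assumed rule, every row's computed value equals the pred value listed in the table.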