Description

MaxAbsScaler transforms a dataset of Vector rows, rescaling each feature to the range [-1, 1] by dividing it by the maximum absolute value of that feature. MaxAbsPredict scales a dataset using the model trained by MaxAbsTrain.
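The train/predict split described above can be illustrated with a minimal NumPy sketch. It is not Alink's implementation: the helper names max_abs_fit and max_abs_transform are hypothetical, and the handling of an all-zero column is an assumption made only to keep the example safe to run.

```python
import numpy as np

def max_abs_fit(X):
    """Return the per-column maximum absolute value (plays the role of the trained model)."""
    return np.max(np.abs(X), axis=0)

def max_abs_transform(X, max_abs):
    """Divide each column by its stored maximum absolute value."""
    # Assumption: an all-zero column is left unchanged to avoid division by zero.
    safe = np.where(max_abs == 0, 1.0, max_abs)
    return X / safe

# Small illustrative dataset with two numeric features.
X = np.array([[10.0, 100.0],
              [-2.5, 9.0],
              [100.9, 1.0]])

model = max_abs_fit(X)                 # "train": one scale factor per column
scaled = max_abs_transform(X, model)   # "predict": apply the stored factors
print(scaled)
```

Because the scale factors are computed once and stored, the same factors can later be applied to new rows, which is what separating training from prediction allows.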

Parameters

Name | Description | Type | Required? | Default Value
selectedCols | Names of the columns used for processing | String[] | |

Script Example

Script

    from pyalink.alink import *
    import pandas as pd

    # Sample rows: one string column and two numeric columns.
    data = [
        ["a", 10.0, 100],
        ["b", -2.5, 9],
        ["c", 100.2, 1],
        ["d", -99.9, 100],
        ["a", 1.4, 1],
        ["b", -2.2, 9],
        ["c", 100.9, 1]
    ]
    colnames = ["col1", "col2", "col3"]
    selectedColNames = ["col2", "col3"]
    # Build the DataFrame from the row list so the numeric columns keep their types
    # (a mixed-type np.array would coerce every value to a string).
    df = pd.DataFrame(data, columns=colnames)
    inOp = dataframeToOperator(df, schemaStr='col1 string, col2 double, col3 long', op_type='batch')

    # train: compute the maximum absolute value of each selected column
    trainOp = MaxAbsScalerTrainBatchOp()\
        .setSelectedCols(selectedColNames)
    trainOp.linkFrom(inOp)

    # batch predict: apply the trained model to the batch data
    predictOp = MaxAbsScalerPredictBatchOp()
    predictOp.linkFrom(trainOp, inOp).print()

    # stream predict: apply the same model to a stream source
    sinOp = dataframeToOperator(df, schemaStr='col1 string, col2 double, col3 long', op_type='stream')
    predictStreamOp = MaxAbsScalerPredictStreamOp(trainOp)
    predictStreamOp.linkFrom(sinOp).print()
    StreamOperator.execute()

Results

      col1      col2  col3
    0    a  0.099108  1.00
    1    b -0.024777  0.09
    2    c  0.993062  0.01
    3    d -0.990089  1.00
    4    a  0.013875  0.01
    5    b -0.021804  0.09
    6    c  1.000000  0.01