Description

MinMaxScaler transforms a dataset of Vector rows, rescaling each feature to a specific range [min, max] (often [0, 1]). MinMaxScalerPredict scales the dataset using a model trained by MinMaxScalerTrain (VectorMinMaxScalerTrainBatchOp in the example below).
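The rescaling described above can be sketched in plain NumPy. This is only an illustration of the usual min-max formula, not the operator's actual implementation: the function name `min_max_scale` and its parameters `new_min` and `new_max` are hypothetical, and mapping constant features to the midpoint of the target range is an assumption of this sketch that this page does not confirm.

```python
import numpy as np

def min_max_scale(X, new_min=0.0, new_max=1.0):
    """Linearly rescale each column of X so it spans [new_min, new_max]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)          # per-feature minimum
    col_max = X.max(axis=0)          # per-feature maximum
    span = col_max - col_min
    constant = span == 0             # features with a single distinct value
    safe_span = np.where(constant, 1.0, span)  # avoid division by zero
    unit = (X - col_min) / safe_span           # values now lie in [0, 1]
    unit[:, constant] = 0.5          # assumption: constant features go to the midpoint
    return unit * (new_max - new_min) + new_min

# Three sample vectors with two features each.
X = [[10.0, 100.0], [-2.5, 9.0], [100.2, 1.0]]
scaled = min_max_scale(X)                     # default target range [0, 1]
scaled_sym = min_max_scale(X, -1.0, 1.0)      # custom target range [-1, 1]
```

With the default range, the smallest value of every feature maps to 0 and the largest to 1; other values fall proportionally in between.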

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| selectedCol | Name of the selected column used for processing | String | | |
| min | Lower bound after transformation. | Double | | 0.0 |
| max | Upper bound after transformation. | Double | | 1.0 |

Script Example

Script

    import numpy as np
    import pandas as pd
    from pyalink.alink import *

    # Each row holds a label and a vector encoded as a comma-separated string.
    data = np.array([["a", "10.0, 100"],
                     ["b", "-2.5, 9"],
                     ["c", "100.2, 1"],
                     ["d", "-99.9, 100"],
                     ["a", "1.4, 1"],
                     ["b", "-2.2, 9"],
                     ["c", "100.9, 1"]])
    df = pd.DataFrame({"col": data[:, 0], "vec": data[:, 1]})

    # Build a batch source for training and a stream source for prediction.
    data = dataframeToOperator(df, schemaStr="col string, vec string", op_type="batch")
    dataStream = dataframeToOperator(df, schemaStr="col string, vec string", op_type="stream")

    # Train the min-max model on the vector column.
    trainOp = VectorMinMaxScalerTrainBatchOp().setSelectedCol("vec")
    model = trainOp.linkFrom(data)

    # Apply the trained model to the stream and print the scaled vectors.
    streamPredictOp = VectorMinMaxScalerPredictStreamOp(model)
    streamPredictOp.linkFrom(dataStream).print()
    StreamOperator.execute()

Result

| col | vec |
| --- | --- |
| a | 0.5473107569721115,1.0 |
| b | 0.4850597609561753,0.08080808080808081 |
| c | 0.9965139442231076,0.0 |
| d | 0.0,1.0 |
| a | 0.5044820717131474,0.0 |
| b | 0.4865537848605578,0.08080808080808081 |
| c | 1.0,0.0 |
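As a sanity check, the values in the result can be recomputed without Alink. The sketch below uses only NumPy, parses the same vector strings as the script example, and applies the default [0, 1] range. It is a cross-check under those assumptions, not a description of how the operator works internally.

```python
import numpy as np

# Vector strings copied from the script example, in the same row order.
vec_strings = ["10.0, 100", "-2.5, 9", "100.2, 1", "-99.9, 100",
               "1.4, 1", "-2.2, 9", "100.9, 1"]

# Parse each comma-separated string into a row of floats.
X = np.array([[float(v) for v in s.split(",")] for s in vec_strings])

# Per-feature minimum and maximum over the whole dataset.
col_min, col_max = X.min(axis=0), X.max(axis=0)

# Min-max scaling to the default range [0, 1].
scaled = (X - col_min) / (col_max - col_min)

# Print one scaled vector per line for comparison with the result table.
for row in scaled:
    print(",".join(str(float(v)) for v in row))
```

The printed rows should agree with the result table up to floating-point rounding. The first feature ranges from -99.9 to 100.9 and the second from 1 to 100, which is why row `d` maps to 0.0 and the last row `c` to 1.0 in the first component.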