Description

PCA prediction for streaming data. It requires a PCA model trained by PcaTrainBatchOp.

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| transformType | "SIMPLE" or "SUBMEAN". SIMPLE applies the model to the data as is; SUBMEAN applies it to (data - mean). | String | | "SIMPLE" |
| reservedCols | Names of the columns to be retained in the output table | String[] | | null |
| predictionCol | Name of the prediction column | String | | |
| vectorCol | Name of a vector column | String | | null |
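
The difference between the two transformType values can be illustrated with a minimal NumPy sketch. This is not Alink's implementation: the helper names fit_pca and project are hypothetical, the components are taken from a plain covariance eigendecomposition, and Alink may scale or normalize the data differently, so the numbers will not match the Result section. It only shows that SIMPLE projects the raw rows while SUBMEAN subtracts the training mean first.

```python
import numpy as np


def fit_pca(data, k):
    """Hypothetical helper: return (mean, components) for the top-k components.

    components has shape (n_features, k). This uses a covariance
    eigendecomposition, which is an assumption, not Alink's exact method.
    """
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest
    return mean, eigvecs[:, order]


def project(data, mean, components, transform_type="SIMPLE"):
    """Hypothetical helper: project rows onto the principal components."""
    if transform_type == "SIMPLE":
        return data @ components             # raw data
    if transform_type == "SUBMEAN":
        return (data - mean) @ components    # mean-centered data
    raise ValueError("transform_type must be 'SIMPLE' or 'SUBMEAN'")


data = np.array([
    [0.0, 0.0, 0.0],
    [0.1, 0.2, 0.1],
    [0.2, 0.2, 0.8],
    [9.0, 9.5, 9.7],
    [9.1, 9.1, 9.6],
    [9.2, 9.3, 9.9],
])

mean, components = fit_pca(data, k=2)
simple = project(data, mean, components, "SIMPLE")
submean = project(data, mean, components, "SUBMEAN")
print(simple)
print(submean)
```

In this sketch the two outputs differ by the same constant offset on every row (the projection of the mean), so the choice mainly affects where the origin of the projected space sits.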

Script Example

Script

    import numpy as np
    import pandas as pd
    from pyalink.alink import *

    # Six 3-dimensional samples forming two clusters
    data = np.array([
        [0.0, 0.0, 0.0],
        [0.1, 0.2, 0.1],
        [0.2, 0.2, 0.8],
        [9.0, 9.5, 9.7],
        [9.1, 9.1, 9.6],
        [9.2, 9.3, 9.9]
    ])
    df = pd.DataFrame({"x1": data[:, 0], "x2": data[:, 1], "x3": data[:, 2]})

    # batch source
    inOp = dataframeToOperator(df, schemaStr='x1 double, x2 double, x3 double', op_type='batch')

    # keep the first 2 principal components of x1, x2, x3
    trainOp = PcaTrainBatchOp()\
        .setK(2)\
        .setSelectedCols(["x1", "x2", "x3"])

    predictOp = PcaPredictBatchOp()\
        .setPredictionCol("pred")

    # batch train
    inOp.link(trainOp)

    # batch predict
    predictOp.linkFrom(trainOp, inOp)
    predictOp.print()

    # stream predict: the trained batch model is passed to the constructor
    inStreamOp = dataframeToOperator(df, schemaStr='x1 double, x2 double, x3 double', op_type='stream')
    predictStreamOp = PcaPredictStreamOp(trainOp)\
        .setPredictionCol("pred")
    predictStreamOp.linkFrom(inStreamOp)
    predictStreamOp.print()
    StreamOperator.execute()

Result

| x1 | x2 | x3 | pred |
| --- | --- | --- | --- |
| 9.0 | 9.5 | 9.7 | 3.2280384305400736,1.1516225426477789E-4 |
| 0.2 | 0.2 | 0.8 | 0.13565076707329407,0.09003329494282108 |
| 9.2 | 9.3 | 9.9 | 3.250783163664603,0.0456526246528135 |
| 9.1 | 9.1 | 9.6 | 3.182618319978973,0.027469531992220464 |
| 0.1 | 0.2 | 0.1 | 0.045855205015063565,-0.012182917696915518 |
| 0.0 | 0.0 | 0.0 | 0.0,0.0 |
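
Each pred value above holds the two principal-component coordinates as comma-separated text. If the column arrives downstream as a string in this form (an assumption based only on the printed output), it could be split into floats with a small helper like the one below. The name parse_pred is hypothetical and not part of Alink.

```python
def parse_pred(pred):
    """Split a comma-separated prediction string into a list of floats."""
    return [float(part) for part in pred.split(",")]


# Two pred strings copied from the result listing
pred_values = [
    "3.2280384305400736,1.1516225426477789E-4",
    "0.0,0.0",
]
coordinates = [parse_pred(p) for p in pred_values]
print(coordinates)
```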