Description

A batch operator that predicts data using a binary GBDT model.

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| predictionCol | Column name of the prediction. | String | | |
| predictionDetailCol | Column name of the detailed prediction result, which includes the probability of each label. | String | | |
| reservedCols | Names of the columns to be retained in the output table. | String[] | | null |
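To make the effect of these three parameters concrete, the sketch below models which columns end up in the output table. It is plain Python rather than a pyalink call: `output_columns` is a hypothetical helper, and it assumes that leaving `reservedCols` as null keeps every input column, which is what the batch result on this page suggests.

```python
def output_columns(input_cols, prediction_col, prediction_detail_col=None,
                   reserved_cols=None):
    """Model the output schema (illustrative helper, not a pyalink API).

    Assumption: reserved_cols=None keeps every input column.
    """
    kept = list(input_cols) if reserved_cols is None else list(reserved_cols)
    unknown = [c for c in kept if c not in input_cols]
    if unknown:
        raise ValueError(f"reserved columns not in input: {unknown}")
    # Retained columns come first, followed by the prediction columns.
    out = kept + [prediction_col]
    if prediction_detail_col is not None:
        out.append(prediction_detail_col)
    return out


cols = ["f0", "f1", "f2", "f3", "label"]
print(output_columns(cols, "pred", "pred_detail"))
# ['f0', 'f1', 'f2', 'f3', 'label', 'pred', 'pred_detail']
print(output_columns(cols, "pred", "pred_detail", reserved_cols=["label"]))
# ['label', 'pred', 'pred_detail']
```

Under this assumption, restricting `reservedCols` to the label column would leave a compact table with only the true label, the prediction, and the prediction detail.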

Script Example

Script

import numpy as np
import pandas as pd
from pyalink.alink import *


def exampleData():
    # Four rows: features f0..f3 followed by a binary label.
    # NumPy stores this mixed-type array as strings; the schemaStr
    # used below declares the intended column types.
    return np.array([
        [1.0, "A", 0, 0, 0],
        [2.0, "B", 1, 1, 0],
        [3.0, "C", 2, 2, 1],
        [4.0, "D", 3, 3, 1]
    ])


def sourceFrame():
    # Wrap the raw array in a DataFrame with named columns.
    data = exampleData()
    return pd.DataFrame({
        "f0": data[:, 0],
        "f1": data[:, 1],
        "f2": data[:, 2],
        "f3": data[:, 3],
        "label": data[:, 4]
    })


def batchSource():
    # Batch operator over the example data.
    return dataframeToOperator(
        sourceFrame(),
        schemaStr='''
        f0 double,
        f1 string,
        f2 int,
        f3 int,
        label int
        ''',
        op_type='batch'
    )


def streamSource():
    # Stream operator over the same example data.
    return dataframeToOperator(
        sourceFrame(),
        schemaStr='''
        f0 double,
        f1 string,
        f2 int,
        f3 int,
        label int
        ''',
        op_type='stream'
    )


# Train a small GBDT model: 3 trees, learning rate 1.0.
trainOp = (
    GbdtTrainBatchOp()
    .setLearningRate(1.0)
    .setNumTrees(3)
    .setMinSamplesPerLeaf(1)
    .setLabelCol('label')
    .setFeatureCols(['f0', 'f1', 'f2', 'f3'])
)

# Batch prediction: the first input is the trained model, the second is the data.
predictBatchOp = (
    GbdtPredictBatchOp()
    .setPredictionDetailCol('pred_detail')
    .setPredictionCol('pred')
)

(
    predictBatchOp
    .linkFrom(
        batchSource().link(trainOp),
        batchSource()
    )
    .print()
)

# Stream prediction: the trained model is passed to the constructor.
predictStreamOp = (
    GbdtPredictStreamOp(
        batchSource().link(trainOp)
    )
    .setPredictionDetailCol('pred_detail')
    .setPredictionCol('pred')
)

(
    predictStreamOp
    .linkFrom(
        streamSource()
    )
    .print()
)

# Start the stream job.
StreamOperator.execute()

Result

Batch prediction

   f0   f1  f2  f3  label  pred  pred_detail
0  1.0  A   0   0   0      0     {"0":0.9849144951094335,"1":0.015085504890566462}
1  2.0  B   1   1   0      0     {"0":0.9849144951094335,"1":0.015085504890566462}
2  3.0  C   2   2   1      1     {"0":0.01508550489056637,"1":0.9849144951094336}
3  4.0  D   3   3   1      1     {"0":0.01508550489056637,"1":0.9849144951094336}
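The `pred_detail` column holds a JSON string that maps each label to its predicted probability. The following minimal sketch, using only the Python standard library, parses the values shown above and checks that `pred` is the most probable label in each row. The helper names `parse_detail` and `most_likely_label` are hypothetical, and the rows are copied by hand from the result rather than read from an operator.

```python
import json

# (pred, pred_detail) pairs copied from the batch prediction result.
rows = [
    (0, '{"0":0.9849144951094335,"1":0.015085504890566462}'),
    (0, '{"0":0.9849144951094335,"1":0.015085504890566462}'),
    (1, '{"0":0.01508550489056637,"1":0.9849144951094336}'),
    (1, '{"0":0.01508550489056637,"1":0.9849144951094336}'),
]


def parse_detail(detail):
    """Turn a pred_detail JSON string into a {label: probability} dict."""
    return {label: float(p) for label, p in json.loads(detail).items()}


def most_likely_label(detail):
    """Return the label (as a string key) with the highest probability."""
    probs = parse_detail(detail)
    return max(probs, key=probs.get)


for pred, detail in rows:
    probs = parse_detail(detail)
    # The two class probabilities sum to 1 up to rounding error.
    assert abs(sum(probs.values()) - 1.0) < 1e-9
    # In this result, pred matches the most probable label.
    assert most_likely_label(detail) == str(pred)
```

Note that the JSON keys are strings, so a label read from `pred_detail` has to be converted before it is compared with the integer `pred` column.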