Description

outputColNames CAN NOT have the same colName with keepOriginColName except the selectedColName.

Parameters

Name Description Type Required? Default Value
selectedCols Names of the columns used for processing String[]
outputCols Names of the output columns String[]
reservedCols Names of the columns to be retained in the output table String[] null

Script Example

Code

  1. source = CsvSourceStreamOp()\
  2. .setSchemaStr("sepal_length double, sepal_width double, petal_length double, petal_width double, category string")\
  3. .setFilePath("http://alink-dataset.cn-hangzhou.oss.aliyun-inc.com/csv/iris.csv")
  4. udtfOp = UDTFStreamOp()\
  5. .setFunc(lambda x, y: [ (yield x + 1 + i, y + 2 + i) for i in range(3) ])\
  6. .setResultTypes(["DOUBLE", "DOUBLE"])\
  7. .setSelectedCols(['sepal_length', 'sepal_width'])\
  8. .setOutputCols(['index', 'x'])\
  9. .setReservedCols(['sepal_length', 'sepal_width'])
  10. res = udtfOp.linkFrom(source)
  11. res.print()
  12. StreamOperator.execute()

Results

  1. sepal_length sepal_width index x
  2. 1 5.2 4.1 6.2 6.1
  3. 2 5.2 4.1 7.2 7.1
  4. 3 5.2 4.1 8.2 8.1
  5. 4 5.5 2.6 6.5 4.6
  6. 5 5.5 2.6 7.5 5.6
  7. ... ... ... ... ...
  8. 96 5.7 4.4 7.7 7.4
  9. 97 5.7 4.4 8.7 8.4
  10. 98 6.5 3.0 7.5 5.0
  11. 99 6.5 3.0 8.5 6.0
  12. 100 6.5 3.0 9.5 7.0