Description

The output column name must not be the same as any reserved column name, unless it is the name of the selected column.
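The naming rule above can be sketched as a small check in plain Python. This is one reading of the rule, not Alink's own validation code: the helper `check_output_col` is hypothetical and exists only for illustration.

```python
def check_output_col(selected_cols, output_col, reserved_cols=None):
    """Hypothetical helper (not part of Alink).

    Raises ValueError when output_col clashes with a reserved column,
    unless that name is also one of the selected columns.
    """
    reserved = set(reserved_cols or [])
    selected = set(selected_cols)
    if output_col in reserved and output_col not in selected:
        raise ValueError(
            "output column %r clashes with a reserved column" % output_col
        )
    return True


# A fresh output name next to a reserved column passes the check.
check_output_col(["sepal_length"], "sepal_length_t", ["sepal_width"])

# Reusing the selected column's name passes even if it is also reserved.
check_output_col(["sepal_length"], "sepal_length", ["sepal_length", "sepal_width"])
```

Under this reading, calling `check_output_col(["sepal_length"], "sepal_width", ["sepal_width"])` raises `ValueError`, because `sepal_width` is reserved and is not a selected column.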

Parameters

Name | Description | Type | Required? | Default Value
--- | --- | --- | --- | ---
selectedCols | Names of the columns used for processing | String[] | |
outputCol | Name of the output column | String | |
reservedCols | Names of the columns to be retained in the output table | String[] | | null

Script Example

Code

    from pyalink.alink import *

    # A UDF is a class with an eval method that returns one value per row.
    class PlusOne(object):
        def eval(self, x):
            return x + 1

    # Read the iris data set from a CSV file.
    source = CsvSourceBatchOp()\
        .setSchemaStr("sepal_length double, sepal_width double, petal_length double, petal_width double, category string")\
        .setFilePath("http://alink-dataset.cn-hangzhou.oss.aliyun-inc.com/csv/iris.csv")

    # Apply PlusOne to sepal_length, write the result to sepal_length_t,
    # and keep sepal_width in the output table.
    udfOp = UDFBatchOp() \
        .setFunc(PlusOne()) \
        .setResultType("DOUBLE") \
        .setSelectedCols(['sepal_length']) \
        .setOutputCol('sepal_length_t') \
        .setReservedCols(['sepal_width'])

    res = udfOp.linkFrom(source)
    res.firstN(10).print()
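To make the roles of the three parameters concrete, the following is a minimal plain-Python sketch of what the operator does to each row. It does not use Alink at all: `apply_udf` is a hypothetical stand-in, the two input rows are made-up sample values rather than rows from the iris file, and the assumption that `eval` receives the selected column values as positional arguments is inferred from the example above.

```python
class PlusOne(object):
    def eval(self, x):
        return x + 1


def apply_udf(rows, func, selected_cols, output_col, reserved_cols):
    """Hypothetical stand-in for the operator, written in plain Python."""
    result = []
    for row in rows:
        # Pass the selected column values to eval as positional arguments.
        args = [row[c] for c in selected_cols]
        new_row = {output_col: func.eval(*args)}
        # Copy the reserved columns through unchanged.
        for c in reserved_cols:
            if c != output_col:
                new_row[c] = row[c]
        result.append(new_row)
    return result


# Made-up sample rows, for illustration only.
rows = [
    {"sepal_length": 5.0, "sepal_width": 3.2, "category": "a"},
    {"sepal_length": 6.5, "sepal_width": 3.0, "category": "b"},
]

result = apply_udf(
    rows, PlusOne(), ["sepal_length"], "sepal_length_t", ["sepal_width"]
)
for r in result:
    print(r)
```

Each output row holds only the new column `sepal_length_t` and the reserved column `sepal_width`. Columns that are neither the output column nor reserved, such as `category` here, are dropped.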

Results

       sepal_length_t  sepal_width
    0             6.0          3.2
    1             7.6          3.0
    2             6.4          3.9
    3             6.0          2.3
    4             6.1          3.5
    5             6.0          2.0
    6             6.5          3.5
    7             7.2          3.4
    8             6.6          2.7
    9             7.8          2.8