Description

Removes duplicate records from the input, keeping one copy of each distinct row.

Parameters

Name Description Type Required? Default Value

Script Example

Code

  from pyalink.alink import *

  # Load the iris dataset from a CSV file with an explicit schema.
  URL = "http://alink-dataset.cn-hangzhou.oss.aliyun-inc.com/csv/iris.csv"
  SCHEMA_STR = "sepal_length double, sepal_width double, petal_length double, petal_width double, category string"
  data = CsvSourceBatchOp().setFilePath(URL).setSchemaStr(SCHEMA_STR)
  # Keep only the category column, then drop duplicate rows.
  data = data.select('category').link(DistinctBatchOp())
  data.print()

Result

            category
  0      Iris-setosa
  1  Iris-versicolor
  2   Iris-virginica
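
For readers who want to see the row-level semantics without a running Alink environment, the following is a minimal sketch in plain Python. It is not Alink's implementation: the helper name `distinct_rows` and the sample rows are hypothetical, and it assumes that two records count as duplicates only when every column value matches. Unlike this sketch, a distributed operator may not return rows in first-seen order.

```python
def distinct_rows(rows):
    """Return the rows with duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # a row is a duplicate only if all columns match
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical sample: single-column rows, as produced by select('category').
categories = [
    ("Iris-setosa",),
    ("Iris-setosa",),
    ("Iris-versicolor",),
    ("Iris-virginica",),
    ("Iris-versicolor",),
]
print(distinct_rows(categories))
```

With more than one column selected, rows that share a category but differ in another column would both be kept under this assumption.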