Description

Supports SQL SELECT statements. This operator can be used in a Pipeline.

Parameters

Name | Description | Type | Required? | Default Value
clause | Operation clause. | String | |

Script Example

Code

    # PyAlink entry point; provides BatchOperator and Select.
    from pyalink.alink import *
    import pandas as pd

    # Build the rows with pandas directly so that age stays an integer column
    # (a mixed-type NumPy array would coerce every value to a string).
    df = pd.DataFrame([
        [14, "Tony"],
        [35, "Tommy"],
        [72, "Tongli"],
    ])
    schema = "age int, name string"
    source = BatchOperator.fromDataframe(df, schema)

    # Bucket each age into a class (0: under 18, 1: 18 to 59, 2: 60 and over) and keep the name.
    select = Select().setClause("CASE WHEN age < 18 THEN 0 WHEN age >= 18 AND age < 60 THEN 1 ELSE 2 END AS class, name")
    select.transform(source).print()
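
To check what the clause in the example computes without an Alink environment, the same expression can be run against an in-memory SQLite table. This is only a sketch: sqlite3 is swapped in for illustration and is not the engine the operator uses, the helper run_clause and the table name source are hypothetical, and it assumes the clause is the column list that would follow the SELECT keyword.

```python
import sqlite3

# The clause from the example above, unchanged.
CLAUSE = ("CASE WHEN age < 18 THEN 0 "
          "WHEN age >= 18 AND age < 60 THEN 1 "
          "ELSE 2 END AS class, name")


def run_clause(clause, records):
    """Apply a select clause to (age, name) records using in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE source (age INTEGER, name TEXT)")
        conn.executemany("INSERT INTO source VALUES (?, ?)", records)
        # Assumption: the clause is the column list that follows SELECT.
        query = f"SELECT {clause} FROM source ORDER BY rowid"
        return conn.execute(query).fetchall()
    finally:
        conn.close()


rows = run_clause(CLAUSE, [(14, "Tony"), (35, "Tommy"), (72, "Tongli")])
for age_class, name in rows:
    print(age_class, name)
```

With the three sample rows, Tony falls in class 0, Tommy in class 1, and Tongli in class 2.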