Description

Converts a vector column into triple format: each vector element becomes one output row holding its column index and its value.

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| handleInvalid | Strategy to handle unseen token | String | | "ERROR" |
| tripleColumnValueSchemaStr | Schema string of the triple's column and value column | String | | |
| reservedCols | Names of the columns to be retained in the output table | String[] | | [] |
| vectorCol | Name of a vector column | String | | |

Script Example

Code

from pyalink.alink import *
import pandas as pd

# Build the DataFrame directly so f0 and f1 stay numeric
# (np.array would coerce every cell to a string).
df = pd.DataFrame([
    ['1', '{"f1":"1.0","f2":"2.0"}', '$3$1:1.0 2:2.0', '1:1.0,2:2.0', '1.0,2.0', 1.0, 2.0],
    ['2', '{"f2":"4.0","f4":"8.0"}', '$3$1:4.0 2:8.0', '1:4.0,2:8.0', '4.0,8.0', 4.0, 8.0],
], columns=["row", "json", "vec", "kv", "csv", "f0", "f1"])
data = dataframeToOperator(df, schemaStr="row string, json string, vec string, kv string, csv string, f0 double, f1 double", op_type="stream")

# Expand the sparse vector in "vec" into (col, val) rows, keeping "row" as the key.
op = VectorToTripleStreamOp() \
    .setVectorCol("vec") \
    .setReservedCols(["row"]) \
    .setTripleColValSchemaStr("col string, val double") \
    .linkFrom(data)
op.print()
StreamOperator.execute()

Results

|row|col|val|
|-|-|---|
|1|1|1.0|
|1|2|2.0|
|2|1|4.0|
|2|2|8.0|
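
To make the conversion concrete, the rows above can be reproduced in plain Python without Alink. This is only an illustrative sketch, not the operator's implementation: the helper name `vector_to_triples` is hypothetical, and the code assumes the sparse vector string has the form `$size$index:value index:value` as in the example input.

```python
def vector_to_triples(row_id, vec):
    """Expand a sparse vector string such as '$3$1:1.0 2:2.0' into (row, col, val) tuples."""
    # Drop the optional '$size$' prefix that declares the vector length.
    if vec.startswith("$"):
        vec = vec.split("$", 2)[2]
    triples = []
    for item in vec.split():
        index, value = item.split(":")
        # col stays a string and val becomes a float, mirroring "col string, val double".
        triples.append((row_id, index, float(value)))
    return triples


# The "row" and "vec" columns from the example input.
rows = [("1", "$3$1:1.0 2:2.0"), ("2", "$3$1:4.0 2:8.0")]
triples = [t for row_id, vec in rows for t in vector_to_triples(row_id, vec)]
for t in triples:
    print(t)
```

Each input row produces one output tuple per stored vector element, which is why two input rows become four result rows.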