Description

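Maps index values back to their original strings, using the mapping of a previously trained StringIndexer model referenced by modelName.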

Parameters

Name|Description|Type|Required?|Default Value
----|-----------|----|---------|-------------
modelName|Name of the model|String|Yes|
selectedCol|Name of the selected column used for processing|String|Yes|
reservedCols|Names of the columns to be retained in the output table|String[]|No|null
outputCol|Name of the output column|String|No|null

Script Example

Code

import numpy as np
import pandas as pd
from pyalink.alink import *

# Start a local PyAlink environment (skip if one is already running).
useLocalEnv(1)

data = np.array([
    ["football"],
    ["football"],
    ["football"],
    ["basketball"],
    ["basketball"],
    ["tennis"],
])
df_data = pd.DataFrame({
    "f0": data[:, 0],
})
# fit() trains on batch data, so the source must be a batch operator.
data = dataframeToOperator(df_data, schemaStr='f0 string', op_type="batch")

# Train a StringIndexer; with "frequency_asc", rarer strings get smaller indices.
stringIndexer = StringIndexer() \
    .setModelName("string_indexer_model") \
    .setSelectedCol("f0") \
    .setOutputCol("f0_indexed") \
    .setStringOrderType("frequency_asc")
indexed = stringIndexer.fit(data).transform(data)

# Map the indices back to strings with the model registered under the same name.
indexToString = IndexToString() \
    .setModelName("string_indexer_model") \
    .setSelectedCol("f0_indexed") \
    .setOutputCol("f0_indxed_unindexed")
indexToString.transform(indexed).print()

Results

f0|f0_indexed|f0_indxed_unindexed
--|----------|-------------------
football|2|football
football|2|football
football|2|football
basketball|1|basketball
basketball|1|basketball
tennis|0|tennis
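
Why these indices

The result table follows from the "frequency_asc" order used when training the indexer: the rarest string receives index 0. The sketch below reproduces that mapping and its inverse in plain Python. It is only an illustration, not Alink's implementation; the helper names build_index and index_to_string are hypothetical, and the alphabetical tie-breaking rule is an assumption of this sketch.

```python
from collections import Counter


def build_index(values):
    """Assign indices by ascending frequency (the rarest value gets 0)."""
    counts = Counter(values)
    # Sort by count, then by value so that ties are broken deterministically
    # (the tie-breaking rule is an assumption of this sketch).
    ordered = sorted(counts, key=lambda v: (counts[v], v))
    return {value: i for i, value in enumerate(ordered)}


def index_to_string(indices, string_to_index):
    """Invert the string-to-index mapping and look up each index."""
    index_to_value = {i: v for v, i in string_to_index.items()}
    return [index_to_value[i] for i in indices]


sports = ["football"] * 3 + ["basketball"] * 2 + ["tennis"]
mapping = build_index(sports)          # string -> index
indexed = [mapping[s] for s in sports]  # forward step (StringIndexer's role)
restored = index_to_string(indexed, mapping)  # reverse step (IndexToString's role)

print(mapping)
print(indexed)
print(restored == sports)
```

With the six input rows above, tennis occurs once, basketball twice, and football three times, so they map to 0, 1, and 2 respectively, which matches the f0_indexed column in the results.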