Description

Samples the input data at a given ratio, with or without replacement.

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| ratio | Sampling ratio; must be in the range [0, 1] | Double | | |
| withReplacement | Whether to sample with replacement; the default is without replacement | Boolean | | false |
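
To make the two parameters concrete, here is a minimal plain-Python sketch of ratio-based sampling. It illustrates the general idea only and is not Alink's implementation. The function name `sample_rows`, its `seed` argument, and the choice to draw a fixed `round(ratio * n)` rows in the with-replacement case are assumptions made for this example.

```python
import random


def sample_rows(rows, ratio, with_replacement=False, seed=None):
    """Illustrative ratio-based sampling (hypothetical helper, not Alink code)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in the range [0, 1]")
    rng = random.Random(seed)

    if with_replacement:
        # Assumption for this sketch: draw round(ratio * n) rows independently,
        # so the same row may appear more than once.
        k = round(ratio * len(rows))
        return [rng.choice(rows) for _ in range(k)]

    # Without replacement: keep each row independently with probability
    # `ratio`. No row repeats, and the output size varies around ratio * n.
    return [row for row in rows if rng.random() < ratio]


if __name__ == "__main__":
    rows = ["0,0,0", "0.1,0.1,0.1", "0.2,0.2,0.2",
            "9,9,9", "9.1,9.1,9.1", "9.2,9.2,9.2"]
    print(sample_rows(rows, 0.3))
    print(sample_rows(rows, 0.3, with_replacement=True))
```

In this sketch, a ratio of 0 always yields an empty sample and a ratio of 1 without replacement returns every row. Values in between give a random subset whose size is only approximately `ratio * n`.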

Script Example

Script

    import numpy as np
    import pandas as pd
    from pyalink.alink import *

    # Six rows, each holding one comma-separated vector string
    data = np.array([
        ["0,0,0"],
        ["0.1,0.1,0.1"],
        ["0.2,0.2,0.2"],
        ["9,9,9"],
        ["9.1,9.1,9.1"],
        ["9.2,9.2,9.2"]
    ])
    df = pd.DataFrame({"Y": data[:, 0]})

    # batch source
    inOp = dataframeToOperator(df, schemaStr='Y string', op_type='batch')

    # Sample at a ratio of 0.3, without replacement
    sampleOp = SampleBatchOp()\
        .setRatio(0.3)\
        .setWithReplacement(False)

    inOp.link(sampleOp).print()

Result

Y
0,0,0
0.2,0.2,0.2

Sampling is random, so the rows returned can differ from run to run.