Description
Stochastic Outlier Selection algorithm.
Parameters
| Name |
Description |
Type |
Required? |
Default Value |
| perplexity |
Perplexity |
Double |
|
4.0 |
| vectorCol |
Name of a vector column |
String |
✓ |
|
| predictionCol |
Column name of prediction. |
String |
✓ |
Script Example
Code
data = np.array([ ["0.0,0.0"], ["0.0,1.0"], ["1.0,0.0"], ["1.0,1.0"], ["5.0,5.0"],])df_data = pd.DataFrame({ "features": data[:, 0],})data = dataframeToOperator(df_data, schemaStr='features string', op_type='batch')sos = SosBatchOp().setVectorCol("features").setPredictionCol("outlier_score").setPerplexity(3.0)output = sos.linkFrom(data)output.print()
Results
| features |
outlier_score |
| 1.0,1.0 |
0.12396819612216292 |
| 0.0,0.0 |
0.27815186043725715 |
| 0.0,1.0 |
0.24136320497783578 |
| 1.0,0.0 |
0.24136320497783578 |
| 5.0,5.0 |
0.9998106220648153 |