Description

Naive Bayes Classifier.

We support the multinomial Naive Bayes and Bernoulli Naive Bayes models, both probabilistic learning methods. Feature values in the training table must be nonnegative.
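To make the multinomial model concrete, here is a minimal NumPy sketch of the textbook formulation with additive smoothing. It is an illustration only, not the operator's actual implementation; the function names `fit_multinomial_nb` and `predict_multinomial_nb` are hypothetical and exist only for this example.

```python
import numpy as np

def fit_multinomial_nb(X, y, smoothing=1.0):
    """Estimate log priors and smoothed log feature probabilities per class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if (X < 0).any():
        raise ValueError("feature values must be nonnegative")
    classes = np.unique(y)
    # Log prior: fraction of training rows that belong to each class.
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    # Per-class feature totals, with the smoothing factor added to each feature.
    counts = np.array([X[y == c].sum(axis=0) for c in classes]) + smoothing
    log_prob = np.log(counts / counts.sum(axis=1, keepdims=True))
    return classes, log_prior, log_prob

def predict_multinomial_nb(model, X):
    """Pick the class with the highest log prior plus weighted log likelihood."""
    classes, log_prior, log_prob = model
    scores = np.asarray(X, dtype=float) @ log_prob.T + log_prior
    return classes[np.argmax(scores, axis=1)]

# Toy data (assumed for illustration): two features, two classes.
X_toy = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 4.0]]
y_toy = [0, 0, 1, 1]
model = fit_multinomial_nb(X_toy, y_toy)
print(predict_multinomial_nb(model, X_toy))
```

The nonnegativity check mirrors the constraint above: feature values act as counts, so a negative value has no meaning in this model.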

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| modelType | Model type: "Multinomial" or "Bernoulli" | String | | "Multinomial" |
| featureCols | Names of the feature columns used for training in the input table | String[] | | null |
| labelCol | Name of the label column in the input table | String | | |
| weightCol | Name of the column holding the sample weight | String | | null |
| vectorCol | Name of a vector column | String | | null |
| smoothing | The smoothing factor | Double | | 1.0 |
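The effect of `smoothing` and `modelType` can be sketched with the textbook estimators. This page does not state the exact formulas the operator uses, so treat the code below as an assumption-laden illustration; the helper names `multinomial_feature_probs` and `bernoulli_feature_probs` are hypothetical.

```python
import numpy as np

def multinomial_feature_probs(class_rows, smoothing):
    """Multinomial-style estimate: each feature's share of the class total."""
    counts = np.asarray(class_rows, dtype=float).sum(axis=0) + smoothing
    return counts / counts.sum()

def bernoulli_feature_probs(class_rows, smoothing):
    """Bernoulli-style estimate: P(feature is present | class), per feature."""
    rows = np.asarray(class_rows, dtype=float)
    present = (rows > 0).sum(axis=0)
    # Two outcomes per feature (present or absent), hence 2 * smoothing.
    return (present + smoothing) / (rows.shape[0] + 2.0 * smoothing)

# Five identical rows of one class; features 0 and 3 never occur.
rows = [[0.0, 1.0, 1.0, 0.0]] * 5

print(multinomial_feature_probs(rows, 1.0))  # unseen features get 1/14
print(multinomial_feature_probs(rows, 0.0))  # unseen features get 0
print(bernoulli_feature_probs(rows, 1.0))    # unseen features get 1/7
```

With a smoothing factor of 0, a feature never seen in a class gets probability 0, which makes the log likelihood of any row containing that feature negative infinity. A positive smoothing factor such as the default 1.0 avoids this.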

Script Example

Script

import numpy as np
import pandas as pd
from pyalink.alink import *  # assumes PyAlink is installed

# Training data: four feature columns followed by the label.
data = np.array([
    [1.0, 1.0, 0.0, 1.0, 1],
    [1.0, 0.0, 1.0, 1.0, 1],
    [1.0, 0.0, 1.0, 1.0, 1],
    [0.0, 1.0, 1.0, 0.0, 0],
    [0.0, 1.0, 1.0, 0.0, 0],
    [0.0, 1.0, 1.0, 0.0, 0],
    [0.0, 1.0, 1.0, 0.0, 0],
    [1.0, 1.0, 1.0, 1.0, 1],
    [0.0, 1.0, 1.0, 0.0, 0]])
df = pd.DataFrame({"f0": data[:, 0],
                   "f1": data[:, 1],
                   "f2": data[:, 2],
                   "f3": data[:, 3],
                   "label": data[:, 4]})
df["label"] = df["label"].astype('int')

# Load the DataFrame into a batch operator.
batchData = dataframeToOperator(df, schemaStr='f0 double, f1 double, f2 double, f3 double, label int', op_type='batch')

# Train the model on the four feature columns.
colnames = ["f0", "f1", "f2", "f3"]
ns = NaiveBayesTrainBatchOp().setFeatureCols(colnames).setLabelCol("label")
model = batchData.link(ns)

# Predict on the training data and print the result.
predictor = NaiveBayesPredictBatchOp().setPredictionCol("pred")
predictor.linkFrom(model, batchData).print()

Result

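| f0 | f1 | f2 | f3 | label | pred |
| --- | --- | --- | --- | --- | --- |
| 1.0 | 1.0 | 0.0 | 1.0 | 1 | 1 |
| 1.0 | 0.0 | 1.0 | 1.0 | 1 | 1 |
| 1.0 | 0.0 | 1.0 | 1.0 | 1 | 1 |
| 0.0 | 1.0 | 1.0 | 0.0 | 0 | 0 |
| 0.0 | 1.0 | 1.0 | 0.0 | 0 | 0 |
| 0.0 | 1.0 | 1.0 | 0.0 | 0 | 0 |
| 0.0 | 1.0 | 1.0 | 0.0 | 0 | 0 |
| 1.0 | 1.0 | 1.0 | 1.0 | 1 | 1 |
| 0.0 | 1.0 | 1.0 | 0.0 | 0 | 0 |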