Description

Calculates evaluation metrics for regression results. The following metrics are computed:

SST: Sum of Squares for Total
SSE: Sum of Squares for Error
SSR: Sum of Squares for Regression
R^2: Coefficient of Determination
R: Multiple Correlation Coefficient
MSE: Mean Squared Error
RMSE: Root Mean Squared Error
SAE/SAD: Sum of Absolute Error/Difference
MAE/MAD: Mean Absolute Error/Difference
MAPE: Mean Absolute Percentage Error
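
The metric names above follow the usual textbook definitions, which can be sketched in plain NumPy as follows. This is an illustrative sketch only: regression_metrics is a hypothetical helper, not part of the operator's API, and the conventions chosen here (SSR measured around the label mean, R taken as the square root of R^2, MAPE scaled to percent) are assumptions that may differ from what the operator does internally.

```python
import numpy as np

def regression_metrics(label, pred):
    """Textbook regression metrics (hypothetical helper, not the operator's API)."""
    y = np.asarray(label, dtype=float)
    y_hat = np.asarray(pred, dtype=float)
    n = y.size
    err = y - y_hat
    y_mean = y.mean()

    sst = np.sum((y - y_mean) ** 2)      # total variation of the labels
    sse = np.sum(err ** 2)               # variation left unexplained
    ssr = np.sum((y_hat - y_mean) ** 2)  # assumption: measured around the label mean
    r2 = 1.0 - sse / sst                 # undefined when all labels are equal
    mse = sse / n
    sae = np.sum(np.abs(err))
    return {
        "SST": sst, "SSE": sse, "SSR": ssr, "R2": r2,
        # assumption: R is the square root of R^2 when R^2 is non-negative
        "R": np.sqrt(r2) if r2 >= 0 else float("nan"),
        "MSE": mse, "RMSE": np.sqrt(mse),
        "SAE": sae, "MAE": sae / n,
        # assumption: reported in percent; undefined when any label is 0
        "MAPE": 100.0 * np.mean(np.abs(err / y)),
    }

# Toy input chosen so that no label is zero and MAPE stays defined.
m = regression_metrics(label=[1, 2, 3, 4], pred=[1, 2, 4, 3])
print(m)
```

For this toy input SSE is 2 and SST is 5, so R^2 works out to 1 - 2/5 = 0.6.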

Parameters

Name | Description | Type | Required? | Default Value
labelCol | Name of the label column in the input table | String | |
predictionCol | Name of the prediction column in the input table | String | |

Script Example

Code

import numpy as np
import pandas as pd
# PyAlink entry point; provides BatchOperator and EvalRegressionBatchOp
from pyalink.alink import *

# Each row is [prediction, label]
data = np.array([
    [0, 0],
    [8, 8],
    [1, 2],
    [9, 10],
    [3, 1],
    [10, 7]
])
df = pd.DataFrame({"pred": data[:, 0], "label": data[:, 1]})
inOp = BatchOperator.fromDataframe(df, schemaStr='pred int, label int')

# Evaluate the predictions against the labels and collect the metrics
metrics = (
    EvalRegressionBatchOp()
    .setPredictionCol("pred")
    .setLabelCol("label")
    .linkFrom(inOp)
    .collectMetrics()
)

print("Total Samples Number:", metrics.getCount())
print("SSE:", metrics.getSse())
print("SAE:", metrics.getSae())
print("RMSE:", metrics.getRmse())
print("R2:", metrics.getR2())

Results

Total Samples Number: 6.0
SSE: 15.0
SAE: 7.0
RMSE: 1.5811388300841898
R2: 0.8282442748091603
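
As a sanity check, the figures above can be reproduced by hand from the six sample rows. The sketch below uses only NumPy and assumes the standard formulas RMSE = sqrt(SSE / n) and R2 = 1 - SSE / SST; it does not call the operator, and the variable names are illustrative.

```python
import numpy as np

# Same six rows as in the script example: predictions and labels
pred = np.array([0, 8, 1, 9, 3, 10], dtype=float)
label = np.array([0, 8, 2, 10, 1, 7], dtype=float)

err = label - pred                                  # 0, 0, 1, 1, -2, -3
sse = float(np.sum(err ** 2))                       # sum of squared errors
sae = float(np.sum(np.abs(err)))                    # sum of absolute errors
rmse = float(np.sqrt(sse / label.size))             # root of the mean squared error
sst = float(np.sum((label - label.mean()) ** 2))    # total sum of squares
r2 = 1.0 - sse / sst                                # coefficient of determination

print("SSE:", sse)
print("SAE:", sae)
print("RMSE:", rmse)
print("R2:", r2)
```

The squared errors sum to 0 + 0 + 1 + 1 + 4 + 9 = 15 and the absolute errors to 7, which agrees with the reported SSE and SAE; RMSE and R2 agree with the reported values up to floating-point rounding.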