线性支持向量机

1 介绍

  线性支持向量机是一个用于大规模分类任务的标准方法。它的目标函数线性模型中的公式(1)。它的损失函数是合页(hinge)损失,如下所示

1.1

  默认情况下,线性支持向量机训练时使用L2正则化。线性支持向量机输出一个SVM模型。给定一个新的数据点x,模型通过w^Tx的值预测,当这个值大于0时,输出为正,否则输出为负。

  线性支持向量机并不需要核函数,要详细了解支持向量机,请参考文献【1】。

2 源码分析

2.1 实例

  1. import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
  2. import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
  3. import org.apache.spark.mllib.util.MLUtils
  4. // Load training data in LIBSVM format.
  5. val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")
  6. // Split data into training (60%) and test (40%).
  7. val splits = data.randomSplit(Array(0.6, 0.4), seed = 11L)
  8. val training = splits(0).cache()
  9. val test = splits(1)
  10. // Run training algorithm to build the model
  11. val numIterations = 100
  12. val model = SVMWithSGD.train(training, numIterations)
  13. // Clear the default threshold.
  14. model.clearThreshold()
  15. // Compute raw scores on the test set.
  16. val scoreAndLabels = test.map { point =>
  17. val score = model.predict(point.features)
  18. (score, point.label)
  19. }
  20. // Get evaluation metrics.
  21. val metrics = new BinaryClassificationMetrics(scoreAndLabels)
  22. val auROC = metrics.areaUnderROC()
  23. println("Area under ROC = " + auROC)

2.2 训练

  和逻辑回归一样,训练过程均使用GeneralizedLinearModel中的run训练,只是训练使用的GradientUpdater不同。在线性支持向量机中,使用HingeGradient计算梯度,使用SquaredL2Updater进行更新。
它的实现过程分为4步。参加逻辑回归了解这五步的详细情况。我们只需要了解HingeGradientSquaredL2Updater的实现。

  1. class HingeGradient extends Gradient {
  2. override def compute(data: Vector, label: Double, weights: Vector): (Vector, Double) = {
  3. val dotProduct = dot(data, weights)
  4. // 我们的损失函数是 max(0, 1 - (2y - 1) (f_w(x)))
  5. // 所以梯度是 -(2y - 1)*x
  6. val labelScaled = 2 * label - 1.0
  7. if (1.0 > labelScaled * dotProduct) {
  8. val gradient = data.copy
  9. scal(-labelScaled, gradient)
  10. (gradient, 1.0 - labelScaled * dotProduct)
  11. } else {
  12. (Vectors.sparse(weights.size, Array.empty, Array.empty), 0.0)
  13. }
  14. }
  15. override def compute(
  16. data: Vector,
  17. label: Double,
  18. weights: Vector,
  19. cumGradient: Vector): Double = {
  20. val dotProduct = dot(data, weights)
  21. // 我们的损失函数是 max(0, 1 - (2y - 1) (f_w(x)))
  22. // 所以梯度是 -(2y - 1)*x
  23. val labelScaled = 2 * label - 1.0
  24. if (1.0 > labelScaled * dotProduct) {
  25. //cumGradient -= labelScaled * data
  26. axpy(-labelScaled, data, cumGradient)
  27. //损失值
  28. 1.0 - labelScaled * dotProduct
  29. } else {
  30. 0.0
  31. }
  32. }
  33. }

  线性支持向量机的训练使用L2正则化方法。

  1. class SquaredL2Updater extends Updater {
  2. override def compute(
  3. weightsOld: Vector,
  4. gradient: Vector,
  5. stepSize: Double,
  6. iter: Int,
  7. regParam: Double): (Vector, Double) = {
  8. // w' = w - thisIterStepSize * (gradient + regParam * w)
  9. // w' = (1 - thisIterStepSize * regParam) * w - thisIterStepSize * gradient
  10. //表示步长,即负梯度方向的大小
  11. val thisIterStepSize = stepSize / math.sqrt(iter)
  12. val brzWeights: BV[Double] = weightsOld.toBreeze.toDenseVector
  13. //正则化,brzWeights每行数据均乘以(1.0 - thisIterStepSize * regParam)
  14. brzWeights :*= (1.0 - thisIterStepSize * regParam)
  15. //y += x * a,即brzWeights -= gradient * thisInterStepSize
  16. brzAxpy(-thisIterStepSize, gradient.toBreeze, brzWeights)
  17. //正则化||w||_2
  18. val norm = brzNorm(brzWeights, 2.0)
  19. (Vectors.fromBreeze(brzWeights), 0.5 * regParam * norm * norm)
  20. }
  21. }

  该函数的实现规则是:

  1. w' = w - thisIterStepSize * (gradient + regParam * w)
  2. w' = (1 - thisIterStepSize * regParam) * w - thisIterStepSize * gradient

  这里thisIterStepSize表示参数沿负梯度方向改变的速率,它随着迭代次数的增多而减小。

2.3 预测

  1. override protected def predictPoint(
  2. dataMatrix: Vector,
  3. weightMatrix: Vector,
  4. intercept: Double) = {
  5. //w^Tx
  6. val margin = weightMatrix.toBreeze.dot(dataMatrix.toBreeze) + intercept
  7. threshold match {
  8. case Some(t) => if (margin > t) 1.0 else 0.0
  9. case None => margin
  10. }
  11. }

参考文献

【1】支持向量机通俗导论(理解SVM的三层境界)