9.6 Mathematics of the SVC

  • The SVC classifies a test observation according to which side of the hyperplane it lies on. The hyperplane is chosen by solving the following optimization problem:

\[\max_{\beta_{0}, \ldots, \beta_{p},\ \epsilon_{1}, \ldots, \epsilon_{n},\ M} \; M\]
\[\text{subject to } \sum_{j=1}^{p}\beta_{j}^2 = 1,\]
\[y_{i}(\beta_{0} + \beta_{1}X_{i1} + \beta_{2}X_{i2} + \cdots + \beta_{p}X_{ip}) \geq M(1 - \epsilon_{i}),\]
\[\epsilon_{i} \geq 0, \quad \sum_{i=1}^{n}\epsilon_{i} \leq C\]
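Once coefficients satisfying the problem above are available, the classification rule amounts to checking the sign of \(f(x^{*}) = \beta_{0} + \beta_{1}x^{*}_{1} + \cdots + \beta_{p}x^{*}_{p}\). The following is a minimal sketch of that rule, not a fitting routine; the function name `classify` and the coefficient values are hypothetical, and assigning \(f(x^{*}) = 0\) to the positive class is an arbitrary choice made here.

```python
import numpy as np

def classify(X_new, beta, beta0):
    """Assign +1 or -1 according to the side of the hyperplane.

    X_new : (m, p) array of test observations
    beta  : (p,) coefficient vector (assumed already fitted)
    beta0 : intercept (assumed already fitted)
    """
    f = beta0 + X_new @ beta          # signed score for each observation
    return np.where(f >= 0, 1, -1)    # ties at f == 0 go to +1 (arbitrary)

# Hypothetical coefficients with beta_1^2 + beta_2^2 = 1.
beta = np.array([0.6, 0.8])
beta0 = -0.5
X_new = np.array([[2.0, 1.0], [-1.0, 0.0]])
labels = classify(X_new, beta, beta0)
```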

  • \(C\) is a nonnegative tuning parameter, typically chosen through cross-validation. It can be thought of as a budget for the total amount by which the observations may violate the margin.

  • The \(\epsilon_{i}\) are slack variables that allow individual observations to be on the wrong side of the margin or the hyperplane. The value of \(\epsilon_{i}\) indicates where the \(i^{\text{th}}\) observation lies relative to the margin and the hyperplane.

    • If \(\epsilon_{i} = 0\), the observation is on the correct side of the margin.
    • If \(\epsilon_{i} > 0\), the observation is on the wrong side of the margin (it violates the margin).
    • If \(\epsilon_{i} > 1\), the observation is on the wrong side of the hyperplane.
  • Since \(C\) bounds the sum of the \(\epsilon_{i}\), it determines the number and severity of the margin violations that are tolerated. If \(C=0\), there is no budget for violations, so \(\epsilon_{1} = \cdots = \epsilon_{n} = 0\) and the problem reduces to the maximal margin classifier.

  • Note that if \(C>0\), no more than \(C\) observations can be on the wrong side of the hyperplane: each such observation has \(\epsilon_{i} > 1\), and the \(\epsilon_{i}\) must sum to at most \(C\).
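The role of the slack variables and the budget can be illustrated numerically. The sketch below assumes a hyperplane and margin width \(M\) are already given (the values are made up for illustration) and computes the smallest \(\epsilon_{i} \geq 0\) satisfying \(y_{i} f(x_{i}) \geq M(1 - \epsilon_{i})\), namely \(\epsilon_{i} = \max(0,\ 1 - y_{i} f(x_{i})/M)\). The helper names `slack_variables` and `describe` are hypothetical. Note that this \(C\) is the budget as defined in these notes; software packages may parameterize the cost differently.

```python
import numpy as np

def slack_variables(X, y, beta, beta0, M):
    """Smallest eps_i >= 0 with y_i * f(x_i) >= M * (1 - eps_i)."""
    f = beta0 + X @ beta
    return np.maximum(0.0, 1.0 - y * f / M)

def describe(eps_i):
    """Map a slack value to the observation's position."""
    if eps_i == 0:
        return "correct side of the margin"
    if eps_i <= 1:
        return "wrong side of the margin"
    return "wrong side of the hyperplane"

# Toy setup (illustrative values): hyperplane x1 = 0, margin M = 1.
beta, beta0, M = np.array([1.0, 0.0]), 0.0, 1.0
X = np.array([[2.0, 0.0], [0.5, 0.0], [-0.5, 0.0], [-3.0, 1.0]])
y = np.array([1, 1, 1, -1])

eps = slack_variables(X, y, beta, beta0, M)
positions = [describe(e) for e in eps]

# Any feasible budget C must be at least the total slack, and the
# number of observations with eps_i > 1 cannot exceed that total.
total_slack = eps.sum()
n_misclassified = int((eps > 1).sum())
```

Here the third observation has \(\epsilon_{3} = 1.5 > 1\), so it alone uses up more than one unit of the budget; this is why at most \(C\) observations can end up on the wrong side of the hyperplane.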