9.6 Mathematics of the SVC
- The SVC classifies a test observation based on which side of the hyperplane it lies.
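The classification rule above can be sketched in a few lines. This is a minimal illustration, assuming the coefficients are already known; the function name `classify`, the example coefficients, and the convention of assigning points exactly on the hyperplane to class +1 are all assumptions made here, not part of the text.

```python
import numpy as np

def classify(x_new, beta0, beta):
    """Assign +1 or -1 according to the side of the hyperplane x_new falls on."""
    # f(x) = beta0 + beta_1 * x_1 + ... + beta_p * x_p
    f = beta0 + np.dot(beta, x_new)
    # Assumption: a point exactly on the hyperplane (f == 0) is assigned to +1.
    return 1 if f >= 0 else -1

# Hypothetical hyperplane x_1 = 1 in two dimensions.
beta0, beta = -1.0, np.array([1.0, 0.0])
print(classify(np.array([2.0, 1.0]), beta0, beta))
print(classify(np.array([0.0, 5.0]), beta0, beta))
```

Only the sign of \(f(x^*)\) matters for the predicted class; its magnitude indicates how far the observation lies from the hyperplane.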
The hyperplane is chosen as the solution to the optimization problem
\[\max_{\beta_{0}, \ldots, \beta_{p}, \epsilon_{1}, \ldots, \epsilon_{n}, M} M\]
\[\text{subject to } \sum_{j=1}^{p}\beta_{j}^2 = 1,\]
\[y_{i}(\beta_{0} + \beta_{1}X_{i1} + \beta_{2}X_{i2} + \cdots + \beta_{p}X_{ip}) \geq M(1 - \epsilon_{i}),\]
\[\epsilon_{i} \geq 0, \quad \sum_{i=1}^{n}\epsilon_{i} \leq C.\]
\(C\) is a nonnegative tuning parameter, typically chosen through cross-validation, and can be thought of as the budget for margin violation by the observations.
The \(\epsilon_{i}\) are slack variables that allow individual observations to be on the wrong side of the margin or the hyperplane. Each \(\epsilon_{i}\) indicates where the \(i^{\text{th}}\) observation lies relative to the margin and the hyperplane.
- If \(\epsilon_{i} = 0\), the observation is on the correct side of the margin.
- If \(\epsilon_{i} > 0\), the observation is on the wrong side of the margin.
- If \(\epsilon_{i} > 1\), the observation is on the wrong side of the hyperplane.
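The three cases above can be checked numerically. The sketch below assumes a fixed hyperplane and margin width \(M\) (in practice both come from solving the optimization problem) and computes the smallest slack that satisfies the constraints, \(\epsilon_{i} = \max(0, 1 - y_{i}f(x_{i})/M)\). The function names `slack` and `position` and the toy data are hypothetical.

```python
import numpy as np

def slack(X, y, beta0, beta, M):
    """Smallest eps_i >= 0 with y_i * f(x_i) >= M * (1 - eps_i)."""
    beta = np.asarray(beta, dtype=float)
    # Rescale so that sum(beta_j^2) = 1, as the constraint requires.
    norm = np.linalg.norm(beta)
    f = (beta0 + X @ beta) / norm
    return np.maximum(0.0, 1.0 - y * f / M)

def position(eps):
    """Translate a slack value into the three cases from the list above."""
    if eps == 0:
        return "correct side of the margin"
    if eps <= 1:
        return "wrong side of the margin"
    return "wrong side of the hyperplane"

# Toy data: hyperplane x_1 = 0 with margin width M = 1.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-0.5, 0.0], [-3.0, 1.0]])
y = np.array([1, 1, 1, -1])
eps = slack(X, y, beta0=0.0, beta=[1.0, 0.0], M=1.0)
for e in eps:
    print(e, position(e))
```

For this toy data the slacks are 0, 0.5, 1.5, and 0: the second observation is inside the margin but correctly classified, and the third is misclassified.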
Since \(C\) bounds the sum of the \(\epsilon_{i}\), it determines the number and severity of the margin violations that are tolerated. If \(C=0\), there is no budget for violations, so \(\epsilon_{1} = \cdots = \epsilon_{n} = 0\) and the problem reduces to the maximal margin classifier.
Note that if \(C>0\), no more than \(C\) observations can be on the wrong side of the hyperplane, since each such observation has \(\epsilon_{i} > 1\).
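The counting argument can be written out as a chain of inequalities. Here \(k\) is notation introduced for this illustration: the number of observations on the wrong side of the hyperplane. For \(k \geq 1\),
\[k < \sum_{i:\, \epsilon_{i} > 1}\epsilon_{i} \leq \sum_{i=1}^{n}\epsilon_{i} \leq C.\]
For example, with a budget of \(C = 2.5\), at most two observations can be misclassified, and with \(C = 2\) at most one can be, because two misclassified observations would already require a total slack strictly greater than 2.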