9.12 More than Two Classes

  • The concept of separating hyperplanes does not extend naturally to more than two classes, but there are some ways around this.
  • A one-versus-one approach constructs \(K \choose 2\) SVMs, where \(K\) is the number of classes. Each SVM compares one pair of classes.
  • For a given pair, the \(k^\text{th}\) class might be coded as +1 and the \((k')^\text{th}\) class as -1.
  • A test observation is classified by each of the \(K \choose 2\) classifiers, and the number of times it is assigned to each class is counted. It is then assigned to the class to which it was most often assigned in the pairwise classifications.
  • Another option is one-versus-all classification. This can be useful when there are many classes, because the number of classifiers grows linearly in \(K\) rather than quadratically.
  • \(K\) SVMs are fitted, each comparing one of the \(K\) classes to the remaining \(K-1\) classes.
  • \(\beta_{0k}, \beta_{1k}, \ldots, \beta_{pk}\) denote the parameters that result from fitting an SVM comparing the \(k^\text{th}\) class (coded as +1) to the other classes (coded as -1).
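  • Assign test observation \(x^*\) to the class \(k\) for which \(\beta_{0k} + \beta_{1k}x^*_{1} + \cdots + \beta_{pk}x^*_{p}\) is largest, since a large value indicates high confidence that the observation belongs to the \(k^\text{th}\) class rather than to any of the others.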
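
The two decision rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classifiers are taken to be linear and already fitted, the coefficient values are made up for the example, and the names `decision_value`, `one_versus_one`, `one_versus_all`, `pairwise`, and `per_class` are hypothetical helpers, not part of any library.

```python
from collections import Counter


def decision_value(beta, x):
    """Linear score beta_0 + beta_1*x_1 + ... + beta_p*x_p."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def one_versus_one(pairwise, x):
    """Majority vote over pairwise classifiers.

    pairwise maps (k, k_prime) -> beta, where class k was coded +1
    and class k_prime was coded -1 when that classifier was fitted.
    """
    votes = Counter()
    for (k, k_prime), beta in pairwise.items():
        winner = k if decision_value(beta, x) > 0 else k_prime
        votes[winner] += 1
    # Ties are broken by whichever class received its first vote earliest.
    return votes.most_common(1)[0][0]


def one_versus_all(per_class, x):
    """Pick the class whose one-versus-rest score is largest.

    per_class maps k -> beta for class k (+1) versus all others (-1).
    """
    return max(per_class, key=lambda k: decision_value(per_class[k], x))


# Made-up coefficients for K = 3 classes and p = 2 predictors.
pairwise = {
    ("a", "b"): [0.0, 1.0, -1.0],
    ("a", "c"): [0.0, 2.0, 1.0],
    ("b", "c"): [0.0, 1.0, 2.0],
}
per_class = {
    "a": [0.0, 1.0, 0.0],
    "b": [0.0, 0.0, 1.0],
    "c": [0.0, -1.0, -1.0],
}

x_star = [2.0, 0.5]
ovo_label = one_versus_one(pairwise, x_star)
ova_label = one_versus_all(per_class, x_star)
print(ovo_label, ova_label)
```

With \(K = 3\) the one-versus-one dictionary holds \({3 \choose 2} = 3\) classifiers and the one-versus-all dictionary holds \(K = 3\); for larger \(K\) the first grows much faster than the second. The two rules need not agree in general, although they do for this test point.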