9.8 Nonlinear Classification

  • Many decision boundaries are not linear.
  • With p original features, we could fit an SVC on 2p features by adding the square of each feature to the feature set:

$$
X_1, X_1^2, X_2, X_2^2, \ldots, X_p, X_p^2
$$

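$$
\max_{\beta_0, \beta_{11}, \beta_{12}, \ldots, \beta_{p1}, \beta_{p2}, \epsilon_1, \ldots, \epsilon_n, M} M
$$

subject to

$$
y_i\left(\beta_0 + \sum_{j=1}^{p} \beta_{j1} x_{ij} + \sum_{j=1}^{p} \beta_{j2} x_{ij}^2\right) \ge M(1 - \epsilon_i),
$$

$$
\epsilon_i \ge 0, \quad \sum_{i=1}^{n} \epsilon_i \le C, \quad \sum_{j=1}^{p} \sum_{k=1}^{2} \beta_{jk}^2 = 1
$$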
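As a rough illustration of the enlarged feature set, the sketch below maps each observation to its 2p quadratic features and evaluates a function that is linear in those features. The helper names `expand_quadratic` and `decision_value` are hypothetical, and the coefficients are hand-picked for a toy one-feature data set; they are an assumption for illustration, not the solution of the optimization problem.

```python
def expand_quadratic(x):
    """Map (x_1, ..., x_p) to (x_1, x_1^2, ..., x_p, x_p^2)."""
    expanded = []
    for value in x:
        expanded.extend([value, value ** 2])
    return expanded


def decision_value(x, beta0, beta):
    """Evaluate beta0 + sum_k beta_k * z_k on the 2p enlarged features z."""
    z = expand_quadratic(x)
    return beta0 + sum(b * z_k for b, z_k in zip(beta, z))


# Toy data with one feature (p = 1): class +1 lies far from zero,
# class -1 lies near zero, so no single threshold on x separates them.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1, 1, -1, -1, -1, 1, 1]

# Hand-picked coefficients (assumed, not fitted): f(x) = -2.25 + 0*x + 1*x^2
beta0, beta = -2.25, [0.0, 1.0]

# Every point is on the correct side when y_i * f(x_i) > 0.
all_correct = all(
    y * decision_value([x], beta0, beta) > 0 for x, y in zip(xs, ys)
)
print(all_correct)
```

In this toy case the boundary is the single hyperplane x^2 = 2.25 in the enlarged space (x, x^2), which corresponds to the two points x = -1.5 and x = 1.5 on the original axis.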

  • In the enlarged feature space (here, the original features plus their squares), the decision boundary is linear. In the original feature space, however, it has the form $q(x) = 0$, where $q$ is a quadratic polynomial, and its solutions are generally not linear.
  • One could also include interaction terms, higher-degree polynomials, and so on, but the number of features then grows quickly and the computations can become unmanageable.
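To get a feel for how quickly the feature space can grow, the sketch below compares the 2p features used above with the number of features obtained when all interaction terms and powers up to a given total degree are included. The function names are hypothetical; the count relies on the standard fact that there are C(p + d, d) - 1 non-constant monomials of total degree at most d in p variables.

```python
from math import comb


def n_squares_only(p):
    """Each feature plus its square: 2p features."""
    return 2 * p


def n_full_polynomial(p, degree):
    """All monomials of total degree 1..degree in p variables,
    including interaction terms (the constant term is excluded)."""
    return comb(p + degree, degree) - 1


# Compare the two feature counts for a few illustrative sizes.
for p in (2, 10, 100):
    for degree in (2, 3):
        print(p, degree, n_squares_only(p), n_full_polynomial(p, degree))
```

The squares-only count grows linearly in p, while the full polynomial count grows roughly like p^d for fixed degree d, which is the blow-up the last bullet warns about.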