9.8 Nonlinear Classification
- Many decision boundaries are not linear.
- We could fit a support vector classifier (SVC) using $2p$ features instead of $p$: each of the $p$ original features together with its square.
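$$X_1, X_1^2, X_2, X_2^2, \ldots, X_p, X_p^2$$

$$\underset{\beta_0, \beta_{11}, \beta_{12}, \ldots, \beta_{p1}, \beta_{p2}, \epsilon_1, \ldots, \epsilon_n, M}{\text{maximize}} \; M \quad \text{subject to} \quad y_i\left(\beta_0 + \sum_{j=1}^{p} \beta_{j1} x_{ij} + \sum_{j=1}^{p} \beta_{j2} x_{ij}^2\right) \ge M(1 - \epsilon_i),$$

$$\epsilon_i \ge 0, \quad \sum_{i=1}^{n} \epsilon_i \le C, \quad \sum_{j=1}^{p} \sum_{k=1}^{2} \beta_{jk}^2 = 1$$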
- Note that in the enlarged feature space (here, with the quadratic terms), the decision boundary is linear. In the original feature space, however, it has the form $q(x) = 0$, where $q$ is a quadratic polynomial (in this example), so the boundary is generally not linear.
- One could also include interaction terms, higher-degree polynomials, and so on, but the feature space then grows quickly and the computations can become unmanageable.
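
The two points above can be illustrated with a small sketch in plain Python. It is not a fitted SVC: the coefficients `beta0` and `beta` are hand-picked for a toy data set (points labelled by whether they fall outside the unit circle), and the helper names `quadratic_expand`, `linear_score`, and `n_poly_features` are hypothetical, introduced only for this illustration. The sketch shows that a rule that is linear in the enlarged features $(X_1, X_1^2, X_2, X_2^2)$ reproduces a circular boundary in the original features, and it counts how many features a full polynomial expansion of a given degree would need.

```python
import random
from math import comb


def quadratic_expand(x):
    """Map (x1, ..., xp) to (x1, x1^2, ..., xp, xp^2), i.e. 2p features."""
    out = []
    for v in x:
        out.extend([v, v * v])
    return out


def linear_score(beta0, beta, z):
    """Linear function beta0 + sum_j beta_j * z_j in the enlarged space."""
    return beta0 + sum(b * v for b, v in zip(beta, z))


def n_poly_features(p, d):
    """Number of monomials of degree 1..d in p variables (no intercept)."""
    return comb(p + d, d) - 1


# Toy data (an assumption for illustration): +1 outside the unit circle, -1 otherwise.
random.seed(0)
points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(200)]
labels = [1 if x1 * x1 + x2 * x2 > 1 else -1 for x1, x2 in points]

# Hand-picked (not fitted) coefficients for the features (x1, x1^2, x2, x2^2):
# f(x) = -1 + x1^2 + x2^2, which is linear in the enlarged features
# but quadratic in the original ones.
beta0, beta = -1.0, [0.0, 1.0, 0.0, 1.0]

predictions = [
    1 if linear_score(beta0, beta, quadratic_expand(x)) > 0 else -1
    for x in points
]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print("accuracy of the linear rule in the enlarged space:", accuracy)

# Growth of the feature space when interactions and higher degrees are included.
print(n_poly_features(10, 2))   # 65 features for p = 10, degree 2
print(n_poly_features(100, 3))  # 176850 features for p = 100, degree 3
```

With squares only, the enlarged space has $2p$ features; once all interaction terms up to degree $d$ are included, the count is $\binom{p+d}{d} - 1$, which is what makes explicit enlargement costly for even moderate $p$ and $d$.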