Regression or Classification
(figure: regression vs. classification; image source: Sharp Sight Labs)
Supervised or Unsupervised
(figure: supervised vs. unsupervised learning; image source: Ma Yan, et al.)
Loss function (ordinary least squares): the fitted coefficients minimize the residual sum of squares,

$$\hat{\vec{\beta}} = \operatorname*{argmin}_{\vec{\beta}} \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} \right)^2$$

L1 regularization (Lasso) adds a penalty on the absolute values of the coefficients, where $\lambda \geq 0$ controls the penalty's strength:

$$\hat{\vec{\beta}} = \operatorname*{argmin}_{\vec{\beta}} \left[ \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{k} \lvert \beta_j \rvert \right]$$

L2 regularization (Ridge) instead penalizes the squared coefficients:

$$\hat{\vec{\beta}} = \operatorname*{argmin}_{\vec{\beta}} \left[ \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{k} \beta_j^2 \right]$$
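As a sketch of how the three objectives behave in practice, the snippet below fits scikit-learn's LinearRegression, Lasso (L1), and Ridge (L2) to synthetic data; the data and the penalty weights (scikit-learn's `alpha` plays the role of $\lambda$) are illustrative assumptions, not values from the text.

```python
# Sketch: comparing an unregularized fit with L1 (Lasso) and L2 (Ridge)
# penalties. The synthetic data and alpha values are illustrative choices.
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge

rng = np.random.default_rng(0)
N, k = 100, 10
X = rng.normal(size=(N, k))
# Only the first three features matter; the remaining seven are noise.
true_beta = np.array([3.0, -2.0, 1.5] + [0.0] * (k - 3))
y = X @ true_beta + rng.normal(scale=0.5, size=N)

for name, model in [
    ("OLS  ", LinearRegression()),
    ("Lasso", Lasso(alpha=0.1)),   # L1: can drive coefficients to exactly zero
    ("Ridge", Ridge(alpha=1.0)),   # L2: shrinks coefficients toward zero
]:
    model.fit(X, y)
    print(name, np.round(model.coef_, 2))
```

With sparse true coefficients like these, the L1 fit typically zeroes out the noise features entirely, while the L2 fit only shrinks them toward zero.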
Given a loss function $L$, a data set, and a hypothesis model $h$, we can approximate the expected risk

$$\mathbb{E}\left[ L\left( (\vec{x}, \vec{y}), h \right) \right]$$

by taking the average loss over the $n$ training examples (the empirical risk):

$$\frac{1}{n} \sum_{i=1}^{n} L\left( (x_i, y_i), h \right)$$

(formulas outlined by Professor Alexander Jung)
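A minimal sketch of this approximation, assuming squared-error loss and a toy linear hypothesis (both illustrative choices, not from the text):

```python
# Sketch: empirical risk as the average loss over the training data.
# Squared-error loss and the hypothesis h are illustrative assumptions.
import numpy as np

def squared_loss(y_true, y_pred):
    return (y_true - y_pred) ** 2

def empirical_risk(X, y, h, loss=squared_loss):
    """Average loss of hypothesis h over the n training pairs (x_i, y_i)."""
    predictions = np.array([h(x) for x in X])
    return np.mean(loss(y, predictions))

# Toy example with the hypothesis h(x) = 2 * x[0]
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.1, 3.9, 6.2])
print(empirical_risk(X, y, lambda x: 2 * x[0]))
```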
Within a hypothesis class of similar modeling functions, model selection is governed by the bias-variance tradeoff: a model that is too simple underfits (high bias), while a model that is too flexible fits noise in the training data and generalizes poorly (high variance).
(figure: the bias-variance tradeoff; image source: Scott Fortmann-Roe)
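One way to see the tradeoff empirically is to vary model flexibility and compare training and test error. The sketch below does this with polynomial fits of increasing degree on synthetic data; the degrees, noise level, and data are illustrative assumptions.

```python
# Sketch: the bias-variance tradeoff via polynomial degree.
# A low degree underfits (high bias); a high degree overfits (high variance).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in [1, 4, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

Typically the degree-1 fit shows high error on both splits (bias), while the degree-15 fit shows low training error but a larger gap to test error (variance).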
“For more complex and high-dimensional problems with potential nonlinear dependencies between features, it’s often useful to ask: