Ridge Regression
Ridge regression is very similar to least squares, except that the coefcients are estimated by minimizing a slightly diferent quantity
ˆβOLS≡argminˆβ(RSS)
ˆβR≡argminˆβ(RSS+λ∑pk=1β2k)
λ tuning parameter (hyperparameter) for the shrinkage penalty
there’s one model parameter λ doesn’t shrink
- (^β0)