5.8 Bias-Variance Tradeoff and k-fold Cross-Validation

• As mentioned previously, the validation set approach tends to overestimate the true test error because the model is fit on only part of the data, but there is low variance in the estimate since we just have one estimate of the test error.

• Conversely, the LOOCV method has little bias, since almost all observations are used to create the models.

• However, LOOCV doesn’t shake up the data enough: each pair of training sets differs by only one observation, so the estimates from the CV models are highly correlated, and thus their mean can have high variance.

• A better choice is k-fold CV with $$k = 5$$ or $$k = 10$$.

• These values of $$k$$ are often used in modeling because they have been empirically demonstrated to yield test error estimates that suffer from neither too much bias nor too much variance.
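
To make the comparison concrete, here is a minimal sketch of k-fold CV in plain Python, run with $$k = 5$$, $$k = 10$$, and $$k = n$$ (LOOCV). Everything in it is an assumption for illustration: the synthetic data, the simple linear model, and the helper names `k_fold_indices`, `fit_line`, and `cv_mse` are hypothetical and do not come from the text.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cv_mse(xs, ys, k, seed=0):
    """k-fold CV estimate of the test MSE: the average of the k held-out fold MSEs."""
    fold_mses = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        # Fit on the other k - 1 folds, then score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in fold]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k


# Synthetic data (an assumption for illustration): y = 1 + 2x + Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [1 + 2 * x + rng.gauss(0, 1) for x in xs]

# k = len(xs) puts one observation in each fold, which is LOOCV.
results = {k: cv_mse(xs, ys, k) for k in (5, 10, len(xs))}
for k, mse in results.items():
    print(f"k={k:3d}  CV estimate of test MSE: {mse:.3f}")
```

Rerunning `cv_mse` with different `seed` values, or on freshly drawn data, is one way to see how much each estimate moves around. A single run like this one shows only the point estimates, not their variance.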