5.6 Graphical Illustration of k-fold Approach

Figure 5.3: The data set is split into five folds so that every observation appears in exactly one validation set. The model is estimated five times, each time on 80% of the data; predictions are made for the remaining 20%, and the five test MSEs are averaged.

  • Thus, LOOCV is a special case of k-fold cross-validation, where \(k=n\).
  • The CV statistic is the average of the \(k\) test MSEs, where \(MSE_{i}\) is the test MSE computed on the \(i\)-th held-out fold:

\[CV_{k} = \frac{1}{k}{\sum_{i=1}^{k}}MSE_{i}\]
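The procedure in the figure and the formula for \(CV_{k}\) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `fit_simple_ols` and `k_fold_cv`, the simple one-predictor linear model, the random fold assignment, and the synthetic data are all assumptions made for the example. It uses only the standard library.

```python
import random


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def k_fold_cv(xs, ys, k=5, seed=0):
    """Return (CV_k, list of per-fold test MSEs) for the simple linear model."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    # Fold i takes every k-th shuffled index, so each observation lands in
    # exactly one validation set and fold sizes differ by at most one.
    folds = [idx[i::k] for i in range(k)]
    fold_mses = []
    for held_out in folds:
        held = set(held_out)
        train = [j for j in idx if j not in held]
        # Estimate the model on the other k - 1 folds only.
        b0, b1 = fit_simple_ols([xs[j] for j in train], [ys[j] for j in train])
        # Test MSE on the held-out fold.
        sq_errs = [(ys[j] - (b0 + b1 * xs[j])) ** 2 for j in held_out]
        fold_mses.append(sum(sq_errs) / len(sq_errs))
    # CV_k = (1 / k) * sum of the k fold MSEs.
    return sum(fold_mses) / k, fold_mses


# Synthetic data (assumed for illustration): y = 2 + 3x + noise.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

cv5, mses = k_fold_cv(xs, ys, k=5)            # five fits, each on 80% of the data
cv_loo, loo_mses = k_fold_cv(xs, ys, k=len(xs))  # k = n gives LOOCV
print("5-fold CV:", cv5)
print("LOOCV:", cv_loo)
```

Setting `k=len(xs)` puts a single observation in each fold, which is the LOOCV special case noted above; each per-fold MSE is then just one squared prediction error.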