8.17 Out-of-bag error estimation

  • But how do we estimate the test error of a bagged model?

  • It’s pretty straightforward:

    1. Each bagged tree is fit to a bootstrap sample drawn with replacement, so the probability that a given observation appears in a sample is \(1 - (1 - 1/n)^n \approx 1 - e^{-1} \approx 2/3\); on average, each bagged tree therefore uses about 2/3 of the observations

    2. The remaining one-third of observations not used to fit a given bagged tree are called the out-of-bag (OOB) observations for that tree

    3. We can predict the response for the \(i\)th observation using each of the trees for which that observation was OOB. This yields around \(B/3\) predictions for the \(i\)th observation, which we then average

    4. The resulting OOB error is essentially the leave-one-out (LOO) cross-validation error for bagging, provided \(B\) is sufficiently large
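
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration of the OOB bookkeeping only: the "base learner" here is just the bootstrap-sample mean standing in for a regression tree, a hypothetical simplification so the sketch stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy response vector; in practice y would come with predictors
# and the base learner would be a regression tree.
n, B = 200, 500
y = rng.normal(size=n)

oob_sum = np.zeros(n)    # running sum of OOB predictions for each observation
oob_count = np.zeros(n)  # number of trees for which observation i was OOB

for b in range(B):
    idx = rng.integers(0, n, size=n)   # bootstrap sample, drawn with replacement
    in_bag = np.zeros(n, dtype=bool)
    in_bag[idx] = True
    pred = y[idx].mean()               # "fit" the stand-in base learner
    oob_sum[~in_bag] += pred           # predict only for the OOB observations
    oob_count[~in_bag] += 1

oob_pred = oob_sum / oob_count             # average the ~B/3 predictions each
oob_mse = np.mean((y - oob_pred) ** 2)     # OOB estimate of the test MSE

# Each observation is OOB in about B * (1 - 1/n)^n trees,
# i.e. roughly B/e, close to the "1/3 of the trees" in the text.
print(oob_count.mean() / B)
```

Note that no separate validation set is needed: every prediction used in `oob_mse` comes from trees that never saw that observation during fitting, which is why the estimate behaves like cross-validation.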