8.17 Out-of-bag error estimation
But how do we estimate the test error of a bagged model?
It’s pretty straightforward:
Because each bagged tree is fit to a bootstrap sample drawn with replacement, on average each tree uses only about 2/3 of the observations (the probability that a given observation appears in a bootstrap sample of size \(n\) is \(1 - (1 - 1/n)^n \approx 1 - e^{-1} \approx 0.632\))
The remaining ~1/3 of observations not used to fit a given bagged tree are called that tree's out-of-bag (OOB) observations
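The ~2/3 in-bag fraction can be checked empirically. The sketch below (a small simulation, not from the text; the sample size `n` and replicate count `B` are arbitrary choices) draws bootstrap samples and measures the fraction of distinct observations each one contains:

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 1000, 500

# For each bootstrap sample, count the fraction of distinct
# observations that appear at least once ("in-bag" observations)
in_bag_fraction = np.mean([
    len(np.unique(rng.integers(0, n, size=n))) / n
    for _ in range(B)
])
print(round(in_bag_fraction, 3))  # close to 1 - 1/e ≈ 0.632
```

The simulated fraction matches the theoretical value \(1 - (1 - 1/n)^n\), which converges to \(1 - e^{-1} \approx 0.632\) as \(n\) grows.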
We can predict the response for the \(i\)th observation using each of the trees for which that observation was OOB. This yields around \(B/3\) predictions for the \(i\)th observation, which we then average to obtain a single OOB prediction
When \(B\) is sufficiently large, the resulting OOB error estimate is essentially equivalent to the leave-one-out (LOO) cross-validation error for bagging, obtained without any extra model fitting
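The procedure above can be sketched directly. This is a minimal illustration, not the book's code: it assumes a regression setting, uses scikit-learn's `DecisionTreeRegressor` as the base learner, and a synthetic dataset from `make_regression`:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=1)
n, B = len(y), 100

oob_sum = np.zeros(n)    # running sum of OOB predictions per observation
oob_count = np.zeros(n)  # number of trees for which each observation was OOB

for _ in range(B):
    idx = rng.integers(0, n, size=n)       # bootstrap sample, with replacement
    oob = np.setdiff1d(np.arange(n), idx)  # observations left out of this sample
    tree = DecisionTreeRegressor().fit(X[idx], y[idx])
    oob_sum[oob] += tree.predict(X[oob])
    oob_count[oob] += 1

# Average the ~B/3 OOB predictions for each observation, then compute the MSE
oob_pred = oob_sum / oob_count
oob_mse = np.mean((y - oob_pred) ** 2)
print(f"OOB MSE estimate: {oob_mse:.1f}")
```

For classification one would take a majority vote over the OOB predictions instead of averaging. (In practice, scikit-learn's bagging and random forest estimators expose this directly via `oob_score=True`.)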