8.16 Bagging (continued)
Cue the bootstrap: take repeated samples, with replacement, from the single training set
Generate \(B\) different bootstrapped training data sets
Then train our method on the \(b\)th bootstrapped training set to get \(\hat{f}^{*b}(x)\), the prediction at a point \(x\)
Average all the predictions to obtain \[\hat{f}_{bag}(x) = \frac{1}{B}\sum_{b=1}^B\hat{f}^{*b}(x)\]
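A minimal sketch of this regression procedure, assuming scikit-learn's `DecisionTreeRegressor` as the base learner; the function name `bagged_predict` and the NumPy-based bootstrap loop are illustrative choices, not from the text:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_predict(X_train, y_train, X_test, B=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = np.zeros((B, len(X_test)))
    for b in range(B):
        # Bootstrap: draw n observations with replacement
        idx = rng.integers(0, n, size=n)
        tree = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
        preds[b] = tree.predict(X_test)  # \hat{f}^{*b}(x)
    # Average the B predictions to obtain \hat{f}_{bag}(x)
    return preds.mean(axis=0)
```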
In the case of classification trees:
for each test observation:
record the class predicted by each of the \(B\) trees
take a majority vote: the overall prediction is the most commonly occurring class among the \(B\) predictions
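A sketch of the majority-vote rule for classification, under the same assumptions as above (scikit-learn trees as base learners; `bagged_classify` is an illustrative name):

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def bagged_classify(X_train, y_train, X_test, B=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X_train)
    votes = np.empty((B, len(X_test)), dtype=y_train.dtype)
    for b in range(B):
        idx = rng.integers(0, n, size=n)       # bootstrap sample
        tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
        votes[b] = tree.predict(X_test)        # class predicted by tree b
    # Majority vote: the most commonly occurring class among the B predictions
    return np.array([Counter(votes[:, i]).most_common(1)[0][0]
                     for i in range(votes.shape[1])])
```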
NOTE: the number of trees \(B\) is not a critical tuning parameter with bagging; a large \(B\) will not lead to overfitting. In practice, \(B\) is chosen large enough that the error has settled down
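A small experiment illustrating this note, on an assumed synthetic dataset from `make_moons` (the dataset and parameter values are arbitrary choices for illustration): test accuracy levels off as \(B\) grows rather than degrading.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for B in [1, 10, 50, 100, 500]:
    clf = BaggingClassifier(DecisionTreeClassifier(),
                            n_estimators=B, random_state=0).fit(X_tr, y_tr)
    # Accuracy stabilizes for large B; no overfitting from adding more trees
    print(B, clf.score(X_te, y_te))
```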