8.16 Bagging (continued)

  • Cue the bootstrap, i.e., take repeated samples from the (single) training data set

  • Generate \(B\) different bootstrapped training data sets

  • Then train our method on the \(b\)th bootstrapped training set to get \(\hat{f}^{*b}(x)\), the prediction at a point \(x\)

  • Average all the predictions to obtain \[\hat{f}_{bag}(x) = \frac{1}{B}\sum_{b=1}^B\hat{f}^{*b}(x)\] (see the code sketch after this list)

  • In the case of classification trees:

    • for each test observation:

      • record the class predicted by each of the \(B\) trees

      • take a majority vote: the overall prediction is the most commonly occurring class among the \(B\) predictions
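The steps above map directly onto code. Below is a minimal sketch of both variants, assuming scikit-learn decision trees as the base learner and NumPy for the bootstrap resampling; the function names `bag_regression` and `bag_classification` are hypothetical, not from the source.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier

rng = np.random.default_rng(0)

def bag_regression(X, y, X_test, B=100):
    """Average the B bootstrap-tree predictions: f_bag(x) = (1/B) * sum_b f*b(x)."""
    n = len(y)
    preds = np.zeros((B, len(X_test)))
    for b in range(B):
        idx = rng.integers(0, n, size=n)        # bootstrap sample: n draws with replacement
        tree = DecisionTreeRegressor().fit(X[idx], y[idx])
        preds[b] = tree.predict(X_test)         # \hat{f}^{*b}(x) at each test point
    return preds.mean(axis=0)                   # \hat{f}_{bag}(x)

def bag_classification(X, y, X_test, B=100):
    """Majority vote: the most commonly occurring class among the B trees' predictions."""
    n = len(y)
    votes = np.empty((B, len(X_test)), dtype=np.asarray(y).dtype)
    for b in range(B):
        idx = rng.integers(0, n, size=n)        # fresh bootstrap sample for each tree
        tree = DecisionTreeClassifier().fit(X[idx], y[idx])
        votes[b] = tree.predict(X_test)         # record the class predicted by tree b
    # for each test observation, take the most common class among the B votes
    majority = []
    for col in votes.T:
        classes, counts = np.unique(col, return_counts=True)
        majority.append(classes[counts.argmax()])
    return np.array(majority)
```

Because each tree is fit to a different bootstrap sample, the averaged (or voted) predictor has lower variance than any single deep tree.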

NOTE: the number of trees \(B\) is not a critical tuning parameter with bagging; using a very large \(B\) will not lead to overfitting
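One way to see why a large \(B\) is safe is to grow \(B\) and watch the test error settle rather than rise. Continuing the sketch above, here is a hypothetical check on a simulated dataset from scikit-learn (the data and split are illustrative assumptions, not from the source):

```python
from sklearn.datasets import make_friedman1
from sklearn.model_selection import train_test_split

# simulated regression data, purely for illustration
X, y = make_friedman1(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for B in (1, 10, 50, 200):
    mse = np.mean((bag_regression(X_tr, y_tr, X_te, B=B) - y_te) ** 2)
    print(f"B={B:4d}  test MSE={mse:.3f}")
# Expect the test MSE to drop sharply at first and then flatten as B grows,
# rather than climbing back up: extra trees average away variance but do
# not add model complexity.
```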