8.22 Boosting

  • Yet another approach for improving the prediction accuracy of a decision tree

  • Can also be applied to many statistical learning methods for regression or classification

  • Recall that in bagging, each tree is built on a separate bootstrap sample of the training data

  • In boosting, each tree is grown sequentially using information from previously grown trees:

    • given the current model, we fit a decision tree using the current residuals (rather than the outcome Y) as the response

    • we then add this new decision tree to the fitted function (model) and update the residuals accordingly

    • Why? This way each new tree is built on information that the previous trees were unable to ‘catch’
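The sequential fit-to-residuals loop above can be sketched in a few lines. This is a minimal illustration, not the full algorithm from the text: it assumes scikit-learn's `DecisionTreeRegressor` as the base learner, and the tree depth, number of trees, and shrinkage parameter `learning_rate` are illustrative choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_trees(X, y, n_trees=200, max_depth=1, learning_rate=0.1):
    """Grow trees sequentially, each fit to the current residuals."""
    residuals = y.astype(float).copy()  # start from f(x) = 0, so r = y
    trees = []
    for _ in range(n_trees):
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                        # fit tree to residuals, not to y
        residuals -= learning_rate * tree.predict(X)  # update residuals with the new tree
        trees.append(tree)
    return trees

def boost_predict(trees, X, learning_rate=0.1):
    """Sum the (shrunken) contributions of all trees."""
    return learning_rate * sum(t.predict(X) for t in trees)
```

On a simple nonlinear signal, the boosted ensemble of depth-1 trees fits the training data far better than any single depth-1 tree could, because each tree corrects what the previous ones missed.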