8.20 Random forests: advantages over bagging

  • Random forests overcome this problem by forcing each split to consider only a random subset of the predictors, with a fresh sample of \(m\) predictors (typically \(m \approx \sqrt{p}\)) drawn at each split

  • Thus at each split, the algorithm is NOT ALLOWED to consider a majority of the available predictors: on average, a fraction \((p - m)/p\) of the splits will not even consider the strong predictor, giving the other predictors a chance

  • This decorrelates the trees and makes the average of the resulting trees less variable (more reliable)

  • The only difference between bagging and random forests is the choice of the predictor subset size \(m\) at each split: a random forest built with \(m = p\) is just bagging (see the sketch after this list)

  • For both, we build a number of decision trees on bootstrapped training samples
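
A minimal sketch of this comparison, not taken from the text: it uses scikit-learn's RandomForestClassifier on synthetic data (all dataset sizes and parameters below are illustrative assumptions). Setting max_features="sqrt" gives the random-forest rule \(m \approx \sqrt{p}\), while max_features=None lets every split consider all \(p\) predictors, which reduces the procedure to bagging; both fit trees on bootstrapped training samples.

```python
# Sketch: random forest (m ≈ sqrt(p)) vs. bagging (m = p) on synthetic data.
# Parameter choices here are illustrative assumptions, not from the text.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic data: p = 100 predictors, only a few of them informative.
X, y = make_classification(n_samples=2000, n_features=100,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Random forest: each split considers only m ≈ sqrt(p) randomly chosen predictors.
rf = RandomForestClassifier(n_estimators=500, max_features="sqrt",
                            random_state=0)

# Bagging: same tree-growing procedure, but every split may consider
# all p predictors (m = p), obtained by setting max_features=None.
bag = RandomForestClassifier(n_estimators=500, max_features=None,
                             random_state=0)

# Both models grow trees on bootstrapped samples (bootstrap=True is the default).
for name, model in [("random forest (m = sqrt(p))", rf),
                    ("bagging (m = p)", bag)]:
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {acc:.3f}")
```

Because only a few predictors are informative, the decorrelated trees of the random forest typically average to a lower-variance ensemble than the bagged trees, though the exact numbers will vary with the random seed.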