8.15 Bagging
Also known as bootstrap aggregation is a general-purpose procedure for reducing the variance of a statistical learning method
It’s useful and frequently used in the context of decision trees
Recall that given a set of n independent observations Z1,...,Zn, each with variance σ2, the variance of the mean ˉZ of the observations is given by σ2/n
So, averaging a set of observations reduces variance
But, this is not practical because we generally do not have access to multiple training sets!
What to do?!