8.15 Bagging
Also known as bootstrap aggregation is a general-purpose procedure for reducing the variance of a statistical learning method
It’s useful and frequently used in the context of decision trees
Recall that given a set of \(n\) independent observations \(Z_1,..., Z_n\), each with variance \(\sigma^2\), the variance of the mean \(\bar{Z}\) of the observations is given by \(\sigma^2/n\)
So, averaging a set of observations reduces variance
But, this is not practical because we generally do not have access to multiple training sets!
What to do?!