8.30 BART: additional details
A key element of BART is that we do NOT fit a fresh tree to the current partial residual: instead, we try to improve the fit to the current partial residual by slightly modifying the tree obtained in the previous iteration (Step 3(a)ii).
This guards against overfitting, since it limits how “hard” we fit the data in each iteration.
Additionally, the individual trees are typically quite small, which also helps to avoid overfitting.
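The idea of perturbing the previous tree, rather than refitting from scratch, can be sketched in a toy setting. The code below is only an illustration under strong assumptions and is not the actual BART sampler: each "tree" is a single-split stump on one predictor, the noise level `sigma` is fixed, the prior over trees is flat, and a perturbation is a small random nudge to the threshold or to one leaf value (real BART also grows and prunes branches). All names (`toy_bart`, `perturb`, `stump_predict`) are hypothetical.

```python
import numpy as np

def stump_predict(stump, x):
    """Evaluate a one-split tree given as (threshold, left_value, right_value)."""
    thr, left, right = stump
    return np.where(x < thr, left, right)

def perturb(stump, rng, step=0.1):
    """Propose a small symmetric random change to one component of the stump."""
    proposal = list(stump)
    i = rng.integers(3)  # 0: threshold, 1: left leaf, 2: right leaf
    proposal[i] += rng.normal(scale=step)
    return tuple(proposal)

def toy_bart(x, y, n_trees=10, n_iter=200, burn_in=50, sigma=0.5, seed=0):
    rng = np.random.default_rng(seed)
    # Every tree starts as a constant predicting mean(y) / n_trees.
    init = y.mean() / n_trees
    stumps = [(rng.uniform(x.min(), x.max()), init, init) for _ in range(n_trees)]
    preds = np.array([stump_predict(s, x) for s in stumps])  # (n_trees, n)
    kept = []
    for b in range(n_iter):
        for k in range(n_trees):
            # Partial residual: response minus the fit of all OTHER trees.
            r = y - (preds.sum(axis=0) - preds[k])
            # Slightly modify the previous tree instead of fitting a new one.
            proposal = perturb(stumps[k], rng)
            new_pred = stump_predict(proposal, x)
            sse_old = np.sum((r - preds[k]) ** 2)
            sse_new = np.sum((r - new_pred) ** 2)
            # Metropolis accept/reject (assumed: Gaussian noise, flat prior):
            # moves that improve the fit to r are always accepted.
            log_ratio = -(sse_new - sse_old) / (2 * sigma ** 2)
            if np.log(rng.uniform()) < log_ratio:
                stumps[k] = proposal
                preds[k] = new_pred
        if b >= burn_in:
            kept.append(preds.sum(axis=0))
    # Final prediction: average of the ensemble fits after burn-in.
    return np.mean(kept, axis=0)

# Simulated step-function data, for illustration only.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=200)
y = np.where(x < 0.5, 0.0, 2.0) + rng.normal(scale=0.3, size=200)
fit = toy_bart(x, y)
initial_mse = np.mean((y - y.mean()) ** 2)
final_mse = np.mean((y - fit) ** 2)
```

Because each proposal changes only one number by a small amount, no single iteration can fit the partial residual very "hard". The fit improves gradually over many iterations.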
BART, as the name suggests, can be viewed as a Bayesian approach to fitting an ensemble of trees:
- each time we randomly perturb a tree in order to fit the residuals, we are in fact drawing a new tree from a posterior distribution
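In symbols, the posterior view can be written as follows. The notation is an assumption made here for illustration: $T_k$ denotes the $k$th tree together with its terminal-node predictions, $r_k$ the current partial residual for that tree, and $\sigma^2$ the error variance.

\[
p(T_k \mid r_k, \sigma^2) \;\propto\; p(r_k \mid T_k, \sigma^2)\, p(T_k)
\]

The likelihood term rewards perturbations that improve the fit to the partial residual, while the prior $p(T_k)$ can be chosen to favor small trees.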