8.28 BART algorithm: iteration 2 and on
In subsequent iterations, BART updates each of the \(K\) trees one at a time
In the \(b\)th iteration, to update the \(k\)th tree, we subtract from each response value the predictions of all but the \(k\)th tree, obtaining a partial residual:
\(r_i = y_i - \sum_{k'<k}\hat{f}^b_{k'}(x_i) - \sum_{k'>k}\hat{f}^{b-1}_{k'}(x_i)\)
for the \(i\)th observation, \(i = 1, …, n\)
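For concreteness, a minimal sketch of the partial-residual computation, assuming a NumPy array `tree_preds` of shape \((K, n)\) whose rows hold each tree's current predictions for the \(n\) training points (rows \(k' < k\) already updated in iteration \(b\), rows \(k' > k\) still from iteration \(b-1\)):

```python
import numpy as np

def partial_residual(y, tree_preds, k):
    """Partial residual for tree k: y minus the predictions of all the other trees."""
    others = np.delete(tree_preds, k, axis=0)   # drop the k-th tree's own contribution
    return y - others.sum(axis=0)               # r_i = y_i - sum over k' != k of f_{k'}(x_i)
```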
Rather than fitting a fresh tree to this partial residual, BART chooses a perturbation of the tree from the previous iteration, \(\hat{f}^{b-1}_{k}\), favoring perturbations that improve the fit to the partial residual
To perturb trees:
change the structure of the tree by adding/pruning branches
change the prediction in each terminal node of the tree
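These two kinds of moves can be illustrated with a deliberately tiny example: treat a tree as a single-split stump on one feature, so "changing the structure" means moving the split point and "changing the predictions" means refreshing the two leaf means. This is only a toy sketch, not BART itself (real BART proposes grow/prune/change moves on full trees and accepts them with Metropolis-Hastings probabilities); `perturb_stump`, `sse`, and the threshold representation are illustrative inventions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sse(x, r, t):
    """Sum of squared errors of a single split at threshold t against the partial residual r."""
    left, right = r[x <= t], r[x > t]
    if left.size == 0 or right.size == 0:        # disallow empty terminal nodes
        return np.inf
    return ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()

def perturb_stump(threshold, x, r):
    """One perturbation step: a structural move, then refreshed terminal-node predictions."""
    proposal = threshold + rng.normal(scale=0.1)               # structural move: shift the split point
    new_t = proposal if sse(x, r, proposal) < sse(x, r, threshold) else threshold
    leaf_values = (r[x <= new_t].mean(), r[x > new_t].mean())  # update each terminal node's prediction
    return new_t, leaf_values
```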
The output of BART is a collection of prediction models:
\(\hat{f}^b(x) = \sum_{k=1}^{K}\hat{f}^b_k(x)\)
for \(b = 1, 2,…, B\)
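A minimal sketch of assembling these per-iteration models, assuming a hypothetical list `tree_preds_by_iter` whose \(b\)th element is a \((K, n_{\text{test}})\) array of the \(K\) trees' predictions from iteration \(b\):

```python
import numpy as np

def bart_models(tree_preds_by_iter):
    """Row b of the result is the iteration-b model: the sum of that iteration's K trees."""
    return np.stack([preds.sum(axis=0) for preds in tree_preds_by_iter])
```

Typically, the models from an initial burn-in period are discarded and the remaining ones are averaged to form a single prediction.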