8.27 Now, the BART algorithm
In the first iteration of the BART algorithm, all \(K\) trees are initialized to have 1 root node, with \(\hat{f}^1_k(x) = \frac{1}{nK}\sum_{i=1}^{n}y_i\)
- i.e., the mean of the response values divided by the total number of trees
Thus, for the first iteration (\(b = 1\)), the prediction for all \(K\) trees is just the mean of the response
\(\hat{f}^1(x) = \sum_{k=1}^K\hat{f}^1_k(x) = \sum_{k=1}^K\frac{1}{nK}\sum_{i=1}^{n}y_i = \frac{1}{n}\sum_{i=1}^{n}y_i\)