8.5 Tree-building process (regression) | Introduction to Statistical Learning Using R Book Club


8.5 Tree-building process (regression)

Divide the predictor space, that is, the set of possible values for $X_1, X_2, \dots, X_p$, into $J$ distinct and non-overlapping regions, $R_1, R_2, \dots, R_J$.

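In theory, the regions can have any shape; they don't have to be boxes. In practice, the predictor space is divided into high-dimensional rectangles (boxes) for simplicity and ease of interpretation.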

For every observation that falls into the region $R_j$, we make the same prediction: the mean of the response values of the training observations in $R_j$.
The goal is to find regions (here boxes) $R_1, \dots, R_J$ that minimize the RSS, given by

$\mathrm{RSS} = \sum_{j=1}^{J} \sum_{i \in R_j} (y_i - \hat{y}_{R_j})^2$

where $\hat{y}_{R_j}$ is the mean response for the training observations within the $j$th box.
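
To make the RSS formula concrete, here is a minimal Python sketch that evaluates it for one fixed partition into boxes. Everything in it is an illustrative assumption rather than part of the book's code: the helper names (`assign_region`, `region_means`, `rss`), the representation of a box as one `(low, high)` interval per predictor, and the four-point toy data set.

```python
from statistics import mean

INF = float("inf")


def assign_region(x, boxes):
    """Return the index j of the box R_j that contains the point x.

    Each box is a list of (low, high) intervals, one per predictor.
    A point is inside a box if low <= x_k < high for every predictor k.
    """
    for j, box in enumerate(boxes):
        if all(lo <= xk < hi for xk, (lo, hi) in zip(x, box)):
            return j
    raise ValueError(f"point {x} is not covered by any box")


def region_means(X, y, boxes):
    """Mean response of the training observations in each non-empty box."""
    groups = {j: [] for j in range(len(boxes))}
    for xi, yi in zip(X, y):
        groups[assign_region(xi, boxes)].append(yi)
    return {j: mean(vals) for j, vals in groups.items() if vals}


def rss(X, y, boxes):
    """Sum over boxes of squared deviations from the box mean."""
    means = region_means(X, y, boxes)
    return sum((yi - means[assign_region(xi, boxes)]) ** 2
               for xi, yi in zip(X, y))


# Hypothetical toy data: two predictors, four training observations.
X = [(1, 1), (2, 1), (6, 2), (7, 3)]
y = [1.0, 3.0, 10.0, 14.0]

# One candidate partition with J = 2 boxes: X_1 < 5 and X_1 >= 5.
boxes = [
    [(-INF, 5), (-INF, INF)],
    [(5, INF), (-INF, INF)],
]

toy_means = region_means(X, y, boxes)
toy_rss = rss(X, y, boxes)
```

With this partition the two box means are 2.0 and 12.0, so every observation in the first box is predicted as 2.0 and every observation in the second as 12.0. The RSS is $(1-2)^2 + (3-2)^2 + (10-12)^2 + (14-12)^2 = 10$.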

Unfortunately, it is computationally infeasible to consider every possible partition of the feature space into $J$ boxes.
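
A rough sense of the scale can be had from the simplest case. The sketch below assumes a single predictor with $n$ distinct observed values and counts only partitions that differ on the training data: cutting the sorted values into $J$ non-empty intervals means choosing $J-1$ of the $n-1$ gaps between neighbouring values. The function names are hypothetical, and the count is an illustration for this one-dimensional case only; with $p$ predictors the number of candidate partitions is larger still.

```python
from itertools import combinations
from math import comb


def count_interval_partitions(n, J):
    """Ways to cut n sorted distinct values into J non-empty intervals."""
    return comb(n - 1, J - 1)


def enumerate_interval_partitions(n, J):
    """Brute-force list of the same partitions, as tuples of cut positions.

    A cut at position c separates the first c sorted values from the rest.
    """
    return list(combinations(range(1, n), J - 1))


# The closed form agrees with brute force on a small case ...
small_count = count_interval_partitions(6, 3)
small_enumerated = len(enumerate_interval_partitions(6, 3))

# ... and grows quickly: 100 observations, 5 intervals, one predictor.
larger_count = count_interval_partitions(100, 5)
```

For $n = 6$ and $J = 3$ both approaches give 10 partitions. For $n = 100$ and $J = 5$ the count is already 3,764,376, and evaluating the RSS for each one would be needed just to handle a single predictor.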