8.11 Classification trees

  • A classification tree is very similar to a regression tree, except that it predicts a qualitative (rather than quantitative) response

  • We predict that each observation belongs to the most commonly occurring class of training observations in the region to which it belongs

  • In the classification setting, RSS cannot be used as a criterion for making the binary splits

  • A natural alternative to RSS is the classification error rate, i.e., the fraction of the training observations in that region that do not belong to the most common class:

E = 1 - \max_k (\hat{p}_{mk})

where \hat{p}_{mk} is the proportion of training observations in the mth region that are from the kth class

  • However, this error rate is not sufficiently sensitive for growing a tree: a split can make the resulting regions noticeably purer while leaving E unchanged
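
The points above can be illustrated with a minimal Python sketch. It is not taken from these notes: the function names (predict_class, classification_error, weighted_error) and the toy class counts are hypothetical, and weighting each region's E by its share of the observations is an assumption about how a split would be scored. Two candidate splits of the same parent region are compared; the second produces a completely pure region, yet both splits end up with roughly the same overall error.

```python
from collections import Counter


def predict_class(labels):
    """Return the most commonly occurring class among a region's training labels."""
    return Counter(labels).most_common(1)[0][0]


def classification_error(labels):
    """E = 1 - max_k(p_mk), where p_mk is the proportion of class k in the region."""
    counts = Counter(labels)
    return 1.0 - max(counts.values()) / len(labels)


def weighted_error(regions):
    """Size-weighted average of E over regions (an assumed way to score a split)."""
    n_total = sum(len(r) for r in regions)
    return sum(len(r) / n_total * classification_error(r) for r in regions)


# Hypothetical parent region: 40 observations of class A and 40 of class B.
parent = ["A"] * 40 + ["B"] * 40

# Candidate split 1: two mixed regions.
split_1 = [["A"] * 30 + ["B"] * 10, ["A"] * 10 + ["B"] * 30]

# Candidate split 2: one mixed region and one pure region.
split_2 = [["A"] * 20 + ["B"] * 40, ["A"] * 20]

print("parent E :", classification_error(parent))
print("split 1 E:", weighted_error(split_1))
print("split 2 E:", weighted_error(split_2))
print("predictions for split 2:", [predict_class(r) for r in split_2])
```

Both splits reduce the error from 0.5 to about 0.25, so E alone cannot distinguish them, even though the second split isolates a region containing only class A.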