9.7 - Final thoughts

Decision trees have a number of advantages:

  • They require very little pre-processing.

  • They can easily handle categorical features without pre-processing.

  • They can handle missing values, either by creating a new “missing” class for categorical variables or by using surrogate splits (see Therneau, Atkinson, and others (1997) for details).
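To make the “missing” class idea concrete, here is a minimal sketch in plain Python. The feature name `garage_type`, its values, and the helper `add_missing_level` are hypothetical and chosen only for illustration; the point is that missing entries become one more category that a tree can split on.

```python
def add_missing_level(values, label="missing"):
    """Replace None entries in a categorical column with an explicit level."""
    return [label if v is None else v for v in values]


# Hypothetical categorical feature with two missing entries.
garage_type = ["attached", None, "detached", "attached", None]

encoded = add_missing_level(garage_type)
levels = sorted(set(encoded))

print(encoded)
print(levels)
```

After this recoding, "missing" is treated like any other level, so no observations need to be dropped or imputed before fitting. Surrogate splits are a separate mechanism and are not shown here.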

However, individual decision trees generally do not achieve state-of-the-art predictive accuracy.

Furthermore, we saw that deep trees tend to have high variance (and low bias), whereas shallow trees tend to be overly biased (but have low variance).
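The trade-off can be checked with a small simulation. The sketch below is a toy one-dimensional regression tree written from scratch for illustration, not the implementation used elsewhere in this chapter. The sine-shaped true function, the noise level of 0.3, the evaluation point `x0 = 0.25`, and all function names are assumptions. It repeatedly draws a training set, fits a shallow and a deep tree, and estimates the squared bias and variance of the prediction at `x0`.

```python
import math
import random
import statistics


def sse(vals):
    """Sum of squared deviations from the mean."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)


def fit_tree(xs, ys, max_depth):
    """Fit a 1-D regression tree by greedy binary splitting on SSE.

    Returns a float (leaf prediction) or a tuple (threshold, left, right).
    """
    if max_depth == 0 or len(set(xs)) < 2:
        return sum(ys) / len(ys)  # leaf: predict the mean response
    pairs = sorted(zip(xs, ys))
    best_cost, best_i = None, None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best_cost is None or cost < best_cost:
            best_cost, best_i = cost, i
    threshold = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    lx, ly = zip(*pairs[:best_i])
    rx, ry = zip(*pairs[best_i:])
    return (threshold,
            fit_tree(lx, ly, max_depth - 1),
            fit_tree(rx, ry, max_depth - 1))


def predict(tree, x):
    """Walk down the tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree


def simulate(max_depth, n_sims=200, n_train=50, x0=0.25, seed=1):
    """Estimate squared bias and variance of the prediction at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_sims):
        xs = [rng.random() for _ in range(n_train)]
        # Assumed data-generating process: sine curve plus Gaussian noise.
        ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
        preds.append(predict(fit_tree(xs, ys, max_depth), x0))
    truth = math.sin(2 * math.pi * x0)
    bias_sq = (statistics.fmean(preds) - truth) ** 2
    variance = statistics.pvariance(preds)
    return bias_sq, variance


shallow_bias_sq, shallow_var = simulate(max_depth=1)
deep_bias_sq, deep_var = simulate(max_depth=10)

print(f"depth 1:  bias^2 = {shallow_bias_sq:.4f}, variance = {shallow_var:.4f}")
print(f"depth 10: bias^2 = {deep_bias_sq:.4f}, variance = {deep_var:.4f}")
```

Under these assumptions, the depth-1 tree averages over roughly half of the sine curve, so its prediction at `x0` is systematically too low but changes little from sample to sample. The depth-10 tree follows individual noisy observations, so it is close to the truth on average but varies much more from one training set to the next.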