9.7 - Final thoughts | Hands-On Machine Learning with R Book Club

9.7 - Final thoughts

Decision trees have a number of advantages:

They require very little preprocessing.
They can easily handle categorical features without preprocessing.
They can handle missing values, either by creating a new “missing” class for categorical variables or by using surrogate splits (see Therneau, Atkinson, and others (1997) for details).
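
As a rough illustration of the “missing” class idea, the sketch below recodes absent values in a categorical feature as their own level, so that a tree can split on them like any other category. It is written in Python for brevity; the function name `add_missing_level`, the label `"missing"`, and the toy `roof_style` data are all hypothetical and are not taken from the book or from any library.

```python
def add_missing_level(values, label="missing"):
    """Return a copy of `values` with None replaced by an explicit level.

    Hypothetical helper: treats None as the marker for a missing value.
    """
    return [label if v is None else v for v in values]


# Toy categorical feature with two missing entries (made-up data).
roof_style = ["gable", None, "hip", "gable", None]
recoded = add_missing_level(roof_style)

# The set of levels a tree could now split on, including "missing".
levels = sorted(set(recoded))
print(recoded)
print(levels)
```

Surrogate splits work differently: rather than recoding the data, the tree falls back on a split on another feature that closely mimics the primary one when the primary feature's value is missing.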

However, individual decision trees generally do not achieve state-of-the-art predictive accuracy.

Furthermore, we saw that deep trees tend to have high variance (and low bias), while shallow trees tend to be overly biased (but have low variance).
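
The trade-off between deep and shallow trees can be checked with a small simulation. The sketch below is a minimal one-feature regression tree written in plain Python, not the implementation used in the chapter. It assumes a sine-shaped true function, Gaussian noise, and arbitrary settings (50 training points, 100 repetitions, depths 1 and 8), and it estimates squared bias and variance by refitting the tree on many freshly simulated training sets.

```python
import math
import random


def sse(ys):
    """Sum of squared errors around the mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_tree(xs, ys, max_depth):
    """Greedy 1-D regression tree: a float leaf or (threshold, left, right)."""
    if max_depth == 0 or len(set(xs)) < 2:
        return sum(ys) / len(ys)  # leaf predicts the mean response
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        total = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best is None or total < best[0]:
            best = (total, i)
    i = best[1]
    threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    lx, ly = zip(*pairs[:i])
    rx, ry = zip(*pairs[i:])
    return (threshold,
            fit_tree(lx, ly, max_depth - 1),
            fit_tree(rx, ry, max_depth - 1))


def predict(tree, x):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree


def bias_variance(max_depth, n_train=50, n_reps=100, noise_sd=0.3, seed=1):
    """Estimate average squared bias and variance over a grid of test points."""
    rng = random.Random(seed)
    grid = [i / 100 for i in range(1, 100)]
    truth = [math.sin(2 * math.pi * x) for x in grid]
    preds = []
    for _ in range(n_reps):
        xs = [rng.random() for _ in range(n_train)]
        ys = [math.sin(2 * math.pi * x) + rng.gauss(0, noise_sd) for x in xs]
        tree = fit_tree(xs, ys, max_depth)
        preds.append([predict(tree, x) for x in grid])
    bias2 = var = 0.0
    for j, f in enumerate(truth):
        col = [p[j] for p in preds]
        avg = sum(col) / n_reps
        bias2 += (avg - f) ** 2
        var += sum((c - avg) ** 2 for c in col) / n_reps
    return bias2 / len(grid), var / len(grid)


shallow = bias_variance(max_depth=1)  # a single split (a "stump")
deep = bias_variance(max_depth=8)     # grown until nearly every point is a leaf
print("shallow: bias^2 = %.3f, variance = %.3f" % shallow)
print("deep:    bias^2 = %.3f, variance = %.3f" % deep)
```

Under these assumptions, the stump should show the larger squared bias and the deep tree the larger variance. The exact numbers depend on the seed, the noise level, and the sample size.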