The previous method may result in a tree that overfits the data. Why?
The tree is too leafy (too complex), so it fits noise in the training data as well as the signal
A better strategy is to have a smaller tree with fewer splits, which can reduce variance and make the results easier to interpret (at the cost of a little bias)
So we will prune: grow a large tree first, then cut it back to a smaller subtree
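
To make the idea concrete, here is a minimal sketch of one common way to prune. The notes do not say which pruning rule is used, so cost-complexity pruning (minimise SSE + alpha * number of leaves) is an assumption here. The toy data, the value of alpha, and the helper names `grow`, `prune`, and `n_leaves` are hypothetical and chosen only for illustration.

```python
import random

def sse(ys):
    """Sum of squared errors of ys around their mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def grow(points):
    """Greedily grow a 1-D regression tree until no further split is possible."""
    points = sorted(points)
    ys = [y for _, y in points]
    node = {"pred": sum(ys) / len(ys), "sse": sse(ys), "left": None, "right": None}
    best = None
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between equal x values
        cost = sse(ys[:i]) + sse(ys[i:])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is not None and node["sse"] > 0:
        i = best[1]
        node["thr"] = (points[i - 1][0] + points[i][0]) / 2
        node["left"], node["right"] = grow(points[:i]), grow(points[i:])
    return node

def prune(node, alpha):
    """Return (cost, subtree) minimising SSE + alpha * (number of leaves).

    Assumed pruning rule: cost-complexity pruning, solved bottom-up.
    """
    leaf = {"pred": node["pred"], "sse": node["sse"], "left": None, "right": None}
    leaf_cost = node["sse"] + alpha
    if node["left"] is None:
        return leaf_cost, leaf
    left_cost, left = prune(node["left"], alpha)
    right_cost, right = prune(node["right"], alpha)
    if leaf_cost <= left_cost + right_cost:
        return leaf_cost, leaf  # collapsing this node is at least as good
    return left_cost + right_cost, dict(node, left=left, right=right)

def n_leaves(node):
    """Count terminal nodes."""
    if node["left"] is None:
        return 1
    return n_leaves(node["left"]) + n_leaves(node["right"])

# Hypothetical toy data: a step function plus Gaussian noise.
random.seed(0)
data = []
for i in range(40):
    x = i / 40
    data.append((x, (1.0 if x > 0.5 else 0.0) + random.gauss(0, 0.1)))

full = grow(data)               # leafy tree that chases the noise
_, pruned = prune(full, 0.5)    # alpha = 0.5 is an arbitrary illustrative penalty
print(n_leaves(full), n_leaves(pruned))
```

The fully grown tree keeps splitting until it can no longer reduce training error, so most of its leaves only describe noise. The penalty alpha charges for each leaf, so a split survives only if it lowers the SSE by more than it costs. In practice alpha would be chosen by cross-validation rather than fixed by hand.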