Exploring residuals for classification models
Because their range is limited to \([-1,1]\), the residuals \(r_i\) are not very useful for exploring the probability of observing \(y_i\).
If all explanatory variables are categorical with a limited number of categories, residuals that approximately follow a standard normal distribution can be obtained with the following steps:
- Divide the observed values into \(K\) groups, each sharing the same predicted value \(f_k\).
- Average the residuals \(r_i\) within each group and standardize the average by dividing it by \(\sqrt{f_k(1-f_k)/n_k}\), where \(n_k\) is the number of observations in group \(k\).
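
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residuals are taken to be \(r_i = y_i - f_i\) with \(y_i \in \{0,1\}\), observations are grouped by exactly equal predicted values, and the function name `standardized_group_residuals` is hypothetical rather than part of any library.

```python
from collections import defaultdict
from math import sqrt


def standardized_group_residuals(y, f):
    """Return {f_k: z_k}, the standardized mean residual for each group.

    y: observed binary outcomes (0 or 1)
    f: predicted probabilities, one per observation
    Assumption: residuals are defined as r_i = y_i - f_i.
    """
    # Step 1: group the residuals by their shared predicted value f_k.
    groups = defaultdict(list)
    for y_i, f_i in zip(y, f):
        groups[f_i].append(y_i - f_i)

    # Step 2: average per group and divide by sqrt(f_k * (1 - f_k) / n_k).
    z = {}
    for f_k, residuals in groups.items():
        if not 0.0 < f_k < 1.0:
            continue  # the standard error is zero when f_k is 0 or 1
        n_k = len(residuals)
        mean_residual = sum(residuals) / n_k
        standard_error = sqrt(f_k * (1.0 - f_k) / n_k)
        z[f_k] = mean_residual / standard_error
    return z


# Illustrative data only: two groups with predicted values 0.5 and 0.75.
y = [1, 1, 1, 1, 1, 0, 1, 1]
f = [0.5, 0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75]
for f_k, z_k in standardized_group_residuals(y, f).items():
    print(f_k, z_k)
```

Values of \(z_k\) far from zero (for instance, beyond roughly \(\pm 2\)) would suggest that the predicted value \(f_k\) fits the corresponding group poorly. The normal approximation is only reasonable when the groups are not too small.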