Exploring residuals for classification models

  • Due to their limited range (\([-1,1]\)), the residuals \(r_i\) are not very useful to explore the probability of observing \(y_i\).

  • If all explanatory variables are categorical with a limited number of categories, the standard-normal approximation is likely to hold if we follow these steps:

    • Divide the observed values into \(K\) groups sharing the same predicted value \(f_k\).
    • Average the residuals \(r_i\) per group and standardize the averages by dividing them by \(\sqrt{f_k(1-f_k)/n_k}\), where \(n_k\) is the number of observations in group \(k\).
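
The two steps above can be sketched in Python with NumPy. This is a minimal illustration, not a reference implementation. It assumes that the residuals are defined as \(r_i = y_i - f_i\) with \(y_i \in \{0,1\}\), that observations in the same group have exactly identical predicted values, and that every \(f_k\) lies strictly between 0 and 1. The function name `grouped_standardized_residuals` is hypothetical.

```python
import numpy as np


def grouped_standardized_residuals(y, f):
    """Return (f_k, n_k, z_k) for each group of identical predicted values.

    y : observed 0/1 outcomes
    f : predicted probabilities, one per observation
    """
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)

    # Assumed residual definition: r_i = y_i - f_i, which lies in [-1, 1].
    r = y - f

    # Step 1: group observations that share the same predicted value f_k.
    # Exact equality is assumed; round f beforehand if predictions differ
    # only by floating-point noise.
    f_k, group, n_k = np.unique(f, return_inverse=True, return_counts=True)

    # Step 2: average the residuals within each group ...
    mean_r = np.bincount(group, weights=r, minlength=f_k.size) / n_k

    # ... and standardize with sqrt(f_k (1 - f_k) / n_k).
    # Assumes 0 < f_k < 1, otherwise the denominator is zero.
    z_k = mean_r / np.sqrt(f_k * (1.0 - f_k) / n_k)
    return f_k, n_k, z_k
```

If the model fits well, the values in `z_k` should behave roughly like draws from a standard normal distribution, so group values far outside \(\pm 2\) deserve a closer look.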