13.18 Collinearity

When you have several measures of the same concept, it is tempting to think that including them all in the model will let them complement each other or add up to one combined effect. In reality, each coefficient then reflects only the variation in that measure that is unrelated to the other measures.

  • happens, e.g., when several included variables measure the same latent variable
  • highly correlated predictors inflate the standard errors of their coefficients
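
One way to see where the larger standard errors come from, assuming ordinary least squares with an intercept and constant error variance \(\sigma^2\) (assumptions added here, not stated in the notes): the sampling variance of the coefficient on predictor \(j\) can be written as

\[
\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{\sum_i (x_{ij} - \bar{x}_j)^2} \cdot \frac{1}{1 - R^2_j}
\]

where \(R^2_j\) is the \(R^2\) from regressing predictor \(j\) on all the other predictors. As \(R^2_j\) approaches 1, the second factor grows without bound, so the less independent variation a predictor has, the noisier its coefficient estimate.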

Addressing this

  • dimension reduction: e.g. latent factor analysis, PCA
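  • variance inflation factor (VIF)
  • \(VIF_j = \frac{1}{1-R^2_j}\), where \(R^2_j\) is the \(R^2\) from regressing variable \(j\) on the other predictors
  • common rule of thumb: exclude variable \(j\) if \(VIF_j > 10\)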
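
The VIF calculation can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names (`solve_linear_system`, `ols_r_squared`, `vif`) and the toy data are made up for this example, and a real analysis would more likely rely on a statistics library. Each predictor is regressed on the others (with an intercept), and the resulting \(R^2_j\) is plugged into the formula above.

```python
def solve_linear_system(A, c):
    """Solve A b = c by Gaussian elimination with partial pivoting."""
    k = len(c)
    M = [row[:] + [ci] for row, ci in zip(A, c)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for j in range(col, k + 1):
                M[r][j] -= factor * M[col][j]
    b = [0.0] * k
    for r in reversed(range(k)):  # back substitution
        tail = sum(M[r][j] * b[j] for j in range(r + 1, k))
        b[r] = (M[r][k] - tail) / M[r][r]
    return b


def ols_r_squared(y, X):
    """R^2 from regressing y on the rows of X plus an intercept."""
    n = len(y)
    rows = [[1.0] + list(r) for r in X]  # prepend intercept column
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(rows[i][p] * rows[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(rows[i][p] * y[i] for i in range(n)) for p in range(k)]
    b = solve_linear_system(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """VIF_j = 1 / (1 - R^2_j) for each predictor column."""
    result = []
    for j, target in enumerate(columns):
        others = [col for i, col in enumerate(columns) if i != j]
        X = list(zip(*others))  # one row per observation
        result.append(1.0 / (1.0 - ols_r_squared(target, X)))
    return result


# Toy data (invented): x2 is x1 plus small noise, x3 is nearly unrelated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2]
x3 = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]

vifs = vif([x1, x2, x3])
for name, value in zip(["x1", "x2", "x3"], vifs):
    flag = "high" if value > 10 else "ok"  # rule-of-thumb cutoff from the notes
    print(f"{name}: VIF = {value:.2f} ({flag})")
```

In this toy data, `x1` and `x2` are near-duplicates, so both land far above the cutoff of 10, while `x3` stays close to 1. Both of the correlated variables get flagged, so the VIF alone does not say which one to drop; that choice still has to come from the research question.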