11.1 Assumptions of Regression Analysis
Validity (“rarely meet all (if any) of these criteria”)
- Model should include all relevant predictors
- Outcome should accurately reflect phenomenon of interest
- Model should generalize to cases to which it will apply
Representativeness (conditioned on predictors)
Additivity and Linearity
Independence of errors
Equal Variance of errors
Normality of errors (“typically barely important at all”; see Exercises 11.3 and 11.6)
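The assumptions about the errors can be inspected from residuals. A minimal sketch on simulated data (all numbers and variable names are assumptions made up for illustration, not from the book): fit a least-squares line with NumPy, then compare the residual spread below and above the median fitted value as a rough check of equal variance.

```python
import numpy as np

# Simulated data; every number here is made up for illustration.
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(0, 10, n)
# Error sd grows with x, so equal variance fails by construction.
y = 1.0 + 2.0 * x + rng.normal(0, 0.5 + 0.5 * x)

# Least-squares fit of y on an intercept and x.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
resid = y - fitted

# Compare residual spread below and above the median fitted value.
low = fitted < np.median(fitted)
sd_low = resid[low].std()
sd_high = resid[~low].std()
spread_ratio = sd_high / sd_low
print(f"residual sd (low fitted):  {sd_low:.2f}")
print(f"residual sd (high fitted): {sd_high:.2f}")
print(f"ratio high/low:            {spread_ratio:.2f}")
```

A ratio far from 1 points to unequal variance. In practice a plot of residuals against fitted values shows the same pattern more clearly.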
How to Deal With Failures of Assumptions
Extend model (e.g. measurement error models)
Change data or model, for example:
Failure of additivity: Transform the data (e.g. take logs) or add interactions
Failure of linearity: Transform predictors (e.g. use log(x) or 1/x in place of x)
Non-representative: Add predictors
Change or restrict questions to align more closely with the data.
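To illustrate the "transform the data" fix for non-additivity: if the outcome is a product of terms, it becomes a sum on the log scale. A minimal sketch, assuming a synthetic multiplicative process (the constants 2.0, 0.5, and 1.5 and all variable names are arbitrary choices for illustration):

```python
import numpy as np

# Synthetic multiplicative process; the constants are arbitrary.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.uniform(1, 10, n)
x2 = rng.uniform(1, 10, n)
y = 2.0 * x1**0.5 * x2**1.5 * np.exp(rng.normal(0, 0.1, n))

# On the log scale the model is additive and linear:
#   log y = log 2 + 0.5 * log x1 + 1.5 * log x2 + error
X = np.column_stack([np.ones(n), np.log(x1), np.log(x2)])
coef_log, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print("intercept, slope on log x1, slope on log x2:")
print(np.round(coef_log, 2))
```

The fitted coefficients should land near log(2), 0.5, and 1.5, the values built into the simulation.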
Causal Inference
More assumptions are needed if a regression is to be given a causal interpretation.
Example:
Causal: “Effect of a variable with all else held constant.” This reading would be an error for the earnings data! (What would it mean to increase someone's height and observe the effect on earnings?)
Non-causal: “Average difference in earnings comparing two people who differ by height”
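The gap between the two readings can be shown with a simulation. This is a sketch on made-up data, not the book's earnings data: the variable names, sample size, and all constants are assumptions. By construction, height has no effect on earnings here; both depend on a third variable, `male`.

```python
import numpy as np

# Made-up data: height has NO effect on earnings by construction.
rng = np.random.default_rng(2)
n = 5000
male = rng.integers(0, 2, n)
height = 64 + 5.5 * male + rng.normal(0, 2.7, n)
earnings = 20000 + 10000 * male + rng.normal(0, 5000, n)

def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Slope on height alone: average difference in earnings per unit of height.
slope_height_only = ols([height], earnings)[1]
# Slope on height after also including male as a predictor.
slope_height_adjusted = ols([height, male], earnings)[1]

print(f"height only:        {slope_height_only:.0f}")
print(f"height, given male: {slope_height_adjusted:.0f}")
```

The height-only slope is clearly positive, so the descriptive statement (taller people earn more on average) holds in this simulated population. The causal statement does not: changing height would change nothing, and the slope shrinks toward zero once `male` is in the model.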
11.1.0.1 Exercise 11.2: Descriptive and causal inference:
For the model in Section 7.1 predicting presidential vote share from the economy, describe the coefficient for economic growth in purely descriptive, non-causal terms.
Explain the difficulties of interpreting that coefficient as the effect of economic growth on the incumbent party’s vote share.
More in Part 4!