Standard errors
- Standard errors can be wrong in many ways when the model's assumptions are violated
- We can account for this, though
- correlated errors change the sampling distribution: estimates swing more from sample to sample, so the sampling distribution has a larger standard deviation than the iid formula suggests
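A small simulation can make the last point concrete. This is only a sketch under assumptions that are not in the notes: errors come in 20 groups of 10, every error has variance 1, and `rho` is a made-up within-group correlation. It compares the spread of the sample mean when errors are independent (`rho = 0`) and when they share a group shock (`rho = 0.5`).

```python
import math
import random
import statistics

def sample_mean(rho, rng, n_groups=20, group_size=10):
    """Mean of n_groups * group_size errors, each with variance 1.

    rho is the within-group correlation: each error is a shared group shock
    (variance rho) plus an independent part (variance 1 - rho).
    """
    errors = []
    for _ in range(n_groups):
        shock = rng.gauss(0.0, math.sqrt(rho))  # common to the whole group
        for _ in range(group_size):
            errors.append(shock + rng.gauss(0.0, math.sqrt(1.0 - rho)))
    return statistics.fmean(errors)

def sd_of_sample_means(rho, reps=2000, seed=0):
    """Standard deviation of the sample mean across many simulated samples."""
    rng = random.Random(seed)
    means = [sample_mean(rho, rng) for _ in range(reps)]
    return statistics.stdev(means)

sd_iid = sd_of_sample_means(rho=0.0)
sd_correlated = sd_of_sample_means(rho=0.5)
naive_se = 1.0 / math.sqrt(200)  # what the iid formula reports in both cases

print(f"naive iid formula:        {naive_se:.3f}")
print(f"simulated, rho = 0.0:     {sd_iid:.3f}")
print(f"simulated, rho = 0.5:     {sd_correlated:.3f}")
```

In theory the standard deviation of the mean is \(\sqrt{1/200} \approx 0.071\) for `rho = 0` and \(\sqrt{0.5/20 + 0.5/200} \approx 0.166\) for `rho = 0.5`, so the iid formula understates the uncertainty by more than half in the correlated case.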
Assumptions
- error term \(\epsilon\) is normally distributed; OLS is mostly okay when this fails, as long as the sample is large
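- error term is independent and identically distributed (iid), which can fail in two ways:
- autocorrelation (temporal or spatial): errors are not independent
- heteroskedasticity: errors are not identically distributed
- we have to figure out how this assumption fails in order to pick the right fix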
Fixes (mostly sandwich estimators)
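- heteroskedasticity: Huber-White (heteroskedasticity-robust) standard errors
- autocorrelation: HAC (heteroskedasticity and autocorrelation consistent) standard errors, e.g. Newey-West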
- geographic correlation: Conley spatial standard errors
- hierarchical structure: clustered standard errors, e.g. Liang-Zeger
- right level of clustering: the level at which treatment is assigned, otherwise guided by domain knowledge
- only works with a large number of clusters, roughly \(> 50\); with fewer clusters, use wild cluster bootstrap standard errors
- bootstrapped standard errors
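To show what a sandwich estimator does, here is a minimal sketch of the Huber-White (HC0) standard error for the slope in a simple regression with one regressor. In that special case the sandwich collapses to \(\widehat{\mathrm{Var}}(\hat\beta_1) = \sum_i (x_i - \bar{x})^2 \hat{e}_i^2 / \big(\sum_i (x_i - \bar{x})^2\big)^2\). The function name `ols_slope_with_ses` and the simulated data are illustrative assumptions, not part of the notes.

```python
import math
import random

def ols_slope_with_ses(x, y):
    """Return (slope, classical SE, Huber-White HC0 SE) for y = a + b*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)

    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical SE: assumes one common error variance for all observations.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)

    # HC0 sandwich: "bread" is 1/sxx on both sides, "meat" weights each
    # squared residual by how far its x is from the mean.
    meat = sum((d * e) ** 2 for d, e in zip(dx, resid))
    se_robust = math.sqrt(meat) / sxx

    return slope, se_classical, se_robust

# Illustrative heteroskedastic data: the error spread grows with x.
rng = random.Random(1)
x = [rng.uniform(0.0, 10.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.3 * xi * xi) for xi in x]

slope, se_classical, se_robust = ols_slope_with_ses(x, y)
print(f"slope = {slope:.3f}")
print(f"classical SE = {se_classical:.3f}, Huber-White SE = {se_robust:.3f}")
```

With error variance that grows in \(x\), as assumed here, the robust standard error will typically come out larger than the classical one. The HAC, Conley, and clustered variants keep the same bread and change the meat so that it also includes cross-products of residuals from observations that are allowed to be correlated.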
Bootstrapping
- start with data set with \(N\) observations
- randomly sample \(N\) observations (with replacement)
- estimate the statistic on the resample
- repeat many times (a few thousand)
- look at the distribution of the estimates; its standard deviation is the bootstrap standard error
- can be used for any statistic
- needs large samples
- doesn't perform well with extreme value distributions
- doesn't do well with autocorrelation, since resampling individual observations breaks the dependence between them
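The resampling steps above can be sketched in a few lines. This is an illustration under assumptions, not a reference implementation: the helper name `bootstrap_se`, the 2,000 replications, and the simulated normal data are all made up for the example.

```python
import random
import statistics

def bootstrap_se(data, statistic, reps=2000, seed=0):
    """Bootstrap standard error of `statistic`, plus the individual estimates."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(reps):
        resample = rng.choices(data, k=n)  # N draws with replacement
        estimates.append(statistic(resample))
    return statistics.stdev(estimates), estimates

# Illustrative data: 200 independent draws from a normal distribution.
rng = random.Random(42)
data = [rng.gauss(10.0, 2.0) for _ in range(200)]

se_mean, _ = bootstrap_se(data, statistics.fmean)
se_median, median_draws = bootstrap_se(data, statistics.median)
analytic_se_mean = statistics.stdev(data) / len(data) ** 0.5

print(f"bootstrap SE of mean:   {se_mean:.3f}")
print(f"analytic SE of mean:    {analytic_se_mean:.3f}")
print(f"bootstrap SE of median: {se_median:.3f}")
```

For the mean, the bootstrap standard error should land close to the textbook \(s / \sqrt{N}\), which is a useful sanity check. The median has no equally simple formula, which is where the "any statistic" property pays off.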