12.4 Poisson Regression
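$$Y_i | \lambda_i \sim \text{Pois}(\lambda_i)$$
- expected value: $E(Y_i | \lambda_i) = \lambda_i$
- variance: $\text{Var}(Y_i | \lambda_i) = \lambda_i$
- does $\lambda_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3}$?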
library(ggplot2)

# `equality` is assumed to be the data frame prepared earlier in the chapter
equality |>
  ggplot(aes(x = percent_urban, y = laws, group = historical)) +
  # one straight-line (normal regression) fit per historical voting group
  geom_smooth(aes(color = historical),
              formula = y ~ x,
              linewidth = 3,
              method = "lm",
              se = FALSE) +
  labs(title = "Anti-Discrimination Laws",
       subtitle = "Human Rights Campaign State Equality Index",
       caption = "R4DS Bayes Rules book club") +
  scale_color_manual(values = c("blue", "red", "purple")) +
  theme_minimal() +
  xlim(0, 100)
- observe that some of the predicted counts (for number of laws) are negative!
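To make the problem concrete without the plot, here is a minimal base R sketch. The data are made-up toy counts (not the equality data), and the names `toy`, `linear_fit`, and `linear_pred` are hypothetical.

```r
# Made-up toy counts that grow quickly with x (NOT the equality data)
toy <- data.frame(
  x = c(0, 20, 40, 60, 80, 100),
  y = c(0, 0, 1, 3, 8, 20)
)

# Ordinary least squares line, the same kind of fit as geom_smooth(method = "lm")
linear_fit <- lm(y ~ x, data = toy)

# Fitted values at the observed x values
linear_pred <- predict(linear_fit, newdata = toy)
print(round(linear_pred, 2))

# Does any predicted "count" fall below zero?
any(linear_pred < 0)
```

For these toy values the fitted intercept is negative, so the line predicts a negative count at x = 0, which is impossible for a count.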
12.4.1 Log-Link Function
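$$Y_i | \beta_0, \beta_1, \beta_2, \beta_3 \sim \text{Pois}(\lambda_i)$$
with
$$\log(\lambda_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} \quad \text{or, equivalently,} \quad \lambda_i = e^{\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3}}$$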
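The effect of the log link can be checked numerically. The sketch below uses base R's frequentist `glm()` as a stand-in for the Bayesian model, with made-up toy counts (not the equality data). The names `toy`, `pois_fit`, `log_lambda`, and `lambda` are hypothetical.

```r
# Made-up toy counts (NOT the equality data)
toy <- data.frame(
  x = c(0, 20, 40, 60, 80, 100),
  y = c(0, 0, 1, 3, 8, 20)
)

# Frequentist Poisson regression with a log link (a stand-in for stan_glm)
pois_fit <- glm(y ~ x, family = poisson(link = "log"), data = toy)

# Linear predictor: log(lambda_i) = b0 + b1 * x_i, which can be any real number
log_lambda <- predict(pois_fit, type = "link")

# Rate: lambda_i = exp(b0 + b1 * x_i), which is always positive
lambda <- predict(pois_fit, type = "response")

all(lambda > 0)
all.equal(unname(exp(log_lambda)), unname(lambda))

# exp(b1) is the multiplicative change in the rate per one-unit increase in x
exp(coef(pois_fit)[["x"]])
```

Because the coefficients act on the log scale, each one is read multiplicatively after exponentiating: a one-unit increase in a predictor multiplies the expected count by the exponentiated coefficient.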
12.4.2 rstanarm
library(rstanarm)

# prior_PD = TRUE simulates from the prior only, ignoring the observed data
equality_model_prior <- stan_glm(laws ~ percent_urban + historical,
                                 data = equality,
                                 family = poisson,
                                 prior_intercept = normal(2, 0.5),
                                 prior = normal(0, 2.5, autoscale = TRUE),
                                 chains = 4, iter = 5000*2, seed = 84735,
                                 prior_PD = TRUE)
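Since the model uses a log link, the `normal(2, 0.5)` intercept prior is stated on the log scale. A minimal base R sketch of what it implies on the count scale follows; the variable names are hypothetical, and it assumes the intercept describes a typical state (rstanarm places `prior_intercept` on the intercept with predictors centered).

```r
# Prior on the intercept, on the log scale: normal(2, 0.5)
prior_mean <- 2
prior_sd   <- 0.5

# Prior median of the typical count, back on the count scale
prior_median <- exp(prior_mean)
prior_median

# Central 95% prior interval for the typical count
prior_interval <- exp(qnorm(c(0.025, 0.975), mean = prior_mean, sd = prior_sd))
prior_interval

# Same idea by simulation: draw log-rates, then exponentiate
set.seed(84735)
prior_draws <- exp(rnorm(10000, mean = prior_mean, sd = prior_sd))
quantile(prior_draws, probs = c(0.025, 0.5, 0.975))
```

So this prior centers the typical number of laws around 7 (since exp(2) is about 7.4), with plausible values ranging from roughly 3 to 20.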
12.4.3 Poisson Regression Assumptions
- Structure of the data: Conditioned on predictors $X$, the observed data $Y_i$ on case $i$ is independent of the observed data on any other case $j$.
- Structure of the variable Y: Response variable $Y$ has a Poisson structure, i.e., is a discrete count of events that happen in a fixed interval of space or time.
- Structure of the relationship: The logged average $Y$ value can be written as a linear combination of the predictors: $\log(\lambda_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3}$.
- Structure of the variability in Y: A Poisson random variable $Y$ with rate $\lambda$ has equal mean and variance, $E(Y) = \text{Var}(Y) = \lambda$. Thus, conditioned on predictors $X$, the typical value of $Y$ should be roughly equivalent to the variability in $Y$. As such, the variability in $Y$ increases as its mean increases.
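The equal mean and variance property in the last assumption can be illustrated by simulation. This is a minimal base R sketch; the rates in `rates` are hypothetical values chosen only for illustration, and `sim_summary` is a hypothetical name.

```r
set.seed(84735)

# Hypothetical Poisson rates to compare
rates <- c(2, 7, 20)

# For each rate, simulate many Poisson counts and record the sample mean and variance
sim_summary <- t(sapply(rates, function(lambda) {
  y <- rpois(100000, lambda)
  c(lambda = lambda, mean = mean(y), variance = var(y))
}))

# One row per rate; the mean and variance columns should nearly match lambda
print(round(sim_summary, 2))
```

In each row the sample mean and sample variance should both sit close to the rate, and the variance grows as the mean grows. A similar comparison of mean and variance within groups of real data gives a rough check of this assumption.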