12.9 Data Set 2
In 2017, Cards Against Humanity Saves America launched a series of monthly surveys in order to get the “Pulse of the Nation”.
- \(Y\): number of books somebody has read in the past year
- \(X_{1}\): age
- \(X_{2}\): whether they’d rather be wise but unhappy or happy but unwise
\[X_{2} = \begin{cases} 1 & \text{wise but unhappy} \\ 0 & \text{happy but unwise}\end{cases}\]
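# Load packages
library(bayesrules) # pulse_of_the_nation data
library(tidyverse)  # %>%, filter(), ggplot()
library(patchwork)  # combine plots with +
# Load data
data(pulse_of_the_nation)
pulse <- pulse_of_the_nation %>%
  filter(books < 100) # avoid outliers
# Histogram of the response
p1 <- ggplot(pulse, aes(x = books)) +
  geom_histogram(color = "white")
# Books vs. age
p2 <- ggplot(pulse, aes(y = books, x = age)) +
  geom_point()
# Books by wise/unwise preference
p3 <- ggplot(pulse, aes(y = books, x = wise_unwise)) +
  geom_boxplot()
# patchwork
p1 + p2 + p3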
12.9.1 Poisson Regression
Should we model books with Poisson regression?
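For reference, a Poisson regression of \(Y\) on \(X_{1}\) and \(X_{2}\) is usually written with a log link. The sketch below assumes that standard form; the coefficient symbols \(\beta_0, \beta_1, \beta_2\) and the observation index \(i\) are notation introduced here, not taken from the text above.
\[Y_i \mid \beta_0, \beta_1, \beta_2 \overset{\text{ind}}{\sim} \text{Pois}(\lambda_i) \quad \text{with} \quad \log(\lambda_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2}\]
Under this model, \(E(Y_i \mid \beta_0, \beta_1, \beta_2) = \text{Var}(Y_i \mid \beta_0, \beta_1, \beta_2) = \lambda_i\), so the typical number of books read and the variability in books read are assumed to be roughly equal at any given age and wise/unwise preference.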
books_poisson_sim <- stan_glm(
  books ~ age + wise_unwise,
  data = pulse, family = poisson,
  prior_intercept = normal(0, 2.5, autoscale = TRUE),
  prior = normal(0, 2.5, autoscale = TRUE),
  prior_aux = exponential(1, autoscale = TRUE),
  chains = 4, iter = 5000*2, seed = 84735)
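One informal way to approach the question of whether Poisson regression is appropriate is to compare the sample variance of a count variable to its sample mean, since a Poisson model assumes the two are roughly equal. The following is a minimal sketch in base R. The helper `dispersion_ratio()` is hypothetical (not from any package), and the counts are simulated for illustration only; they are not the survey data.

```r
# Hypothetical helper: ratio of sample variance to sample mean.
# For Poisson-distributed counts this ratio should be close to 1.
dispersion_ratio <- function(y) {
  var(y) / mean(y)
}

set.seed(84735)

# Simulated counts, for illustration only (NOT the survey data)
y_pois <- rpois(1000, lambda = 10)            # variance equals the mean
y_nb   <- rnbinom(1000, mu = 10, size = 0.5)  # variance far exceeds the mean

ratio_pois <- dispersion_ratio(y_pois)
ratio_nb   <- dispersion_ratio(y_nb)

# Print both ratios: the first should sit near 1, the second well above 1
ratio_pois
ratio_nb
```

Applied to the data in this section, the analogous call would be `dispersion_ratio(pulse$books)`. Note that this is only a rough marginal check: the Poisson regression assumption concerns the mean and variance of books at each combination of age and wise/unwise preference, so a ratio far above 1 is a warning sign rather than a formal test.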