9.7 Posterior prediction

Quiz

Suppose a weather report indicates that tomorrow will be a 75-degree day in D.C. What’s your posterior guess of the number of riders that Capital Bikeshare should anticipate?

Answer:

One option is to plug 75 degrees into the posterior median model:

$-2194.24 + 82.16 \times 75 = 3967.76$

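But this single number ignores two sources of variability:

Sampling variability: observed ridership on any given day deviates from the model average
Posterior variability: the model parameters themselves are uncertain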
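
As a quick check of the arithmetic, the plug-in calculation above can be reproduced in R. This is only a minimal sketch: the variable names `intercept`, `slope`, and `temp` are ours, and the values are the two coefficients from the equation above.

```r
# Coefficients copied from the equation above (variable names are illustrative)
intercept <- -2194.24
slope <- 82.16
temp <- 75  # tomorrow's forecast, in degrees

# A single number: no uncertainty is attached to it
point_prediction <- intercept + slope * temp
point_prediction
```

The result is one value with no interval around it, which is exactly the limitation the two kinds of variability point to.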

9.7.1 Building a posterior predictive model

The posterior predictive model takes both kinds of variability into account. We can approximate it by simulating one prediction from each of our 20,000 posterior parameter sets.

# Predict rides for each parameter set in the chain
set.seed(84735)
predict_75 <- bike_model_df %>% 
  mutate(mu = `(Intercept)` + temp_feel*75, # <- at 75 degrees
         y_new = rnorm(20000, mean = mu, sd = sigma)) # <- sampling var 
head(predict_75, 3)
  (Intercept) temp_feel sigma   mu y_new
1       -2657     88.16  1323 3955  4838
2       -2188     83.01  1323 4038  3874
3       -1984     81.54  1363 4132  5196
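
To see where the mu and y_new columns come from, here is a minimal base-R sketch that redoes the calculation for just the three rows printed above. It rests on a few assumptions: the parameter values are the rounded ones from the printed output, the data frame name `first_three` and its `intercept` column are ours, and the random draws are not the same ones used in the chain-wide call, so y_new will not reproduce the printed values.

```r
# The three parameter sets shown above (rounded as printed)
first_three <- data.frame(
  intercept = c(-2657, -2188, -1984),
  temp_feel = c(88.16, 83.01, 81.54),
  sigma     = c(1323, 1323, 1363)
)

# Average ridership at 75 degrees under each parameter set
first_three$mu <- first_three$intercept + first_three$temp_feel * 75

# One simulated day per parameter set: mu plus sampling noise
set.seed(84735)
first_three$y_new <- rnorm(nrow(first_three),
                           mean = first_three$mu,
                           sd = first_three$sigma)

round(first_three$mu)
```

Passing `nrow(first_three)` to `rnorm()` rather than a hard-coded count keeps the number of draws tied to the number of parameter sets.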

The key comparison is between the columns mu ( $\mu$ ) and y_new ( $Y_{\text{new}}$ ): y_new adds sampling noise with standard deviation sigma around mu.

# Construct 95% posterior credible intervals
predict_75 %>% 
  summarize(lower_mu = quantile(mu, 0.025),
            upper_mu = quantile(mu, 0.975),
            lower_new = quantile(y_new, 0.025),
            upper_new = quantile(y_new, 0.975))
  lower_mu upper_mu lower_new upper_new
1     3843     4095      1500      6482

$\mu$ is the average ridership across all 75-degree days
$Y_\text{new}$ is the ridership on one specific 75-degree day

$\Rightarrow$ We can predict an average far more precisely than a single observation!
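
To put a number on that difference, the interval widths can be compared directly. This small sketch only copies the four endpoints from the printed output above; the variable names `width_mu` and `width_new` are ours.

```r
# Interval endpoints copied from the summarize() output above
lower_mu  <- 3843
upper_mu  <- 4095
lower_new <- 1500
upper_new <- 6482

width_mu  <- upper_mu - lower_mu     # 252 rides
width_new <- upper_new - lower_new   # 4982 rides
width_new / width_mu                 # roughly 20 times wider
```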

9.7.2 Posterior with rstanarm

We built the posterior predictive model from scratch, but rstanarm::posterior_predict() does the same in one call.

# Simulate a set of predictions
set.seed(84735)
shortcut_prediction <- 
  posterior_predict(bike_model, newdata = data.frame(temp_feel = 75))
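
posterior_predict() returns a matrix with one row per posterior draw and one column per row of newdata, so shortcut_prediction has a single column here. The following is a hedged sketch of how such a matrix could be summarized with base R. So that it runs without the fitted model, it uses a simulated stand-in matrix; the name `stand_in_prediction` is ours, and its mean and standard deviation are illustrative placeholders, not results from bike_model.

```r
# Stand-in for shortcut_prediction: 20000 draws x 1 new observation.
# The mean and sd are illustrative placeholders, NOT output from bike_model.
set.seed(84735)
stand_in_prediction <- matrix(rnorm(20000, mean = 3968, sd = 1300), ncol = 1)

# 95% posterior prediction interval from the 2.5th and 97.5th percentiles
interval_95 <- quantile(stand_in_prediction[, 1], probs = c(0.025, 0.975))
interval_95
```

With the real object, replacing `stand_in_prediction` with `shortcut_prediction` should give an interval comparable to the from-scratch one for y_new.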