9.7 Posterior prediction

Quiz

Suppose a weather report indicates that tomorrow will be a 75-degree day in D.C. What’s your posterior guess of the number of riders that Capital Bikeshare should anticipate?

Answer:

One option is to plug 75 into the posterior median model:

\[ -2194.24 + 82.16 \times 75 = 3967.76 \]

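But this single number ignores two sources of variability:

  • Sampling variability: ridership on any given day deviates from the model average

  • Posterior variability: the intercept, slope and \(\sigma\) are themselves uncertain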

9.7.1 Building a posterior predictive model

The posterior predictive model takes both kinds of variability into account. We can approximate it by simulating one prediction from each of our 20,000 posterior parameter sets.
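
Written out as a sketch, with \(\beta_0\), \(\beta_1\) and \(\sigma\) denoting the intercept, the temperature coefficient and the residual standard deviation (notation assumed here; they correspond to the `(Intercept)`, `temp_feel` and `sigma` columns of the chain) and \(\vec{y}\) the observed ridership data, the posterior predictive density averages the sampling model over the posterior:

\[ f(y_{\text{new}} \mid \vec{y}) = \int\int\int f(y_{\text{new}} \mid \beta_0, \beta_1, \sigma) \, f(\beta_0, \beta_1, \sigma \mid \vec{y}) \, d\beta_0 \, d\beta_1 \, d\sigma \]

The simulation replaces the integral with the chain: for each parameter set \(i\), compute \(\mu^{(i)} = \beta_0^{(i)} + \beta_1^{(i)} \times 75\) and then draw \(Y_{\text{new}}^{(i)} \sim N\left(\mu^{(i)}, (\sigma^{(i)})^2\right)\).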

# Predict rides for each parameter set in the chain
set.seed(84735)
predict_75 <- bike_model_df %>% 
  mutate(mu = `(Intercept)` + temp_feel*75, # <- at 75 degrees
         y_new = rnorm(20000, mean = mu, sd = sigma)) # <- sampling var 
head(predict_75, 3)
  (Intercept) temp_feel sigma   mu y_new
1       -2657     88.16  1323 3955  4838
2       -2188     83.01  1323 4038  3874
3       -1984     81.54  1363 4132  5196

Note the difference between the mu (\(\mu\)) and y_new (\(Y_{\text{new}}\)) columns: y_new adds sampling variability around mu.

# Construct 95% posterior credible intervals (2.5th and 97.5th percentiles)
predict_75 %>% 
  summarize(lower_mu = quantile(mu, 0.025),
            upper_mu = quantile(mu, 0.975),
            lower_new = quantile(y_new, 0.025),
            upper_new = quantile(y_new, 0.975))
  lower_mu upper_mu lower_new upper_new
1     3843     4095      1500      6482
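
  • \(\mu\) is the average ridership across all 75-degree days

  • \(Y_\text{new}\) is the ridership on one specific 75-degree day

\(\Rightarrow\) We can predict an average with more precision than a single day: the interval for \(\mu\) (3843, 4095) is far narrower than the one for \(Y_\text{new}\) (1500, 6482).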
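
To see why the two intervals differ so much in width, here is a minimal base R sketch. It does not use the real chain: the draws of `mu` and `sigma` are synthetic, and their means and standard deviations are assumed values chosen only to loosely resemble the output above.

```r
# Illustrative only: synthetic draws standing in for a real Markov chain.
# The means and standard deviations below are assumed values, not the
# posterior from bike_model.
set.seed(84735)
n_draws <- 20000

# Hypothetical posterior draws of the average ridership at 75 degrees (mu)
# and of the residual standard deviation (sigma)
mu    <- rnorm(n_draws, mean = 3970, sd = 65)
sigma <- abs(rnorm(n_draws, mean = 1280, sd = 40))

# One simulated day per draw: adds sampling variability on top of mu
y_new <- rnorm(n_draws, mean = mu, sd = sigma)

# Middle 95% of each set of draws
ci_mu  <- quantile(mu,    probs = c(0.025, 0.975))
ci_new <- quantile(y_new, probs = c(0.025, 0.975))

# Compare interval widths: y_new carries both sources of variability
width_mu  <- unname(diff(ci_mu))
width_new <- unname(diff(ci_new))
width_new / width_mu
```

The spread of `mu` reflects posterior variability alone, whereas `y_new` stacks the day-to-day noise governed by `sigma` on top of it, so its interval is much wider.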

9.7.2 Posterior prediction with rstanarm

We built the predictions from scratch, but rstanarm::posterior_predict() does the same in one step:

# Simulate a set of predictions
set.seed(84735)
shortcut_prediction <- 
  posterior_predict(bike_model, newdata = data.frame(temp_feel = 75))
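
The result is a matrix with one row per posterior draw and one column per row of newdata, so it can be summarized like any other set of draws. The sketch below uses a synthetic stand-in matrix (the name `stand_in_prediction` and its mean and sd are assumptions for illustration, not real predictions) so that it runs without the fitted model.

```r
# Stand-in for the object returned by posterior_predict(): a matrix with
# one row per posterior draw and one column per row of newdata.
# The values are synthetic (assumed mean and sd), not real predictions.
set.seed(84735)
stand_in_prediction <- matrix(rnorm(20000, mean = 3970, sd = 1280), ncol = 1)

dim(stand_in_prediction)   # 20000 draws, 1 new observation

# 95% posterior prediction interval for the single new observation
pred_interval <- quantile(stand_in_prediction[, 1], probs = c(0.025, 0.975))
pred_interval

# Posterior predictive median as a single-number summary
pred_median <- median(stand_in_prediction[, 1])
pred_median
```

With the real object, the same quantile() call should apply to shortcut_prediction[, 1]. rstanarm's posterior_interval(shortcut_prediction, prob = 0.95) should give an equivalent interval.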