Example comparing maximum likelihood and Bayesian inference

  • Simulation study, \(p = \text{logit}^{-1}(a + b x)\).

  • Assume that we expect b to fall approximately between 0 and 1 and impose a prior with mean 0.5 and standard devation 0.5. Stick with default prior on intercept

  • This example is adapted from the tidyros repo! https://github.com/behrman/ros

bayes_sim <- function(n, a = -2, b = 0.8) {
  data <- 
    tibble(
      x = runif(n, min = -1, max = 1),
      y = if_else(0 < rlogis(n, location = a + b * x, scale = 1), 1, 0)
    )
  
  fit_glm <- glm(y ~ x, family = binomial(link = "logit"), data = data)
  fit_stan <- 
    stan_glm(
      y ~ x,
      family = binomial(link = "logit"),
      data = data,
      refresh = 0,
      prior = normal(location = 0.5, scale = 0.5)
    )
  
  arm::display(fit_glm, digits = 1)
  cat("\n")
  print(fit_stan, digits = 1)
}

We next simulate for a range of sample sizes, each time focusing on inference about b, for which a value of 0.8 was used to generate the data.

n = 10

set.seed(363852)

bayes_sim(10)
## glm(formula = y ~ x, family = binomial(link = "logit"), data = data)
##             coef.est coef.se
## (Intercept) -1.9      1.2   
## x            1.5      1.7   
## ---
##   n = 10, k = 2
##   residual deviance = 8.9, null deviance = 10.0 (difference = 1.1)
## 
## stan_glm
##  family:       binomial [logit]
##  formula:      y ~ x
##  observations: 10
##  predictors:   2
## ------
##             Median MAD_SD
## (Intercept) -1.5    0.8  
## x            0.6    0.4  
## 
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
  • Focus on the coefficient of x, which represents the parameter of interest in this hypothetical study.

  • glm() gives a maximum likelihood estimate of 1.5 with large error, indicating little info in 10 data points

  • stan_glm() estimate is influenced by the prior. 0.6 is close to the prior mean of 0.5, being pulled away by the data only slightly.

n = 100

bayes_sim(100)
## glm(formula = y ~ x, family = binomial(link = "logit"), data = data)
##             coef.est coef.se
## (Intercept) -2.7      0.4   
## x            1.2      0.7   
## ---
##   n = 100, k = 2
##   residual deviance = 47.7, null deviance = 50.7 (difference = 3.0)
## 
## stan_glm
##  family:       binomial [logit]
##  formula:      y ~ x
##  observations: 100
##  predictors:   2
## ------
##             Median MAD_SD
## (Intercept) -2.6    0.4  
## x            0.7    0.4  
## 
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
  • The maximum likelihood estimate is again extreme, but less so.

  • Bayes estimate is again pulled toward the prior mean of 0.5, but less so than before.

n = 1000

bayes_sim(1000)
## glm(formula = y ~ x, family = binomial(link = "logit"), data = data)
##             coef.est coef.se
## (Intercept) -2.1      0.1   
## x            1.0      0.2   
## ---
##   n = 1000, k = 2
##   residual deviance = 723.8, null deviance = 757.4 (difference = 33.6)
## 
## stan_glm
##  family:       binomial [logit]
##  formula:      y ~ x
##  observations: 1000
##  predictors:   2
## ------
##             Median MAD_SD
## (Intercept) -2.0    0.1  
## x            0.9    0.2  
## 
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
  • At n=1000, the prior no longer has much if any impact.