Case study 1: Gibbs sampler

The Gibbs sampler is a method for estimating parameters expectations. It is a MCMC algorithm that has been adapted to sample from multidimensional target distributions. Gibbs sampling generates a Markov chain of samples, each of which is correlated with nearby samples.

Example blogged by Dirk Eddelbuettel, the R and C++ code is very similar but runs about 20 times faster.

“Darren Wilkinson stresses the rather pragmatic aspects of how fast and/or easy it is to write the code, rather than just the mere runtime.

R code:

gibbs_r <- function(N, thin) {
  mat <- matrix(nrow = N, ncol = 2)
  x <- y <- 0

  for (i in 1:N) {
    for (j in 1:thin) {
      x <- rgamma(1, 3, y * y + 4)
      y <- rnorm(1, 1 / (x + 1), 1 / sqrt(2 * (x + 1)))
    }
    mat[i, ] <- c(x, y)
  }
  mat
}

Actions to convert R to C++:

  • Add type declarations to all variables
  • Use ( instead of [ to index into the matrix
  • Subscript the results of rgamma and rnorm to convert from a vector into a scalar.
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix gibbs_cpp(int N, int thin) {
  NumericMatrix mat(N, 2);
  double x = 0, y = 0;

  for(int i = 0; i < N; i++) {
    for(int j = 0; j < thin; j++) {
      x = rgamma(1, 3, 1 / (y * y + 4))[0];
      y = rnorm(1, 1 / (x + 1), 1 / sqrt(2 * (x + 1)))[0];
    }
    mat(i, 0) = x;
    mat(i, 1) = y;
  }

  return(mat);
}

Checking who’s best:

bench::mark(
  gibbs_r(100, 10),
  gibbs_cpp(100, 10),
  check = FALSE
)
#> # A tibble: 2 × 6
#>   expression              min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>         <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 gibbs_r(100, 10)     2.54ms    2.6ms      383.    98.4KB     6.31
#> 2 gibbs_cpp(100, 10) 246.61µs    264µs     3752.    1.61KB    10.5