Case study 1: Gibbs sampler
The Gibbs sampler is a method for estimating parameters expectations. It is a MCMC algorithm that has been adapted to sample from multidimensional target distributions. Gibbs sampling generates a Markov chain of samples, each of which is correlated with nearby samples.
Example blogged by Dirk Eddelbuettel, the R and C++ code is very similar but runs about 20 times faster.
“Darren Wilkinson stresses the rather pragmatic aspects of how fast and/or easy it is to write the code, rather than just the mere runtime.
R code:
gibbs_r <- function(N, thin) {
mat <- matrix(nrow = N, ncol = 2)
x <- y <- 0
for (i in 1:N) {
for (j in 1:thin) {
x <- rgamma(1, 3, y * y + 4)
y <- rnorm(1, 1 / (x + 1), 1 / sqrt(2 * (x + 1)))
}
mat[i, ] <- c(x, y)
}
mat
}
Actions to convert R to C++:
- Add type declarations to all variables
- Use
(
instead of[
to index into the matrix - Subscript the results of
rgamma
andrnorm
to convert from a vector into a scalar.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix gibbs_cpp(int N, int thin) {
NumericMatrix mat(N, 2);
double x = 0, y = 0;
for(int i = 0; i < N; i++) {
for(int j = 0; j < thin; j++) {
x = rgamma(1, 3, 1 / (y * y + 4))[0];
y = rnorm(1, 1 / (x + 1), 1 / sqrt(2 * (x + 1)))[0];
}
mat(i, 0) = x;
mat(i, 1) = y;
}
return(mat);
}
Checking who’s best:
bench::mark(
gibbs_r(100, 10),
gibbs_cpp(100, 10),
check = FALSE
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 gibbs_r(100, 10) 2.54ms 2.6ms 383. 98.4KB 6.31
#> 2 gibbs_cpp(100, 10) 246.61µs 264µs 3752. 1.61KB 10.5