Big picture

Learning objectives:

  • Become familiar with some metaprogramming principles and how they relate to each other
  • Review vocabulary associated with metaprogramming
library(rlang)
library(lobstr)

Code is data

  • expression - Captured code (call, symbol, constant, or pairlist)
  • Use rlang::expr() to capture code directly
expr(mean(x, na.rm = TRUE))
#> mean(x, na.rm = TRUE)
  • Use rlang::enexpr() to capture code indirectly
capture_it <- function(x) { # 'automatically quotes first argument'
  enexpr(x)
}
capture_it(a + b + c)
#> a + b + c
  • ‘Captured’ code can be modified (like a list)!
    • First element is the function, next elements are the arguments
f <- expr(f(x = 1, y = 2))
names(f)
#> [1] ""  "x" "y"
ff <- fff <- f   # Create two copies

ff$z <- 3        # Add an argument to one
fff[[2]] <- NULL # Remove an argument from another

f
#> f(x = 1, y = 2)
ff
#> f(x = 1, y = 2, z = 3)
fff
#> f(y = 2)
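
The list-like behaviour described above can be explored a little further. This is a minimal sketch; the names g, h and g2 are made up for illustration and do not come from the original notes.

```r
library(rlang)

g <- expr(f(x = 1, y = 2))

g[[1]]        # first element: the function name, stored as a symbol
#> f
g$y           # arguments can be read by name
#> [1] 2
length(g)     # one function plus two arguments
#> [1] 3

h <- g                # copy, then modify the copy like a list
h[[1]] <- sym("g2")   # swap the function being called
h$x <- 10             # change an existing argument
h
#> g2(x = 10, y = 2)
g                     # the original call is unchanged
#> f(x = 1, y = 2)
```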

More on this next week!

Code is a tree

  • Abstract syntax tree (AST) - Almost every language represents code as a tree
  • Use lobstr::ast() to inspect these code trees
ast(f1(f2(a, b), f3(1)))
#> █─f1 
#> ├─█─f2 
#> │ ├─a 
#> │ └─b 
#> └─█─f3 
#>   └─1
ast(1 + 2 * 3)
#> █─`+` 
#> ├─1 
#> └─█─`*` 
#>   ├─2 
#>   └─3
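
The tree that ast() prints is the same structure you get by indexing a captured call. A small sketch, assuming only rlang::expr() and ordinary [[ subsetting:

```r
library(rlang)

e <- expr(1 + 2 * 3)

e[[1]]        # root of the tree: the function being called
#> `+`
e[[2]]        # first argument, a constant
#> [1] 1
e[[3]]        # second argument is itself a call (a subtree)
#> 2 * 3
e[[3]][[1]]   # root of that subtree
#> `*`
```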

Code can generate code

  • rlang::call2() creates a function call
call2("f", 1, 2, 3)
#> f(1, 2, 3)
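  • Going backwards from the tree, you can use call2() to create the calls
    • Strings such as "a" stay strings in the result; use expr(a) or sym("a") to insert a symbol instead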
call2("f1", call2("f2", "a", "b"), call2("f3", 1))
#> f1(f2("a", "b"), f3(1))
call2("+", 1, call2("*", 2, 3))
#> 1 + 2 * 3
  • !! bang-bang - unquote operator
    • inserts previously defined expressions into the current one
xx <- expr(x + x)
yy <- expr(y + y)
expr(xx / yy)     # Nope!
#> xx/yy
expr(!!xx / !!yy) # Yup!
#> (x + x)/(y + y)
cv <- function(var) {
  var <- enexpr(var)            # Get user's expression
  expr(sd(!!var) / mean(!!var)) # Insert user's expression
}

cv(x)
#> sd(x)/mean(x)
cv(x + y)
#> sd(x + y)/mean(x + y)
  • Avoid paste() for building code
    • Problems with non-syntactic names and precedence among expressions

“You might think this is an esoteric concern, but not worrying about it when generating SQL code in web applications led to SQL injection attacks that have collectively cost billions of dollars.”
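
The precedence problem with paste() mentioned above can be seen in a short sketch. It reuses xx and yy from the unquoting example; rlang::expr_text() and rlang::parse_expr() are used here only to turn code into text and back, and the values in vals are arbitrary.

```r
library(rlang)

xx <- expr(x + x)
yy <- expr(y + y)

# Building code by pasting text loses the grouping of each sub-expression
pasted <- parse_expr(paste(expr_text(xx), "/", expr_text(yy)))
pasted
#> x + x/y + y

# Unquoting inserts each expression as a subtree, so grouping is preserved
unquoted <- expr(!!xx / !!yy)
unquoted
#> (x + x)/(y + y)

# The two expressions give different answers for the same inputs
vals <- list(x = 1, y = 2)
eval(pasted, vals)
#> [1] 3.5
eval(unquoted, vals)
#> [1] 0.5
```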

Evaluation runs code

  • evaluate - run/execute an expression
  • need both expression and environment
  • eval() uses the current environment if none is supplied
  • manual evaluation means you can tweak the environment!
xy <- expr(x + y)

eval(xy, env(x = 1, y = 10))
#> [1] 11
eval(xy, env(x = 2, y = 100))
#> [1] 102
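
A small sketch of the default behaviour: when no environment is supplied, eval() looks names up in the environment it is called from. The function name local_eval() is hypothetical and used only for illustration.

```r
library(rlang)

xy <- expr(x + y)

x <- 10
y <- 100
eval(xy)          # no environment given: uses the x and y defined just above
#> [1] 110

local_eval <- function() {
  x <- 1
  y <- 2
  eval(xy)        # now the current environment is the function's own
}
local_eval()
#> [1] 3
```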

Customizing evaluations with functions

  • Can also bind names to functions in supplied environment
  • Allows overriding function behaviour
  • This is how dplyr (through its dbplyr backend) generates SQL for working with databases

For example…

string_math <- function(x) {
  e <- env(
    caller_env(),
    `+` = function(x, y) paste(x, y),
    `*` = function(x, y) strrep(x, y)
  )

  eval(enexpr(x), e)
}

cohort <- 9
string_math("Hello" + "cohort" + cohort)
#> [1] "Hello cohort 9"
string_math(("dslc" + "is" + "awesome---") * cohort)
#> [1] "dslc is awesome---dslc is awesome---dslc is awesome---dslc is awesome---dslc is awesome---dslc is awesome---dslc is awesome---dslc is awesome---dslc is awesome---"
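
To give a feel for the SQL idea, here is a toy sketch in the same style as string_math(). The helper toy_sql() is hypothetical and is not how dplyr or dbplyr actually work; real translation is far more involved. It only shows that rebinding operators lets ordinary R syntax produce a string of code.

```r
library(rlang)

# Hypothetical helper for illustration only; not part of dplyr or dbplyr
toy_sql <- function(x) {
  e <- env(
    caller_env(),
    `>`  = function(a, b) paste(a, ">", b),
    `==` = function(a, b) paste(a, "=", b),
    `&`  = function(a, b) paste(a, "AND", b)
  )

  # Evaluate the user's expression with the operators above in scope
  eval(enexpr(x), e)
}

toy_sql("age" > 30 & "country" == "'NZ'")
#> [1] "age > 30 AND country = 'NZ'"
```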

Customizing evaluation with data

  • Look for variables inside data frame
  • Data mask - typically a data frame
  • use rlang::eval_tidy() rather than eval()
df <- data.frame(x = 1:5, y = sample(5))
eval_tidy(expr(x + y), df)
#> [1] 2 4 6 9 9
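
A short sketch of how lookup works with a data mask: names are searched for in the data frame first, and anything not found there is looked up in the environment. The names small and offset are made up for this example.

```r
library(rlang)

small <- data.frame(x = 1:3)
offset <- 10

# x comes from the data frame, offset from the surrounding environment
eval_tidy(expr(x + offset), small)
#> [1] 11 12 13

# A variable with the same name as a column does not change the result:
# the column in the data mask takes priority
x <- 100
eval_tidy(expr(x + offset), small)
#> [1] 11 12 13
```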

Catch user input with enexpr()

with2 <- function(df, expr) {
  eval_tidy(enexpr(expr), df)
}

with2(df, x + y)
#> [1] 2 4 6 9 9

But there’s a bug!

  • The expression is evaluated in the environment inside with2(), but it most likely refers to objects in the caller's environment (here, the global environment)
with2 <- function(df, expr) {
  a <- 1000
  eval_tidy(enexpr(expr), df)
}

df <- data.frame(x = 1:3)
a <- 10
with2(df, x + a)
#> [1] 1001 1002 1003
  • Solved with Quosures…

Quosures

  • A quosure bundles an expression with an environment
  • Use enquo() instead of enexpr() (with eval_tidy())
with2 <- function(df, expr) {
  a <- 1000
  eval_tidy(enquo(expr), df)
}

df <- data.frame(x = 1:3)
a <- 10
with2(df, x + a)
#> [1] 11 12 13

“Whenever you use a data mask, you must always use enquo() instead of enexpr().”

This comes back in Chapter 20.
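
The two parts of a quosure can be inspected directly. A minimal sketch, assuming rlang's quo_get_expr() and is_quosure(); the helper names capture() and make_quosure() are hypothetical.

```r
library(rlang)

capture <- function(arg) enquo(arg)

make_quosure <- function() {
  a <- 5            # local variable, only visible inside this function
  capture(a * 2)
}

a <- 1              # a different `a` in the calling environment
quo_a <- make_quosure()

is_quosure(quo_a)
#> [1] TRUE
quo_get_expr(quo_a) # the captured expression
#> a * 2
eval_tidy(quo_a)    # evaluated where it was written, so a is 5, not 1
#> [1] 10
```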

Which environment is bundled?

  • The environment where the expression was written, i.e. the environment that called the function containing enquo() (not that function's own environment)

Here, the global environment

with2 <- function(df, expr) {
  a <- 1000
  eq <- enquo(expr)
  message("with2() Parent/Calling environment: ")
  print(rlang::caller_env())
  message("with2() environment: ")
  print(rlang::current_env())
  message("Quosure details: ")
  print(eq)  # Print the details of the quosure
  eval_tidy(eq, df)
}

a <- 10000
df <- data.frame(x = 1:3)
with2(df, x + a)
#> with2() Parent/Calling environment:
#> <environment: R_GlobalEnv>
#> with2() environment:
#> <environment: 0x0000018dede5ddd0>
#> Quosure details:
#> <quosure>
#> expr: ^x + a
#> env:  global
#> [1] 10001 10002 10003

Here, the fun1() environment

fun1 <- function(df) {
  a <- 10
  message("fun1() Parent/Calling environment: ")
  print(rlang::caller_env())
  message("fun1() environment: ")
  print(rlang::current_env())
  with2(df, x + a)
}

a <- 10000
df <- data.frame(x = 1:3)
fun1(df)
#> fun1() Parent/Calling environment:
#> <environment: R_GlobalEnv>
#> fun1() environment:
#> <environment: 0x0000018df3e748f8>
#> with2() Parent/Calling environment:
#> <environment: 0x0000018df3e748f8>
#> with2() environment:
#> <environment: 0x0000018df3ebc698>
#> Quosure details:
#> <quosure>
#> expr: ^x + a
#> env:  0x0000018df3e748f8
#> [1] 11 12 13
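
The same check can be made without printing addresses, by comparing the quosure's environment with the caller's environment directly. This is a compact sketch; grab() and outer_fun() are hypothetical names.

```r
library(rlang)

grab <- function(arg) enquo(arg)

outer_fun <- function() {
  a <- 1
  quo_here <- grab(a + 1)
  # The quosure carries the environment where `a + 1` was written,
  # which is this function's environment rather than grab()'s
  identical(quo_get_env(quo_here), current_env())
}

outer_fun()
#> [1] TRUE
```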