Chapter 19 - Quasiquotation

Introduction

When used alone, quasiquotation is most useful for programming, particularly for generating code. But when it’s combined with the other two pillars of tidy evaluation (quosures and the data mask), it becomes a powerful tool for data analysis.

library(rlang)
library(purrr)
## 
## Attaching package: 'purrr'
## The following objects are masked from 'package:rlang':
## 
##     %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
##     flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
##     splice
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

19.2 Motivation

Let’s see if Hadley’s motivation works for us!

paste("Good", "morning", "Hadley")
## [1] "Good morning Hadley"
paste("Good", "afternoon", "Alice")
## [1] "Good afternoon Alice"

When using cement() every argument is automatically quoted:

cement <- function(...) {
  args <- ensyms(...)
  paste(purrr::map(args, as_string), collapse = " ")
}

cement(Good, morning, Hadley)
## [1] "Good morning Hadley"
cement(Good, afternoon, Alice)
## [1] "Good afternoon Alice"

The problem comes when we want to use variables:

name <- "Hadley"
time <- "morning"

paste("Good", time, name)
## [1] "Good morning Hadley"
cement(Good, time, name)
## [1] "Good time name"

The tool of choice is the “unquote” operator !!, pronounced “bang-bang”:

cement(Good, !!time, !!name)
## [1] "Good morning Hadley"

paste() evaluates its arguments, so we must quote where needed; cement() quotes its arguments, so we must unquote where needed.

Another motivation

var <- "new_name"
mtcars %>% rename(!!var := mpg) %>% head()

R is very strict about the kind of expressions it accepts on the left-hand side of =, which is why rlang interprets the walrus operator := as an alias of =. You can use it to supply names: a := b is equivalent to a = b. Because its syntax is more flexible, you can also unquote a name on its LHS, as in !!var := mpg above.
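A minimal sketch of the same idea outside of dplyr, using rlang::list2() (which collects its arguments with the same dots handling). The variable names here are only for illustration:

```r
library(rlang)

# With a literal name, `:=` behaves like `=` ...
with_walrus <- list2(x := 1)
with_equals <- list2(x = 1)
identical(with_walrus, with_equals)

# ... but, unlike `=`, its left-hand side can be unquoted.
nm <- "y"
unquoted_name <- list2(!!nm := 2)
names(unquoted_name)
```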

19.2.1 Vocabulary

  • An evaluated argument obeys R’s usual evaluation rules.

  • A quoted argument is captured by the function, and is processed in some custom way.
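If it is unclear whether an argument is quoted or evaluated, one rough check is to run the argument on its own, outside the function. The sketch below assumes no object called cyl exists in the current session:

```r
# `cyl` only exists as a column of mtcars, so evaluating the
# expression on its own fails ...
standalone <- tryCatch(
  eval(quote(cyl == 4)),
  error = function(e) "error"
)
standalone

# ... while subset() quotes the expression and evaluates it inside the data.
four_cyl <- subset(mtcars, cyl == 4)
nrow(four_cyl)
```

If the code fails or behaves differently on its own, the argument is quoted.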

19.2.2 Exercises

  1. For each function in the following base R code, identify which arguments are quoted and which are evaluated.
library(MASS)

mtcars2 <- subset(mtcars, cyl == 4)

with(mtcars2, sum(vs))
sum(mtcars2$am)

rm(mtcars2)

  1. For each function in the following base R code, identify which arguments are quoted and which are evaluated.
library(MASS) # MASS is quoted

mtcars2 <- subset(mtcars, cyl == 4) # mtcars is evaluated, cyl == 4 is quoted
mtcars
cyl

with(mtcars2, sum(vs)) # mtcars2 is evaluated, sum(vs) is quoted
mtcars2
vs

sum(mtcars2$am) # mtcars2$am is evaluated; am is quoted by $
mtcars2$am



rm(mtcars2) # mtcars2 is quoted

  1. For each function in the following tidyverse code, identify which arguments are quoted and which are evaluated.
library(dplyr)
library(ggplot2)

by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(mpg))

ggplot(by_cyl, aes(cyl, mean)) + geom_point()

  1. For each function in the following tidyverse code, identify which arguments are quoted and which are evaluated.
library(dplyr) # dplyr is quoted
dplyr

library(ggplot2) # ggplot2 is quoted

by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(mpg))

# mtcars is evaluated; cyl is quoted by group_by(); mean(mpg) is quoted by summarise()

ggplot(by_cyl, aes(cyl, mean)) + geom_point()
# by_cyl is evaluated; aes() is evaluated, but its arguments cyl and mean are quoted

19.3 Quoting

The first part of quasiquotation is quotation: capturing an expression without evaluating it.

(rlang functions):

Capturing expressions

expr(x + y)
## x + y
expr(1 / 2 / 3)
## 1/2/3

expr() captures its argument exactly as provided, which makes it not so useful inside a function:

f1 <- function(x) expr(x)
f1(a + b + c)
## x

For that purpose, use enexpr():

f2 <- function(x) enexpr(x)
f2(a + b + c)
## a + b + c

To capture multiple arguments in ... use enexprs():

f <- function(...) enexprs(...)
f(x = 1, y = 10 * z)
## $x
## [1] 1
## 
## $y
## 10 * z
exprs(x = x ^ 2, y = y ^ 3, z = z ^ 4)
## $x
## x^2
## 
## $y
## y^3
## 
## $z
## z^4
# shorthand for
# list(x = expr(x ^ 2), y = expr(y ^ 3), z = expr(z ^ 4))

Capturing symbols

ensym() and ensyms() check that the captured expression is either a symbol or a string (which is converted to a symbol):

f <- function(...) ensyms(...)
f(x)
## [[1]]
## x
f("x")
## [[1]]
## x
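Anything that is neither a symbol nor a string is rejected. A small sketch, where the error is caught so the script keeps running and the fallback text is our own, not rlang's message:

```r
library(rlang)

f <- function(...) ensyms(...)

# A call such as x + y is neither a symbol nor a string, so ensyms() errors.
result <- tryCatch(
  f(x + y),
  error = function(e) "not a symbol or string"
)
result
```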

19.3.5 Summary

The base equivalents do not support unquoting.

rlang quasiquoting functions:

       Developer   User
One    expr()      enexpr()
Many   exprs()     enexprs()

base R quoting functions:

       Developer   User
One    quote()     substitute()
Many   alist()     as.list(substitute(...()))
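The base functions in the table can be tried side by side. This is only a sketch; capture_one() and capture_many() are made-up helper names:

```r
library(rlang)

# Developer-facing: capture what you type yourself.
quote(x + y)
alist(x = 1, y = x + 2)

# User-facing: capture what the caller typed.
capture_one <- function(x) substitute(x)
capture_many <- function(...) as.list(substitute(...()))

one <- capture_one(a + b)
many <- capture_many(a, b + c)

# The base functions do not unquote: `!!` stays in the expression as is,
# whereas expr() replaces it with the value.
val <- 1
base_result <- quote(f(!!val))
rlang_result <- expr(f(!!val))
```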

19.3.6 Exercises

  1. How is expr() implemented? Look at its source code.

  1. How is expr() implemented? Look at its source code.
expr

  1. Compare and contrast the following two functions. Can you predict the output before running them?
f1 <- function(x, y) {
  exprs(x = x, y = y)
}
f2 <- function(x, y) {
  enexprs(x = x, y = y)
}
f1(a + b, c + d)
f2(a + b, c + d)

  1. What happens if you try to use enexpr() with an expression (i.e. enexpr(x + y))? What happens if enexpr() is passed a missing argument?

  1. What happens if you try to use enexpr() with an expression (i.e. enexpr(x + y))? What happens if enexpr() is passed a missing argument?
enexpr(x + y)
## Error: `arg` must be a symbol

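enexpr() is meant to be called with the name of a function argument, so anything other than a symbol is an error. The same applies to the call missing_arg(), even though its value is a symbol: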

enexpr(missing_arg())
## Error: `arg` must be a symbol
is.symbol(missing_arg())
## [1] TRUE
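The calls above run enexpr() at the top level, so they never reach the missing-argument case. A minimal sketch, using a hypothetical wrapper capture(), shows what happens when the argument really is missing:

```r
library(rlang)

# enexpr() used as intended: on the name of a function argument.
capture <- function(x) enexpr(x)

# Called without an argument, it returns the missing argument
# instead of throwing an error.
is_missing(capture())
```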

  1. How are exprs(a) and exprs(a = ) different? Think about both the input and the output.

  1. How are exprs(a) and exprs(a = ) different? Think about both the input and the output.
exprs(a)
## [[1]]
## a
exprs(a = )
## $a

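exprs(a)

input: the symbol a, passed as an unnamed argument

output: an unnamed list whose single element is the symbol a

exprs(a = )

input: an argument named a with no value

output: a named list whose element a is the missing argument (the empty symbol)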

  1. What are other differences between exprs() and alist()? Read the documentation for the named arguments of exprs() to find out.

  1. The documentation for substitute() says:

Substitution takes place by examining each component of the parse tree as follows:

  • If it is not a bound symbol in env, it is unchanged.
  • If it is a promise object (i.e., a formal argument to a function) the expression slot of the promise replaces the symbol.
  • If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.

Create examples that illustrate each of the above cases.
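One possible set of examples for the three cases. The function and variable names are made up for illustration, and the last call assumes that passing globalenv() explicitly behaves like calling substitute() at the top level:

```r
# 1. Not a bound symbol in env: `b` is not in the list, so it is unchanged,
#    while `a` is replaced by its value.
unbound <- substitute(a + b, list(a = 1))

# 2. Promise: the expression supplied for the formal argument replaces the symbol.
show_promise <- function(arg) substitute(arg)
promise <- show_promise(u * v)

# 3. Ordinary variable: its value is substituted ...
show_variable <- function() {
  n <- 10
  substitute(n + 1)
}
ordinary <- show_variable()

# ... unless env is the global environment, where the symbol is left alone.
assign("n_global", 5, envir = globalenv())
in_global <- substitute(n_global + 1, env = globalenv())
```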

19.4 Unquoting

Unquoting is one inverse of quoting. It allows you to selectively evaluate code inside expr(), so that expr(!!x) is equivalent to x.

Unquoting one argument

!! works with call objects:

x <- expr(-1)
expr(f(!!x, y))
## f(-1, y)
lobstr::ast(expr(f(!!x, y)))
## █─expr 
## └─█─f 
##   ├─█─`-` 
##   │ └─1 
##   └─y

… as well as for symbols and constants:

a <- sym("y")
b <- 1
expr(f(!!a, !!b))
## f(y, 1)
lobstr::ast(expr(f(!!a, !!b)))
## █─expr 
## └─█─f 
##   ├─y 
##   └─1

You can also use !! with function calls. The function call will be evaluated and the result will be inserted:

mean_rm <- function(var) {
  var <- ensym(var)
  expr(mean(!!var, na.rm = TRUE))
}
expr(!!mean_rm(x) + !!mean_rm(y))
## mean(x, na.rm = TRUE) + mean(y, na.rm = TRUE)

!! preserves operator precedence:

x1 <- expr(x + 1)
x2 <- expr(x + 2)

expr(!!x1 / !!x2)
## (x + 1)/(x + 2)
lobstr::ast(expr(!!x1 / !!x2))
## █─expr 
## └─█─`/` 
##   ├─█─`+` 
##   │ ├─x 
##   │ └─1 
##   └─█─`+` 
##     ├─x 
##     └─2
lobstr::ast(expr(x+1/x+2))
## █─expr 
## └─█─`+` 
##   ├─█─`+` 
##   │ ├─x 
##   │ └─█─`/` 
##   │   ├─1 
##   │   └─x 
##   └─2

Unquoting a function

f <- expr(foo)
expr((!!f)(x, y))
## foo(x, y)
lobstr::ast(expr((!!f)(x, y)))
## █─expr 
## └─█─foo 
##   ├─x 
##   └─y

Because of the large number of parentheses involved, it can be clearer to use rlang::call2():

f <- expr(foo)
call2(f, expr(x), expr(y))
## foo(x, y)

Unquoting a missing argument

We talked about missing arguments in Section 18.6.2.

arg <- missing_arg()
expr(foo(!!arg, !!arg))
## Error in enexpr(expr): argument "arg" is missing, with no default

Here we can use the helper function maybe_missing():

expr(foo(!!maybe_missing(arg), !!maybe_missing(arg)))
## foo(, )

Unquoting in special forms

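Special forms restrict what can appear in their arguments, so unquoting inside them can fail to parse, for example with $. The workaround is to use the prefix form: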

expr(df$!!x)
## Error: <text>:1:9: unexpected '!'
## 1: expr(df$!
##             ^
x <- expr(x)
expr(`$`(df, !!x))
## df$x

Unquoting many arguments

!!! (called unquote-splice, pronounced bang-bang-bang) takes a list of expressions and inserts them at the location of the !!!:

xs <- exprs(1, a, -b)
expr(f(!!!xs, y))
## f(1, a, -b, y)
lobstr::ast(f(!!!xs, y))
## █─f 
## ├─1 
## ├─a 
## ├─█─`-` 
## │ └─b 
## └─y
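!!! also keeps the names of the list it splices, so named elements become named arguments. A short sketch with an illustrative list:

```r
library(rlang)

named <- list(a = 1, b = quote(z))

# Each element is inserted as an argument, using its name where present.
spliced <- expr(f(!!!named, y))
spliced
```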

The polite fiction of !!

When used outside of a quasiquoting function, !! and !!! are just repeated applications of !:

!!TRUE
## [1] TRUE
!!!TRUE
## [1] FALSE
bang <- 1000
!!bang
## [1] TRUE
!!!bang
## [1] FALSE

That means you can get argument type errors:

x <- quote(variable)
!!x
## Error in !x: invalid argument type
!x
## Error in !x: invalid argument type

... or silently incorrect results:

df <- data.frame(x = 1:5)
boom <- 100
with(df, x + !!boom)
## [1] 2 3 4 5 6
!!boom
## [1] TRUE
as.numeric(!!boom)
## [1] 1

Non-standard ASTs, or some more things to confuse us:

For example, if you inline more complex objects, their attributes are not printed. This can lead to confusing output:

x1 <- expr(class(!!data.frame(x = 10)))
x1
## class(list(x = 10))
eval(x1)
## [1] "data.frame"

Two tools to reduce confusion:

expr_print(x1)
## class(<df[,1]>)
lobstr::ast(!!x1)
## █─class 
## └─<inline data.frame>

Another example: Inlining integer sequences:

x2 <- expr(f(!!c(1L, 2L, 3L, 4L, 5L)))
x2
## f(1:5)
expr_print(x2)
## f(<int: 1L, 2L, 3L, 4L, 5L>)
lobstr::ast(!!x2)
## █─f 
## └─<inline integer>

19.4.8 Exercises

  1. Given the following components:
xy <- expr(x + y)
xz <- expr(x + z)
yz <- expr(y + z)
abc <- exprs(a, b, c)

Use quasiquotation to construct the following calls:

(x + y) / (y + z)
-(x + z) ^ (y + z)
(x + y) + (y + z) - (x + y)
atan2(x + y, y + z)
sum(x + y, x + y, y + z)
sum(a, b, c)
mean(c(a, b, c), na.rm = TRUE)
foo(a = x + y, b = y + z)

xy <- expr(x + y)
xz <- expr(x + z)
yz <- expr(y + z)
abc <- exprs(a, b, c)

(x + y) / (y + z)

expr(!!xy/!!yz)
## (x + y)/(y + z)

-(x + z) ^ (y + z)

expr(-(!!xz)^(!!yz)) 
## -(x + z)^(y + z)

(x + y) + (y + z) - (x + y) ?

expr(!!xy + !!yz-!!xy)
## x + y + (y + z) - (x + y)

atan2(x + y, y + z)

expr(atan2(!!xy, !!yz))
## atan2(x + y, y + z)

sum(x + y, x + y, y + z)

expr(sum(!!xy, !!xy, !!yz))
## sum(x + y, x + y, y + z)

mean(c(a, b, c), na.rm = TRUE) !!!

expr(mean(c(!!!abc), na.rm=TRUE))
## mean(c(a, b, c), na.rm = TRUE)

foo(a = x + y, b = y + z)

expr(foo(a=!!xy, b=!!yz))
## foo(a = x + y, b = y + z)
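The answers above skip sum(a, b, c) from the list of calls. One way to build it, following the same pattern as the mean() example, is to splice abc directly:

```r
library(rlang)

abc <- exprs(a, b, c)

# Splice the three symbols in as separate arguments.
sum_call <- expr(sum(!!!abc))
sum_call
```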

  1. The following two calls print the same, but are actually different:
(a <- expr(mean(1:10)))
#> mean(1:10)
(b <- expr(mean(!!(1:10))))
#> mean(1:10)
identical(a, b)
#> [1] FALSE

What’s the difference? Which one is more natural?

(a <- expr(mean(1:10)))
## mean(1:10)
(b <- expr(mean(!!(1:10))))
## mean(1:10)
identical(a, b)
## [1] FALSE
expr_print(a)
## mean(1:10)
lobstr::ast(expr(mean(1:10)))
## █─expr 
## └─█─mean 
##   └─█─`:` 
##     ├─1 
##     └─10
expr_print(b)
## mean(<int: 1L, 2L, 3L, 4L, 5L, ...>)
lobstr::ast(expr(mean(!!(1:10))))
## █─expr 
## └─█─mean 
##   └─<inline integer>

To me personally the first one feels more natural, as it prints the expression that I would normally write.

Followup

Tyler Grant Smith provided an answer to our question for exercise 19.4.8.-1.

Given the following components:

xy <- expr(x + y)
xz <- expr(x + z)
yz <- expr(y + z)

use quasiquotation to construct the following calls:

(x + y) + (y + z) - (x + y)

expr(((!!xy)) + !!yz-!!xy)
## (x + y) + (y + z) - (x + y)

19.5 Non-quoting

There is one quasiquotation couple in base R:

bquote() for quoting and .() for “unquoting”.

xyz <- bquote((x + y + z))
bquote(-.(xyz) / 2)
## -(x + y + z)/2

But bquote() is rarely used by any other function written in R.

Base functions that quote an argument use some other technique to allow indirect specification. Base R approaches selectively turn quoting off, rather than using unquoting, so I call them non-quoting techniques.

There are four basic forms of non-quoting used in base R functions:

  1. A pair of quoting/non-quoting functions.

quoting $ - non-quoting [[]]:

x <- list(var = 1, y = 2)
var <- "y"

x$var # quoted
## [1] 1
x[[var]] # takes variable as a string
## [1] 2

Other examples: <-/assign() and ::/getExportedValue()

  1. A pair of quoting/non-quoting arguments

rm() allows you to provide bare variable names in ..., or a character vector of variable names in list:

x <- 1
rm(x)

y <- 2
vars <- c("y", "vars")
rm(list = vars)

Other examples: data() and save()

  1. An argument that controls whether a different argument is quoting or non-quoting.
library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
pkg <- "MASS"
library(pkg, character.only = TRUE)

Other examples: demo(), detach(), example(), and require()

  1. Quoting if evaluation fails
# var still holds "y" from above, so this looks up help for y
help(var)
## No documentation for 'y' in specified packages and libraries:
## you could try '??y'
var <- "mean"
# Shows help for mean
help(var)

var <- 10
# Shows help for var
help(var)

Other examples: ls(), page(), and match.fun()

19.6 ... (dot-dot-dot)

Motivation!

  • What do we do if we want to rbind() a list of data frames together?
dfs <- list(
  a = data.frame(x = 1, y = 2),
  b = data.frame(x = 3, y = 4)
)

We could do: rbind(dfs$a, dfs$b)

or:

dplyr::bind_rows(!!!dfs)

with base R, we can use the do.call() function:

do.call("rbind", dfs)
  • What do we do, if we want to supply an argument name indirectly?
var <- "x"
val <- c(4, 3, 9)

We could do: setNames(data.frame(val), var)

or:

tibble::tibble(!!var := val)

with base R:

args <- list(val)
names(args) <- var

do.call("data.frame", args)

Functions that support these tools without quoting their arguments have tidy dots. To gain tidy dots behaviour in your own function, all you need to do is use list2().

See ?list2, where this feature is documented as “dynamic dots”.

The ... syntax of base R allows you to:

  • Forward arguments from function to function, matching them along the way to function parameters.
  • Collect arguments inside data structures, e.g. with c() or list().

Dynamic dots offer a few additional features:

  1. You can splice arguments saved in a list with the big bang operator !!!.
  2. You can unquote names by using the bang bang operator !! on the left-hand side of :=.
  3. Trailing commas are ignored, making it easier to copy and paste lines of arguments.
tibble::tibble(
  y = 1:5,
  z = 3:-1,
  x = 5:1,
)

Example (how you could use list2())

set_attr <- function(.x, ...) {
  attr <- rlang::list2(...)
  attributes(.x) <- attr
  .x
}

attrs <- list(x = 1, y = 2)
attr_name <- "z"

1:10 %>%
  set_attr(w = 0, !!!attrs, !!attr_name := 3) %>% 
  str()
##  int [1:10] 1 2 3 4 5 6 7 8 9 10
##  - attr(*, "w")= num 0
##  - attr(*, "x")= num 1
##  - attr(*, "y")= num 2
##  - attr(*, "z")= num 3

exec()

# Directly
exec("mean", x = 1:10, na.rm = TRUE, trim = 0.1)
## [1] 5.5
# Indirectly
args <- list(x = 1:10, na.rm = TRUE, trim = 0.1)
exec("mean", !!!args)
## [1] 5.5
# Mixed
params <- list(na.rm = TRUE, trim = 0.1)
exec("mean", x = 1:10, !!!params)
## [1] 5.5
mean(!!!args)
## Error in !args: invalid argument type

Supply argument names indirectly:

arg_name <- "na.rm"
arg_val <- TRUE
exec("mean", 1:10, !!arg_name := arg_val)
## [1] 5.5

Call multiple functions with the same arguments:

x <- c(runif(10), NA)
funs <- c("mean", "median", "sd")

purrr::map_dbl(funs, exec, x, na.rm = TRUE)
## [1] 0.5551995 0.5679720 0.2362938

Exercises

  1. One way to implement exec() is shown below. Describe how it works. What are the key ideas?
exec <- function(f, ..., .env = caller_env()) {
  args <- list2(...)
  do.call(f, args, envir = .env)
}

From the Advanced R solutions:

exec() takes a function (f), its arguments (...) and an environment (.env) as input. This allows it to construct a call from f and ... and evaluate this call in the supplied environment. As the ... argument is handled via list2(), exec() supports tidy dots (quasiquotation), which means that arguments and names (on the left-hand side of :=) can be unquoted via !! and !!!.

  1. Carefully read the source code for interaction(), expand.grid(), and par(). Compare and contrast the techniques they use for switching between dots and list behaviour.

  1. Explain the problem with this definition of set_attr()
set_attr <- function(x, ...) {
  attr <- rlang::list2(...)
  attributes(x) <- attr
  x
}
set_attr(1:10, x = 10)
## Error in attributes(x) <- attr: attributes must be named
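The error comes from argument matching: x = 10 is matched to the parameter x, so 1:10 ends up unnamed in the dots. A sketch with a hypothetical helper, show_matching(), that has the same signature makes this visible:

```r
library(rlang)

# Same signature as the problematic set_attr(), but it only reports
# where each argument ended up.
show_matching <- function(x, ...) {
  list(x = x, dots = list2(...))
}

matched <- show_matching(1:10, x = 10)
matched$x     # the intended attribute value, not the vector
matched$dots  # the vector, captured without a name
```

Naming the first parameter .x, as in the earlier definition of set_attr(), avoids the clash for an attribute called x.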

19.7 Case studies

lobstr::ast()

z <- expr(foo(x, y))
lobstr::ast(z)
## z
lobstr::ast(!!z)
## █─foo 
## ├─x 
## └─y

Map-reduce to generate code

As an example, we generate the expression of a linear model with the following coefficients:

intercept <- 10
coefs <- c(x1 = 5, x2 = -4)
coef_sym <- syms(names(coefs))
coef_sym
## [[1]]
## x1
## 
## [[2]]
## x2
summands <- map2(coef_sym, coefs, ~ expr((!!.x * !!.y)))
summands
## [[1]]
## (x1 * 5)
## 
## [[2]]
## (x2 * -4)
summands <- c(intercept, summands)
summands
## [[1]]
## [1] 10
## 
## [[2]]
## (x1 * 5)
## 
## [[3]]
## (x2 * -4)
eq <- reduce(summands, ~ expr(!!.x + !!.y))
eq
## 10 + (x1 * 5) + (x2 * -4)
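To check that the generated expression is ordinary R code, it can be evaluated with values for x1 and x2. The sketch repeats the construction so it runs on its own; the input values are arbitrary:

```r
library(rlang)
library(purrr)

intercept <- 10
coefs <- c(x1 = 5, x2 = -4)

# Build one summand per coefficient, then fold them into a single sum.
summands <- map2(syms(names(coefs)), coefs, ~ expr((!!.x * !!.y)))
eq <- reduce(c(intercept, summands), ~ expr(!!.x + !!.y))

# Evaluate with x1 = 1 and x2 = 2: 10 + (1 * 5) + (2 * -4)
result <- eval(eq, list(x1 = 1, x2 = 2))
result
```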

Creating functions

new_function(
  exprs(x = , y = ), 
  expr({x + y})
)
## function (x, y) 
## {
##     x + y
## }

A function factory that raises a number to a given power:

power <- function(exponent) {
  new_function(
    exprs(x = ), 
    expr({
      x ^ !!exponent
    }), 
    caller_env()
  )
}
power(0.5)
## function (x) 
## {
##     x^0.5
## }
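A quick usage sketch: the generated function can be called like any other, and its body contains the inlined value, not the name exponent. The name square_root is only illustrative:

```r
library(rlang)

power <- function(exponent) {
  new_function(
    exprs(x = ),
    expr({
      x ^ !!exponent
    }),
    caller_env()
  )
}

square_root <- power(0.5)
square_root(16)

# The value 0.5 was inlined, so the body no longer refers to `exponent`.
all.names(body(square_root))
```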

19.7.5 Exercises

  1. In the linear-model example, we could replace the expr() in reduce(summands, ~ expr(!!.x + !!.y)) with call2(): reduce(summands, call2, "+"). Compare and contrast the two approaches. Which do you think is easier to read?
intercept <- 10
coefs <- c(x1 = 5, x2 = -4)
coef_sym <- syms(names(coefs))
summands <- map2(coef_sym, coefs, ~ expr((!!.x * !!.y)))
summands <- c(intercept, summands)


eq <- reduce(summands, ~ expr(!!.x + !!.y))

# call2() takes the function as its first argument, so "+" is passed as .fn
eq <- reduce(summands, call2, .fn = "+")

  1. Re-implement the Box-Cox transform defined below using unquoting and new_function():
bc <- function(lambda) {
  if (lambda == 0) {
    function(x) log(x)
  } else {
    function(x) (x ^ lambda - 1) / lambda
  }
}
bc <- function(lambda) {
  if (lambda == 0) {
    new_function(exprs(x = ), expr(log(x)))
  } else {
    # Unquote lambda so that its value is inlined into the function body
    new_function(exprs(x = ), expr((x ^ (!!lambda) - 1) / !!lambda))
  }
}
bc(0)
## function (x) 
## log(x)
## <environment: 0x7feb1ccd0350>
bc(10)
## function (x) 
## (x^10 - 1)/10
## <environment: 0x7feb1cc1fbe8>
  1. Re-implement the simple compose() defined below using quasiquotation and new_function():
compose <- function(f, g) {
  function(...) f(g(...))
}
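compose <- function(f, g) {
  # Capture the supplied functions as expressions ...
  f <- enexpr(f)
  g <- enexpr(g)
  # ... and unquote them so they are inlined into the body
  new_function(exprs(... = ), expr((!!f)((!!g)(...))))
}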

Anne Hoffrichter