When used alone, quasiquotation is most useful for programming, particularly for generating code. But when it’s combined with the other techniques, tidy evaluation becomes a powerful tool for data analysis.
library(rlang)
library(purrr)
##
## Attaching package: 'purrr'
## The following objects are masked from 'package:rlang':
##
## %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
## flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
## splice
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Let’s see if Hadley’s motivation works for us!
paste("Good", "morning", "Hadley")
## [1] "Good morning Hadley"
paste("Good", "afternoon", "Alice")
## [1] "Good afternoon Alice"
When using cement(), every argument is automatically quoted:
cement <- function(...) {
args <- ensyms(...)
paste(purrr::map(args, as_string), collapse = " ")
}
cement(Good, morning, Hadley)
## [1] "Good morning Hadley"
cement(Good, afternoon, Alice)
## [1] "Good afternoon Alice"
The problem comes when we want to use variables:
name <- "Hadley"
time <- "morning"
paste("Good", time, name)
## [1] "Good morning Hadley"
cement(Good, time, name)
## [1] "Good time name"
The tool of choice is “unquote” !!, pronounced “bang-bang”:
cement(Good, !!time, !!name)
## [1] "Good morning Hadley"
paste() evaluates its arguments, so we must quote where needed; cement() quotes its arguments, so we must unquote where needed.
Unfortunately, R is very strict about the kind of expressions supported on the LHS of =. This is why rlang interprets the walrus operator := as an alias of =. You can use it to supply names, e.g. a := b is equivalent to a = b. Since its syntax is more flexible, you can also unquote names on its LHS:
var <- "new_name"
mtcars %>% rename(!!var := mpg) %>% head()
An evaluated argument obeys R’s usual evaluation rules.
A quoted argument is captured by the function, and is processed in some custom way.
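One rough way to tell the two apart is to try evaluating the argument on its own, outside the function: if that fails or gives a different result, the function must be quoting it. The sketch below assumes that no variable named cyl exists in the global environment; cyl_alone is a name made up for this illustration.

```r
# mtcars exists on its own, so subset() can evaluate it in the usual way
exists("mtcars")

# cyl only exists as a column of mtcars; evaluating it alone fails,
# which tells us that subset() must be quoting its second argument
cyl_alone <- tryCatch(cyl, error = function(e) conditionMessage(e))
cyl_alone

# Inside subset(), the same expression works because it is quoted
# and then evaluated with the columns of mtcars in scope
head(subset(mtcars, cyl == 4))
```
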
library(MASS)
mtcars2 <- subset(mtcars, cyl == 4)
with(mtcars2, sum(vs))
sum(mtcars2$am)
rm(mtcars2)
library(MASS) # MASS is quoted
mtcars2 <- subset(mtcars, cyl == 4) # mtcars is evaluated, cyl is quoted
mtcars
cyl
with(mtcars2, sum(vs)) # mtcars2 is evaluated, vs is quoted
mtcars2
vs
sum(mtcars2$am) # mtcars2$am is evaluated; within it, am is quoted by $
mtcars2$am
rm(mtcars2) # mtcars2 is quoted (rm() captures the bare names in ...)
library(dplyr)
library(ggplot2)
by_cyl <- mtcars %>%
group_by(cyl) %>%
summarise(mean = mean(mpg))
ggplot(by_cyl, aes(cyl, mean)) + geom_point()
library(dplyr) # dplyr is quoted
dplyr
library(ggplot2) # ggplot2 is quoted
by_cyl <- mtcars %>%
group_by(cyl) %>%
summarise(mean = mean(mpg))
# mtcars is evaluated, cyl is quoted, mpg is quoted
ggplot(by_cyl, aes(cyl, mean)) + geom_point()
# by_cyl is evaluated, cyl & mean are quoted
The first part of quasiquotation is quotation: capturing an expression without evaluating it (rlang functions):
Capturing expressions
expr(x + y)
## x + y
expr(1 / 2 / 3)
## 1/2/3
expr() captures its argument exactly as provided. That makes it not so useful inside of a function:
f1 <- function(x) expr(x)
f1(a + b + c)
## x
For that purpose you can use enexpr():
f2 <- function(x) enexpr(x)
f2(a + b + c)
## a + b + c
To capture multiple arguments in ..., use enexprs():
f <- function(...) enexprs(...)
f(x = 1, y = 10 * z)
## $x
## [1] 1
##
## $y
## 10 * z
exprs(x = x ^ 2, y = y ^ 3, z = z ^ 4)
## $x
## x^2
##
## $y
## y^3
##
## $z
## z^4
# shorthand for
# list(x = expr(x ^ 2), y = expr(y ^ 3), z = expr(z ^ 4))
Capturing symbols
ensym() and ensyms() check that the captured expression is either a symbol or a string (which is converted to a symbol):
f <- function(...) ensyms(...)
f(x)
## [[1]]
## x
f("x")
## [[1]]
## x
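To see the check in action, here is a small sketch of what happens when ensyms() is handed something that is neither a symbol nor a string. The wrapper f mirrors the one above; res is a name introduced only for this illustration, and the exact error message is not shown because it differs between rlang versions.

```r
library(rlang)

f <- function(...) ensyms(...)

# A call such as x + y is neither a symbol nor a string,
# so ensyms() signals an error instead of returning it
res <- tryCatch(f(x + y), error = function(e) "error")
res
```
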
The base equivalents do not support unquoting.
rlang quasiquoting functions:
| | Developer | User |
|---|---|---|
| One | expr() | enexpr() |
| Many | exprs() | enexprs() |
base R quoting functions:
| | Developer | User |
|---|---|---|
| One | quote() | substitute() |
| Many | alist() | as.list(substitute(...())) |
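The four base R entries in the table can be tried out as follows. This is only a sketch: f_sub, f_dots and the result names are made up for the illustration, and none of these functions support unquoting.

```r
# Developer, one expression
quote(x + y)

# User, one expression: capture what the caller typed
f_sub <- function(x) substitute(x)
one <- f_sub(a + b + c)
one

# Developer, many expressions
many_dev <- alist(x = 1, y = 10 * z)
many_dev

# User, many expressions: capture everything passed in ...
f_dots <- function(...) as.list(substitute(...()))
many_user <- f_dots(x = 1, y = 10 * z)
many_user
```
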
How is expr() implemented? Look at its source code.
expr
f1 <- function(x, y) {
exprs(x = x, y = y)
}
f2 <- function(x, y) {
enexprs(x = x, y = y)
}
f1(a + b, c + d)
f2(a + b, c + d)
What happens if you try to use enexpr() with an expression (i.e. enexpr(x + y))? What happens if enexpr() is passed a missing argument?
enexpr(x + y)
## Error: `arg` must be a symbol
enexpr(missing_arg())
## Error: `arg` must be a symbol
is.symbol(missing_arg())
## [1] TRUE
How are exprs(a) and exprs(a = ) different? Think about both the input and the output.
exprs(a)
## [[1]]
## a
exprs(a = )
## $a
exprs(a)
input: symbol
output: list entry value
exprs(a = )
input: name of argument
output: list entry name
What are other differences between exprs() and alist()? Read the documentation for the named arguments of exprs() to find out.
The documentation for substitute() says:
Substitution takes place by examining each component of the parse tree as follows:
- If it is not a bound symbol in env, it is unchanged.
- If it is a promise object (i.e., a formal argument to a function) the expression slot of the promise replaces the symbol.
- If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.
Create examples that illustrate each of the above cases.
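One possible set of examples for the three cases is sketched below. The helper functions f_promise and f_local and the result names are hypothetical, introduced only for this illustration.

```r
# Case 1: y is not bound in env (here a list), so it is left unchanged;
# x is bound, so its value is substituted
unbound <- substitute(x + y, list(x = 1))
unbound

# Case 2: arg is a promise (a formal argument), so the expression
# supplied by the caller replaces the symbol
f_promise <- function(arg) substitute(arg)
promise <- f_promise(a + b)
promise

# Case 3: z is an ordinary variable in the function's environment,
# so its value is substituted
f_local <- function() {
  z <- 10
  substitute(z * 2)
}
ordinary <- f_local()
ordinary

# When run at top level, env is .GlobalEnv and the symbol stays as it is
z <- 10
substitute(z * 2)
```
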
Unquoting is one inverse of quoting. It allows you to selectively evaluate code inside expr(), so that expr(!!x) is equivalent to x.
Unquoting one argument
!! works with call objects:
x <- expr(-1)
expr(f(!!x, y))
## f(-1, y)
lobstr::ast(expr(f(!!x, y)))
## █─expr
## └─█─f
## ├─█─`-`
## │ └─1
## └─y
… as well as for symbols and constants:
a <- sym("y")
b <- 1
expr(f(!!a, !!b))
## f(y, 1)
lobstr::ast(expr(f(!!a, !!b)))
## █─expr
## └─█─f
## ├─y
## └─1
You can also use !! with function calls. The function call will be evaluated and the result will be inserted:
mean_rm <- function(var) {
var <- ensym(var)
expr(mean(!!var, na.rm = TRUE))
}
expr(!!mean_rm(x) + !!mean_rm(y))
## mean(x, na.rm = TRUE) + mean(y, na.rm = TRUE)
!! preserves operator precedence:
x1 <- expr(x + 1)
x2 <- expr(x + 2)
expr(!!x1 / !!x2)
## (x + 1)/(x + 2)
lobstr::ast(expr(!!x1 / !!x2))
## █─expr
## └─█─`/`
## ├─█─`+`
## │ ├─x
## │ └─1
## └─█─`+`
## ├─x
## └─2
lobstr::ast(expr(x+1/x+2))
## █─expr
## └─█─`+`
## ├─█─`+`
## │ ├─x
## │ └─█─`/`
## │ ├─1
## │ └─x
## └─2
Unquoting a function
f <- expr(foo)
expr((!!f)(x, y))
## foo(x, y)
lobstr::ast(expr((!!f)(x, y)))
## █─expr
## └─█─foo
## ├─x
## └─y
Because of the large number of parentheses involved, it can be clearer to use rlang::call2():
f <- expr(foo)
call2(f, expr(x), expr(y))
## foo(x, y)
Unquoting a missing argument
We talked about missing arguments in Section 18.6.2.
arg <- missing_arg()
expr(foo(!!arg, !!arg))
## Error in enexpr(expr): argument "arg" is missing, with no default
Here we can use the helper function maybe_missing():
expr(foo(!!maybe_missing(arg), !!maybe_missing(arg)))
## foo(, )
Unquoting in special forms
This can be problematic, for example with $:
expr(df$!!x)
## Error: <text>:1:9: unexpected '!'
## 1: expr(df$!
## ^
# Use the prefix form instead:
x <- expr(x)
expr(`$`(df, !!x))
## df$x
Unquoting many arguments
!!! (called unquote-splice, pronounced bang-bang-bang) takes a list of expressions and inserts them at the location of the !!!:
xs <- exprs(1, a, -b)
expr(f(!!!xs, y))
## f(1, a, -b, y)
lobstr::ast(f(!!!xs, y))
## █─f
## ├─1
## ├─a
## ├─█─`-`
## │ └─b
## └─y
The polite fiction of !!
When used outside of an expression context, !! and !!! actually just work as repeated application of !:
!!TRUE
## [1] TRUE
!!!TRUE
## [1] FALSE
bang <- 1000
!!bang
## [1] TRUE
!!!bang
## [1] FALSE
That means you can get argument type errors:
x <- quote(variable)
!!x
## Error in !x: invalid argument type
!x
## Error in !x: invalid argument type
... or silently incorrect results:
df <- data.frame(x = 1:5)
boom <- 100
with(df, x + !!boom)
## [1] 2 3 4 5 6
!!boom
## [1] TRUE
as.numeric(!!boom)
## [1] 1
Non-standard ASTs, or some more things to confuse us:
For example, if you inline more complex objects, their attributes are not printed. This can lead to confusing output:
x1 <- expr(class(!!data.frame(x = 10)))
x1
## class(list(x = 10))
eval(x1)
## [1] "data.frame"
Two tools to reduce confusion:
expr_print(x1)
## class(<df[,1]>)
lobstr::ast(!!x1)
## █─class
## └─<inline data.frame>
Another example: Inlining integer sequences:
x2 <- expr(f(!!c(1L, 2L, 3L, 4L, 5L)))
x2
## f(1:5)
expr_print(x2)
## f(<int: 1L, 2L, 3L, 4L, 5L>)
lobstr::ast(!!x2)
## █─f
## └─<inline integer>
xy <- expr(x + y)
xz <- expr(x + z)
yz <- expr(y + z)
abc <- exprs(a, b, c)
Use quasiquotation to construct the following calls:
(x + y) / (y + z)
-(x + z) ^ (y + z)
(x + y) + (y + z) - (x + y)
atan2(x + y, y + z)
sum(x + y, x + y, y + z)
sum(a, b, c)
mean(c(a, b, c), na.rm = TRUE)
foo(a = x + y, b = y + z)
xy <- expr(x + y)
xz <- expr(x + z)
yz <- expr(y + z)
abc <- exprs(a, b, c)
(x + y) / (y + z)
expr(!!xy/!!yz)
## (x + y)/(y + z)
-(x + z) ^ (y + z)
expr(-(!!xz)^(!!yz))
## -(x + z)^(y + z)
(x + y) + (y + z) - (x + y) ?
expr(!!xy + !!yz-!!xy)
## x + y + (y + z) - (x + y)
atan2(x + y, y + z)
expr(atan2(!!xy, !!yz))
## atan2(x + y, y + z)
sum(x + y, x + y, y + z)
expr(sum(!!xy, !!xy, !!yz))
## sum(x + y, x + y, y + z)
mean(c(a, b, c), na.rm = TRUE) !!!
expr(mean(c(!!!abc), na.rm=TRUE))
## mean(c(a, b, c), na.rm = TRUE)
foo(a = x + y, b = y + z)
expr(foo(a=!!xy, b=!!yz))
## foo(a = x + y, b = y + z)
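The list of calls also asks for sum(a, b, c), which the answers above skip. One way to build it, assuming abc is defined as in the exercise (sum_call is a name made up here), is to splice the list with !!!:

```r
library(rlang)

abc <- exprs(a, b, c)

# !!! inserts each element of abc as a separate argument
sum_call <- expr(sum(!!!abc))
sum_call
```
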
(a <- expr(mean(1:10)))
#> mean(1:10)
(b <- expr(mean(!!(1:10))))
#> mean(1:10)
identical(a, b)
#> [1] FALSE
What’s the difference? Which one is more natural?
(a <- expr(mean(1:10)))
## mean(1:10)
(b <- expr(mean(!!(1:10))))
## mean(1:10)
identical(a, b)
## [1] FALSE
expr_print(a)
## mean(1:10)
lobstr::ast(expr(mean(1:10)))
## █─expr
## └─█─mean
## └─█─`:`
## ├─1
## └─10
expr_print(b)
## mean(<int: 1L, 2L, 3L, 4L, 5L, ...>)
lobstr::ast(expr(mean(!!(1:10))))
## █─expr
## └─█─mean
## └─<inline integer>
To me personally the first one feels more natural, as it prints the expression that I would normally write.
Tyler Grant Smith provided an answer to our question for exercise 19.4.8.-1.
Given the following components:
xy <- expr(x + y)
xz <- expr(x + z)
yz <- expr(y + z)
use quasiquotation to construct the following calls:
(x + y) + (y + z) - (x + y)
expr(((!!xy)) + !!yz-!!xy)
## (x + y) + (y + z) - (x + y)
There is one quasiquotation pair in base R: bquote() for quoting and .() for “unquoting”.
xyz <- bquote((x + y + z))
bquote(-.(xyz) / 2)
## -(x + y + z)/2
But bquote() is rarely used by any other function written in R.
Base functions that quote an argument use some other technique to allow indirect specification. Base R approaches selectively turn quoting off, rather than using unquoting, so I call them non-quoting techniques.
There are four basic forms of non-quoting used in base R functions:
A pair of quoting and non-quoting functions, e.g. quoting $ and non-quoting [[:
x <- list(var = 1, y = 2)
var <- "y"
x$var # quoted
## [1] 1
x[[var]] # takes variable as a string
## [1] 2
Other examples: <- / assign() and :: / getExportedValue()
rm() allows you to provide bare variable names in ..., or a character vector of variable names in list:
x <- 1
rm(x)
y <- 2
vars <- c("y", "vars")
rm(list = vars)
Other examples: data() and save()
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
pkg <- "MASS"
library(pkg, character.only = TRUE)
Other examples: demo(), detach(), example(), and require()
# Shows help for var
help(var)
## No documentation for 'y' in specified packages and libraries:
## you could try '??y'
var <- "mean"
# Shows help for mean
help(var)
var <- 10
# Shows help for var
help(var)
Other examples: ls(), page(), and match.fun()
... (dot-dot-dot)
Motivation!
How do you rbind() a list of data frames together?
dfs <- list(
a = data.frame(x = 1, y = 2),
b = data.frame(x = 3, y = 4)
)
We could do: rbind(dfs$a, dfs$b) or:
dplyr::bind_rows(!!!dfs)
With base R, we can use the do.call() function:
do.call("rbind", dfs)
var <- "x"
val <- c(4, 3, 9)
We could do: setNames(data.frame(val), var) or:
tibble::tibble(!!var := val)
with base R:
args <- list(val)
names(args) <- var
do.call("data.frame", args)
… Functions that support these tools, without quoting arguments, have tidy dots. To gain tidy dots behaviour in your own function, all you need to do is use list2().
?list2
–> dynamic dots
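The ... syntax of base R allows you to:
- collect arguments in data structures, e.g. with c() or list().
Dynamic dots offer a few additional features:
- splice arguments stored in a list with !!!.
- unquote names with !! on the left-hand side of :=.
- ignore a trailing comma:
tibble::tibble(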
y = 1:5,
z = 3:-1,
x = 5:1,
)
Example (how you could use list2())
set_attr <- function(.x, ...) {
attr <- rlang::list2(...)
attributes(.x) <- attr
.x
}
attrs <- list(x = 1, y = 2)
attr_name <- "z"
1:10 %>%
set_attr(w = 0, !!!attrs, !!attr_name := 3) %>%
str()
## int [1:10] 1 2 3 4 5 6 7 8 9 10
## - attr(*, "w")= num 0
## - attr(*, "x")= num 1
## - attr(*, "y")= num 2
## - attr(*, "z")= num 3
exec()
# Directly
exec("mean", x = 1:10, na.rm = TRUE, trim = 0.1)
## [1] 5.5
# Indirectly
args <- list(x = 1:10, na.rm = TRUE, trim = 0.1)
exec("mean", !!!args)
## [1] 5.5
# Mixed
params <- list(na.rm = TRUE, trim = 0.1)
exec("mean", x = 1:10, !!!params)
## [1] 5.5
mean(!!!args)
## Error in !args: invalid argument type
Supply argument names indirectly:
arg_name <- "na.rm"
arg_val <- TRUE
exec("mean", 1:10, !!arg_name := arg_val)
## [1] 5.5
Call multiple functions with the same arguments:
x <- c(runif(10), NA)
funs <- c("mean", "median", "sd")
purrr::map_dbl(funs, exec, x, na.rm = TRUE)
## [1] 0.5551995 0.5679720 0.2362938
One way to implement exec() is shown below. Describe how it works. What are the key ideas?
exec <- function(f, ..., .env = caller_env()) {
args <- list2(...)
do.call(f, args, envir = .env)
}
Adv. R solutions:
exec() takes a function (f), its arguments (...) and an environment (.env) as input. This allows us to construct a call from f and ... and evaluate this call in the supplied environment. As the ... argument is handled via list2(), exec() supports tidy dots (quasiquotation), which means that arguments and names (on the left-hand side of :=) can be unquoted via !! and !!!.
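To try the simplified definition without masking rlang's own exec(), it can be given another name. The sketch below uses the hypothetical name my_exec, and the names params and res are also made up; the body is the one from the exercise.

```r
library(rlang)

# Same body as the exercise's exec(), renamed to avoid masking rlang::exec()
my_exec <- function(f, ..., .env = caller_env()) {
  args <- list2(...)              # list2() gives ... its tidy dots behaviour
  do.call(f, args, envir = .env)  # build and evaluate the call
}

params <- list(na.rm = TRUE)

# The list is spliced into the call, as if we had typed na.rm = TRUE
res <- my_exec("mean", c(1, NA, 3), !!!params)
res
```
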
interaction(), expand.grid(), and par(): compare and contrast the techniques they use for switching between dots and list behaviour.
set_attr()
set_attr <- function(x, ...) {
attr <- rlang::list2(...)
attributes(x) <- attr
x
}
set_attr(1:10, x = 10)
## Error in attributes(x) <- attr: attributes must be named
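A likely reading of this error: the named argument x = 10 is matched to the formal x, so 1:10 ends up unnamed in ... and attributes<- complains. Giving the first formal a name that is unlikely to clash avoids this, as the earlier definition with .x does. The sketch below repeats that idea under the hypothetical name set_attr2.

```r
library(rlang)

# The first formal is called .x, so an attribute named x no longer
# collides with it during argument matching
set_attr2 <- function(.x, ...) {
  attr <- list2(...)
  attributes(.x) <- attr
  .x
}

res <- set_attr2(1:10, x = 10)
attributes(res)
```
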
lobstr::ast()
z <- expr(foo(x, y))
lobstr::ast(z)
## z
lobstr::ast(!!z)
## █─foo
## ├─x
## └─y
Map-reduce to generate code
As an example, we generate the expression for a linear model with the following coefficients:
intercept <- 10
coefs <- c(x1 = 5, x2 = -4)
coef_sym <- syms(names(coefs))
coef_sym
## [[1]]
## x1
##
## [[2]]
## x2
summands <- map2(coef_sym, coefs, ~ expr((!!.x * !!.y)))
summands
## [[1]]
## (x1 * 5)
##
## [[2]]
## (x2 * -4)
summands <- c(intercept, summands)
summands
## [[1]]
## [1] 10
##
## [[2]]
## (x1 * 5)
##
## [[3]]
## (x2 * -4)
eq <- reduce(summands, ~ expr(!!.x + !!.y))
eq
## 10 + (x1 * 5) + (x2 * -4)
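A quick sanity check of the generated expression is to evaluate it with some values for x1 and x2. The sketch below rebuilds eq so that it runs on its own; the sample values x1 = 1 and x2 = 2 and the name result are arbitrary choices for illustration.

```r
library(rlang)
library(purrr)

intercept <- 10
coefs <- c(x1 = 5, x2 = -4)
coef_sym <- syms(names(coefs))

# One summand per coefficient, then fold them together with +
summands <- map2(coef_sym, coefs, ~ expr((!!.x * !!.y)))
eq <- reduce(c(intercept, summands), ~ expr(!!.x + !!.y))

# Evaluate the expression with x1 and x2 supplied in a list:
# 10 + (1 * 5) + (2 * -4)
result <- eval(eq, list(x1 = 1, x2 = 2))
result
```
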
Creating functions
new_function(
exprs(x = , y = ),
expr({x + y})
)
## function (x, y)
## {
## x + y
## }
Raise a number to the power of a number:
power <- function(exponent) {
new_function(
exprs(x = ),
expr({
x ^ !!exponent
}),
caller_env()
)
}
power(0.5)
## function (x)
## {
## x^0.5
## }
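The generated function can be called like any other. Here is a short usage sketch; power() is repeated from above so that the block runs on its own, and square_root is a name made up for the example.

```r
library(rlang)

power <- function(exponent) {
  new_function(
    exprs(x = ),
    expr({
      x ^ !!exponent
    }),
    caller_env()
  )
}

# The exponent is inlined into the body, so the function
# does not need to look it up when it is called
square_root <- power(0.5)
square_root(4)
body(square_root)
```
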
We could replace expr() in reduce(summands, ~ expr(!!.x + !!.y)) with call2(): reduce(summands, call2, "+"). Compare and contrast the two approaches. Which do you think is easier to read?
intercept <- 10
coefs <- c(x1 = 5, x2 = -4)
coef_sym <- syms(names(coefs))
summands <- map2(coef_sym, coefs, ~ expr((!!.x * !!.y)))
summands <- c(intercept, summands)
eq <- reduce(summands, ~ expr(!!.x + !!.y))
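# Name .fn explicitly: reduce() passes extra arguments after the two values,
# so an unnamed "+" would not be matched to call2()'s .fn
eq <- reduce(summands, call2, .fn = "+")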
Re-implement the function below using new_function():
bc <- function(lambda) {
if (lambda == 0) {
function(x) log(x)
} else {
function(x) (x ^ lambda - 1) / lambda
}
}
bc <- function(lambda){
if(lambda==0){
new_function(exprs(x=),
expr(log(x)))
}else{
new_function(exprs(x=),
expr((x^lambda - 1)/lambda))
}
}
bc(0)
## function (x)
## log(x)
## <environment: 0x7feb1ccd0350>
bc(10)
## function (x)
## (x^lambda - 1)/lambda
## <environment: 0x7feb1cc1fbe8>
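In the version above, lambda is still a free variable in the body of the generated function and is looked up in the enclosing environment. A variant that unquotes lambda, so that its value is inlined into the body, might look like the sketch below; bc2 is a hypothetical name chosen to keep it apart from the definition above.

```r
library(rlang)

bc2 <- function(lambda) {
  if (lambda == 0) {
    new_function(exprs(x = ), expr(log(x)))
  } else {
    # !!lambda inserts the value of lambda into the expression
    new_function(exprs(x = ), expr((x ^ (!!lambda) - 1) / (!!lambda)))
  }
}

bc2(0)
bc2(2)
bc2(2)(3)
```

With the value inlined, printing the function shows the actual number rather than the name lambda.
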
Re-implement compose() defined below using quasiquotation and new_function():
compose <- function(f, g) {
function(...) f(g(...))
}
compose <- function(f, g){
new_function(exprs(...=),
expr(f(g(...))))
}
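The definition above builds the body f(g(...)) without unquoting, so f and g are still looked up by name when the function runs. A variant that captures the two arguments and unquotes them into the body could look like the sketch below; compose2 and sin_cos are hypothetical names used only here.

```r
library(rlang)

compose2 <- function(f, g) {
  # Capture what the caller typed for f and g
  f <- enexpr(f)
  g <- enexpr(g)

  # Unquote both into the body, giving e.g. sin(cos(...))
  new_function(exprs(... = ), expr((!!f)((!!g)(...))))
}

sin_cos <- compose2(sin, cos)
sin_cos
sin_cos(pi)
```
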