Three pillars of tidy evaluation
Quasiquotation = quotation + unquotation
On its own, quasiquotation is useful for programming; combined with other tools, it becomes important for data analysis.
Simple concrete example: cement() is a function that works like paste() but doesn’t need quotes.
(Think of it as automatically adding ‘quotes’ to the arguments.)
library(rlang)

cement <- function(...) {
  # Capture each argument as a symbol instead of evaluating it
  args <- ensyms(...)
  paste(purrr::map(args, as_string), collapse = " ")
}
cement(Good, morning, Hadley)
#> [1] "Good morning Hadley"
What if we wanted to use variables? What is an object and what should be quoted?
This is where ‘unquoting’ comes in!
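A minimal sketch of unquoting with cement(), assuming rlang and purrr are installed. The variables time and name are illustrative and do not come from the notes. The !! operator tells cement() to use the value of a variable instead of quoting its name.

```r
library(rlang)

cement <- function(...) {
  args <- ensyms(...)
  paste(purrr::map(args, as_string), collapse = " ")
}

# Illustrative variables (not part of the original notes)
time <- "afternoon"
name <- "Cohort"

# Without !!, the variable names themselves are quoted
quoted <- cement(Good, time, name)

# With !!, the values of time and name are inserted before quoting
unquoted <- cement(Good, !!time, !!name)

print(quoted)
print(unquoted)
```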
You can think of cement() and paste() as being ‘mirror-images’ of each other.
- paste(): you define what to quote; it evaluates its arguments
- cement(): you define what to unquote; it quotes its arguments
‘Quoting function’ is similar to, but more precise than, the term ‘non-standard evaluation’ (NSE). Examples of quoting functions:
- dplyr::mutate(), tidyr::pivot_longer()
- library(), subset(), with()
Arguments to a quoting function cannot be evaluated outside of the function:
cement(Good, afternoon, Cohort)
#> [1] "Good afternoon Cohort"
Good
#> Error: object 'Good' not found
Non-quoting (standard) function arguments can be evaluated:
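The contrast with a standard (non-quoting) function can be sketched as follows. This is an illustrative reconstruction, not the original slide code, and it assumes no object named Good exists in the session.

```r
# paste() evaluates its arguments, so each one is an ordinary value
greeting <- paste("Good", "afternoon", "Cohort")

# The same argument evaluates fine on its own, outside the call
word <- "Good"

# A bare name, by contrast, is looked up as an object and fails
# (assumption: no object called Good is defined)
bare <- tryCatch(Good, error = function(e) conditionMessage(e))

print(greeting)
print(word)
print(bare)
```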
Capture expressions without evaluating them
| | Developer | User |
|---|---|---|
| Expression (Quasiquotation) | | |
| One | expr() | enexpr() |
| Many | exprs() | enexprs() |
| Symbol (Quasiquotation) | | |
| One | expr() | ensym() |
| Many | exprs() | ensyms() |
| R Base (Quotation) | | |
| One | quote() | substitute() |
| Many | alist() | as.list(substitute(...())) |
Also:
- bquote() provides a limited form of quasiquotation
- ~, the formula, is a quoting function (see Section 20.3.4)
expr() and exprs()
enexpr() and enexprs()
#> a + b + c
#> $exp1
#> a + b
#>
#> $exp2
#> c + d
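The developer/user split can be sketched like this. The helper names capture_one() and capture_many() are hypothetical and only serve the illustration; rlang is assumed to be installed.

```r
library(rlang)

# Developer side: quote an expression you type yourself
e_one <- expr(a + b + c)
e_many <- exprs(exp1 = a + b, exp2 = c + d)

# User side: quote whatever the caller passed to a function argument
capture_one <- function(x) enexpr(x)        # hypothetical helper
capture_many <- function(...) enexprs(...)  # hypothetical helper

u_one <- capture_one(a + b + c)
u_many <- capture_many(exp1 = a + b, exp2 = c + d)

print(u_one)
print(u_many)
```

Both routes give the same unevaluated expressions; the difference is only who supplies the code.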
ensym() and ensyms()
enexpr/s()
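A small sketch of how ensym() and ensyms() behave, assuming rlang is installed. The helpers capture_sym() and capture_syms() are hypothetical names. Unlike enexpr(), these accept only a symbol or a string and reject anything more complex.

```r
library(rlang)

capture_sym <- function(x) ensym(x)        # hypothetical helper
capture_syms <- function(...) ensyms(...)  # hypothetical helper

# A bare name and a string both become the same symbol
s_bare <- capture_sym(mpg)
s_string <- capture_sym("mpg")
s_many <- capture_syms(mpg, "cyl")

# A call such as mpg + cyl is not a symbol, so ensym() raises an error
s_err <- tryCatch(
  capture_sym(mpg + cyl),
  error = function(e) "not a symbol"
)

print(s_bare)
print(s_many)
print(s_err)
```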
Selectively evaluate parts of an expression
- !! (unquote, bang-bang)
- call2() is an alternative to unquoting when building a function call
- !!! (unquote-splice, bang-bang-bang, triple bang)
- !! and !!! only work like this inside quoting functions that use rlang
One argument
Multiple arguments
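A short sketch of both operators, assuming rlang is installed. The symbols f, a, b, c, y and z are placeholders that are never evaluated.

```r
library(rlang)

# One argument: !! inserts a single captured expression
x <- expr(a + b)
one <- expr(f(!!x, y))

# Multiple arguments: !!! splices each element of a list as its own argument
args <- exprs(a, b, c)
many <- expr(f(!!!args, z))

print(one)
print(many)
```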
For example, get the AST of an expression
Unquote function call
Unquote function
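The two cases can be sketched as follows, assuming rlang is installed. The names g, fn, x and y are illustrative. Unquoting a function call evaluates the call and inserts its result; unquoting a function replaces the function position of the call, which needs extra parentheses or call2().

```r
library(rlang)

# Unquote a function call: mean(...) is evaluated and its result is inserted
with_result <- expr(g(!!mean(c(1, 2, 3)), y))

# Unquote a function: wrap !!fn in parentheses so it is parsed as the function
fn <- expr(sd)
with_fn <- expr((!!fn)(x))

# Alternative: build the same call from its parts with call2()
with_call2 <- call2(fn, expr(x))

print(with_result)
print(with_fn)
print(with_call2)
```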
Only bquote() provides a limited form of quasiquotation.
The rest of base selectively uses or does not use quoting (rather than unquoting).
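Four basic forms of quoting/non-quoting:
- A pair of functions: $ (quoting) and [[ (non-quoting)
- A pair of arguments: rm(...) (quoting) and rm(list = c(...)) (non-quoting)
- An argument that controls quoting: library(rlang) (quoting) and library(pkg, character.only = TRUE) (non-quoting, where pkg <- "rlang")
- Quoting if evaluation fails:
  - help(var): quotes, shows help for var
  - help(var) (where var <- "mean"): does not quote, shows help for mean
  - help(var) (where var <- 10): evaluation does not give a string, so it falls back to quoting and shows help for var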
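The first two forms can be tried directly in base R. This is a minimal sketch; the list x, the environment e and the names tmp_a and tmp_b are made up for the illustration.

```r
# Pair of functions: $ quotes its right-hand side, [[ evaluates it
x <- list(var = 1, y = 2)
var <- "y"
by_dollar <- x$var      # looks up the element literally named "var"
by_bracket <- x[[var]]  # evaluates var, then looks up the element "y"

# Pair of arguments: rm(...) quotes names, rm(list = ) takes evaluated strings
e <- new.env()
assign("tmp_a", 1, envir = e)
assign("tmp_b", 2, envir = e)
rm(tmp_a, envir = e)      # quoting: tmp_a is never evaluated
nm <- "tmp_b"
rm(list = nm, envir = e)  # non-quoting: nm is evaluated to "tmp_b"
remaining <- ls(e)

print(by_dollar)
print(by_bracket)
print(remaining)
```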
... (dot-dot-dot)
- !! and !!! only work with functions that use rlang
- Use list2(...) to turn ... into “tidy dots”, which can be unquoted and spliced
- list2() is required if you are going to pass or use !! or !!! in ...
- list2() is a wrapper around dots_list() with the most common defaults
No need for list2()
Require list2()
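# d() is not defined in these notes; assumed here to mirror d2() below,
# but with base list() instead of list2()
d <- function(...) data.frame(list(...))
vars <- list(x = c(1:3), y = c(2, 4, 6))
d(!!!vars)
#> Error in !vars: invalid argument type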
d2 <- function(...) data.frame(list2(...))
d2(!!!vars)
#> x y
#> 1 1 2
#> 2 2 4
#> 3 3 6
# Same result but x and y evaluated later
vars_expr <- exprs(x = c(1:3), y = c(2, 4, 6))
d2(!!!vars_expr)
#> x y
#> 1 1 2
#> 2 2 4
#> 3 3 6
Getting argument names (symbols) from variables
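A minimal sketch of taking an argument name from a variable with :=, assuming rlang is installed. The variables var and val are illustrative. list2() is used here only because it supports dynamic dots without extra dependencies.

```r
library(rlang)

var <- "x"          # the argument name is stored in a variable
val <- c(4, 3, 9)

# The left-hand side of := is unquoted, so the element is named "x", not "var"
res <- list2(!!var := val)

print(res)
```

A plain = cannot be used here because R does not allow an expression on the left-hand side of an argument name.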
exec() [Making your own …]
What if your function doesn’t have tidy dots?
You can’t use !! or := if the function doesn’t support rlang or dynamic dots.
Let’s use the … from exec()
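A sketch of the idea, assuming rlang is installed. mean() has no tidy dots, but exec() does, so the argument name can be supplied indirectly. The input vector is made up for the illustration.

```r
library(rlang)

arg_name <- "na.rm"
arg_val <- TRUE

# exec() builds and runs the call mean(c(1, 2, NA), na.rm = TRUE)
res <- exec("mean", c(1, 2, NA), !!arg_name := arg_val)

print(res)
```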
Note that you do not unquote arg_val.
exec() is also useful for mapping over a list of functions:
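For instance, assuming rlang and purrr are installed, the same data can be passed to several summary functions by name. The vector x and the choice of functions are illustrative.

```r
library(rlang)

x <- c(1, 2, 3, 4, NA)
funs <- c("mean", "median", "sd")

# For each function name, exec() calls fn(x, na.rm = TRUE)
res <- purrr::map_dbl(funs, exec, x, na.rm = TRUE)

print(res)
```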
do.call
do.call(what, args)
- what is a function to call
- args is a list of arguments to pass to the function
One way to implement exec() is shown here. Describe how it works. What are the key ideas?
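The code these notes refer to is not reproduced in this text. The sketch below first shows do.call() on its own, then a simplified stand-in named my_exec() (a hypothetical name, chosen to avoid masking rlang::exec()) that combines list2() with do.call(). Treat it as an assumption about the general approach rather than the exact implementation under discussion.

```r
library(rlang)

# do.call(): a function (or its name) plus a list of arguments
args <- list(1:10, na.rm = TRUE)
res_do_call <- do.call("mean", args)  # same as mean(1:10, na.rm = TRUE)

# Simplified sketch of an exec()-like function (hypothetical name)
my_exec <- function(f, ..., .env = caller_env()) {
  # list2() collects ... as dynamic dots, so !!! and := are understood
  args <- list2(...)
  # do.call() then makes the actual call in the caller's environment
  do.call(f, args, envir = .env)
}

res_exec <- my_exec("mean", !!!args)

print(res_do_call)
print(res_exec)
```

The key ideas in this sketch are that list2() handles the unquoting and splicing, while do.call() does the calling.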
Sometimes you want to run a bunch of models, without having to copy/paste each one.
BUT, you also want the summary function to show the appropriate model call, not one with hidden variables (e.g., lm(y ~ x, data = data)).
We can achieve this by building expressions and unquoting as needed:
library(rlang)
library(purrr)
vars <- data.frame(x = c("hp", "hp"),
                   y = c("mpg", "cyl"))
x_sym <- syms(vars$x)
y_sym <- syms(vars$y)
formulae <- map2(x_sym, y_sym, \(x, y) expr(!!y ~ !!x))
formulae
#> [[1]]
#> mpg ~ hp
#>
#> [[2]]
#> cyl ~ hp
models <- map(formulae, \(f) expr(lm(!!f, data = mtcars)))
summary(eval(models[[1]]))
#>
#> Call:
#> lm(formula = mpg ~ hp, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -5.7121 -2.1122 -0.8854 1.5819 8.2360
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 30.09886 1.63392 18.421 < 2e-16 ***
#> hp -0.06823 0.01012 -6.742 1.79e-07 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 3.863 on 30 degrees of freedom
#> Multiple R-squared: 0.6024, Adjusted R-squared: 0.5892
#> F-statistic: 45.46 on 1 and 30 DF, p-value: 1.788e-07
As a function:
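lm_df <- function(df, data) {
  # Turn the column names stored as strings into symbols
  x_sym <- map(df$x, as.symbol)
  y_sym <- map(df$y, as.symbol)
  # Capture the data argument unevaluated so the call shows its name
  data <- enexpr(data)

  # Build one formula and one lm() call per row of df
  formulae <- map2(x_sym, y_sym, \(x, y) expr(!!y ~ !!x))
  models <- map(formulae, \(f) expr(lm(!!f, !!data)))

  # Evaluate each call and summarise the fitted model
  map(models, \(m) summary(eval(m)))
}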
vars <- data.frame(x = c("hp", "hp"),
y = c("mpg", "cyl"))
lm_df(vars, data = mtcars)
#> [[1]]
#>
#> Call:
#> lm(formula = mpg ~ hp, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -5.7121 -2.1122 -0.8854 1.5819 8.2360
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 30.09886 1.63392 18.421 < 2e-16 ***
#> hp -0.06823 0.01012 -6.742 1.79e-07 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 3.863 on 30 degrees of freedom
#> Multiple R-squared: 0.6024, Adjusted R-squared: 0.5892
#> F-statistic: 45.46 on 1 and 30 DF, p-value: 1.788e-07
#>
#>
#> [[2]]
#>
#> Call:
#> lm(formula = cyl ~ hp, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.27078 -0.74879 -0.06417 0.63512 1.74067
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 3.006795 0.425485 7.067 7.41e-08 ***
#> hp 0.021684 0.002635 8.229 3.48e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1.006 on 30 degrees of freedom
#> Multiple R-squared: 0.693, Adjusted R-squared: 0.6827
#> F-statistic: 67.71 on 1 and 30 DF, p-value: 3.478e-09