Advanced R

Chapter 17: Metaprogramming - Big Picture

Josh Pohlkamp-Hartt

@JPohlkampHartt

2020-11-18

1 / 19

Outline

  • Metaprogramming Introduction

  • Code Is Data and also a 🌲?! 🤯

  • Evaluation of Code

  • Intro To Quosures

2 / 19

Prerequisites

  • We will use rlang to introduce the user-friendly/tidy versions of the main methods in metaprogramming.

  • The lobstr package is used to understand the structure of R code.

library(rlang)
library(lobstr)

3 / 19

What is Metaprogramming?

  • The idea is that code is data that can be inspected and modified programmatically.

  • Example: this is why we don't need to quote package names in library(), and why we can write model formulas in glm() with +, * and :.

  • We will focus on tidy evaluation rather than the base R functions, which avoids some of the ambiguity in R's legacy tools.

4 / 19

Non-Standard Evaluation

  • What is Non-Standard Evaluation (NSE)? Very roughly, it means programmatically modifying an expression or its meaning after it is issued but before it is executed.

  • Example: subset() knows to look up hp in our data frame at run time when we write hp > 250, whereas the traditional mtcars[mtcars$hp > 250, c("mpg", "hp")] method forces us to link hp to mtcars explicitly.

subset(mtcars, hp > 250, select = c(mpg, hp))
##                 mpg  hp
## Ford Pantera L 15.8 264
## Maserati Bora  15.0 335
  • Hadley opines that NSE is a sloppy definition for this behavior, so we will generally avoid this terminology in these chapters.

  • Few languages offer as much flexibility and accessibility for NSE as R. Python is jealous.
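
A minimal sketch of how a subset()-style function could get this behaviour, assuming rlang is loaded. filter_rows() is a hypothetical helper written for illustration, not how base::subset() is implemented; it relies on enquo() and eval_tidy(), both covered in the evaluation and quosure slides of this deck.

```r
library(rlang)

# Hypothetical helper: keep the rows of `df` for which `cond` is TRUE.
filter_rows <- function(df, cond) {
  cond <- enquo(cond)           # capture the unevaluated condition
  rows <- eval_tidy(cond, df)   # evaluate it with df's columns in scope
  df[rows, , drop = FALSE]
}

big <- filter_rows(mtcars[c("mpg", "hp")], hp > 250)
rownames(big)
## [1] "Ford Pantera L" "Maserati Bora"
```

The caller writes hp > 250 without quoting it or mentioning mtcars; the function decides where hp is looked up.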

5 / 19

Why use Metaprogramming?

  • The ability to modify the meaning of code given context or inputs is extremely powerful

  • This allows us to use many of the key features in the Tidyverse

  • R does it by design, so why not use its full power?

6 / 19

Code is Data: Expressions

  • We can capture code and compute on it like other types of data

  • Captured code is called an expression; we use rlang::expr() to capture it.

expr(mean(x, na.rm = TRUE))
## mean(x, na.rm = TRUE)
  • An expression can be one of four types of objects:
    • calls (captured function calls)
    • symbols (names of an object)
    • scalar constants (length 1 atomic vectors)
    • pairlists (legacy version of lists used to store the arguments of a function)
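
A quick way to check which of the four types a piece of captured code belongs to, sketched with rlang's predicate functions. The examples are illustrative choices, not taken from the slides.

```r
library(rlang)

checks <- c(
  call     = is_call(expr(mean(x, na.rm = TRUE))),
  symbol   = is_symbol(expr(x)),
  constant = is_syntactic_literal(expr(1)),
  # Function arguments (formals) are stored as a pairlist
  pairlist = is_pairlist(formals(function(a, b = 2) NULL))
)
checks
```

Each entry of checks is TRUE.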
7 / 19

Code is Data: Enriching!

  • Most often we want to capture code that a user passes to a function. expr() will not work for this, because it captures exactly what we typed inside the function.
capture_it <- function(x) {
  expr(x)
}
capture_it(a + b + c)
## x
  • To capture code passed as a function argument, which is lazily evaluated, we need the "enriched" version of expr(): enexpr().
capture_it <- function(x) {
  enexpr(x)
}
capture_it(a + b + c)
## a + b + c
8 / 19

Code is Data: Inspect or Modify

  • Once we have captured code, we can interact with it like most data objects

  • Inspection:

f <- expr(f(x = 1, y = 2))
f[[2]]
## [1] 1
f$x
## [1] 1
  • Modification:
f$z <- 3
f
## f(x = 1, y = 2, z = 3)
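
A few more list-like operations on a captured call, as a sketch assuming rlang is loaded. call_name() and call_args() are rlang helpers; swapping the function name to g is an arbitrary illustration.

```r
library(rlang)

f <- expr(f(x = 1, y = 2))

length(f)             # 3: the function name plus two arguments
names(f)              # "" "x" "y"
fn_name <- call_name(f)   # "f"
fn_args <- call_args(f)   # list(x = 1, y = 2)

# Replace the function being called
f[[1]] <- sym("g")
f
## g(x = 1, y = 2)
```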
9 / 19

Code is Data: And 🌲's!

  • Almost every programming language represents code as a tree, often called the abstract syntax tree (AST).

  • To view the tree structure we use lobstr::ast(). This function displays the underlying tree structure. Function calls form the branches of the tree and the leaves are symbols and constants.

lobstr::ast(f(a, "b"))
## █─f
## ├─a
## └─"b"
  • This works for every kind of function call, including infix operators such as + and *, which are shown in their prefix form.
lobstr::ast(1 + 2 * 3)
## █─`+`
## ├─1
## └─█─`*`
##   ├─2
##   └─3
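
One way to confirm that the infix and prefix spellings are the same tree is to capture both and compare them. A small sketch, assuming rlang is loaded:

```r
library(rlang)

infix  <- expr(1 + 2 * 3)
prefix <- expr(`+`(1, `*`(2, 3)))

same_tree <- identical(infix, prefix)
same_tree
## [1] TRUE
```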
10 / 19

Code is Data: Code Generator

  • We can use code to create new trees. There are two main tools: rlang::call2() and unquoting.

  • call2() constructs a function call from its components: first the function to call, and then the arguments to call it with.

call2("+", 1, 2)
## 1 + 2
call2("+", 1, call2("*", 2, 3))
## 1 + 2 * 3
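
call2() also accepts named arguments, and the calls it builds are ordinary code that can be evaluated. A short sketch; the mean() example is an illustrative choice, not from the slides.

```r
library(rlang)

# Named arguments are passed through to the generated call
mean_call <- call2("mean", expr(x), na.rm = TRUE)
mean_call
## mean(x, na.rm = TRUE)

# The generated call can be run like any other expression
result <- eval(call2("+", 1, call2("*", 2, 3)))
result
## [1] 7
```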
11 / 19

Code is Data: !!

  • call2() can be cumbersome for complex operations. An alternative is to build complex code trees by combining simpler code trees with a template. expr() and enexpr() have built-in support for this idea via !!, the unquote operator.

  • Unquoting allows you to selectively evaluate parts of the expression that would otherwise be quoted, which effectively allows you to merge ASTs using a template AST.

  • Basically !!x inserts the code tree stored in x into the expression.

xx <- expr(x + x)
yy <- expr(y + y)
expr(!!xx / !!yy)
## (x + x)/(y + y)
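
!! works for plain values as well as captured expressions. A small sketch, assuming rlang is loaded; the variable names are arbitrary.

```r
library(rlang)

# Unquoting a value inlines the value itself
x <- 10
e1 <- expr(y + !!x)
e1
## y + 10

# Unquoting an expression inserts its tree into the template
arg <- expr(a * b)
e2 <- expr(log(!!arg))
e2
## log(a * b)
```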
12 / 19

Code is Data: !! Cont.

  • Unquoting is even more useful when wrapped up in a function.

  • First we use enexpr() to capture the user’s expression, then expr() and !! to create a new expression using a template.

cv <- function(var) {
  var <- enexpr(var)
  expr(sd(!!var) / mean(!!var))
}
cv(x + y)
## sd(x + y)/mean(x + y)
  • Capturing (also known as quoting) and unquoting together make up Quasiquotation.

  • Quasiquotation makes it easy to create functions that combine code written by the function’s author with code written by the function’s user.

13 / 19

Evaluation: How to Evaluate

  • We can create, modify, and inspect code... what if we want to run it?

  • We can do that too with base::eval().

  • Evaluation needs an expression and an environment that defines the symbols used in it.

# x and y are random draws, so the result changes on every run
eval(cv(x + y), env(x = rnorm(1000, 0, 1), y = rnorm(1000, 0, 1)))
## [1] 177.778
  • One advantage of evaluating code manually is that you can define the environment. There are two main reasons to do this:

    • To temporarily override functions to implement a domain specific language.

    • To add a data mask.

  • A data mask is an environment containing user-supplied objects. Objects in the mask have precedence over objects in the environment (i.e. they mask those objects).
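
The difference between a custom environment and a data mask can be sketched as follows, assuming rlang is loaded. The values are arbitrary; eval_tidy() is the rlang evaluator introduced on the "Custom with Data" slide.

```r
library(rlang)

x <- 100
e <- expr(x + y)

# Custom environment: x and y are both found in the environment we supply
in_env <- eval(e, env(x = 1, y = 10))
in_env
## [1] 11

# Data mask: y comes from the mask; x is not in the mask,
# so the lookup falls back to the surrounding environment
in_mask <- eval_tidy(e, data = list(y = 10))
in_mask
## [1] 110
```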
14 / 19

Evaluation: Custom with Functions

  • In our last example we bound the names x and y to random vectors of length 1000. As we saw in Chapter 6, we can also bind names to functions.

  • This allows us to override the behaviour of existing functions.

string_math <- function(x) {
  e <- env(
    caller_env(),
    `+` = function(x, y) paste0(x, y),
    `-` = function(x, y) gsub(y, "", x),
    `*` = function(x, y) strrep(x, y)
  )
  eval(enexpr(x), e)
}
What <- "Power!"
string_math("More " + What)
## [1] "More Power!"
string_math(("xyz" - "y") * 3)
## [1] "xzxzxz"
15 / 19

Evaluation: Custom with Data

  • Another application is modifying evaluation to look for variables in a data frame instead of an environment. This idea powers ggplot2::aes() and dplyr::mutate().

  • It’s possible to use eval() for this, but this is less user friendly and can be restricting, so we’ll switch to rlang::eval_tidy() instead.

  • The main difference is that eval_tidy() takes a data mask in addition to an expression and an environment, in that order: eval_tidy(expr, data, env). Keeping the mask separate from the environment reduces ambiguity.

df <- data.frame(x = 1:5, y = rep(1,5))
eval_tidy(expr(x + y), df, env(x=11:15))
## [1] 2 3 4 5 6
  • To be explicit about whether a name should come from the data mask or from the environment, eval_tidy() provides two pronouns, .data and .env, to use in our expressions.
df <- data.frame(x = 1:5, y = rep(1,5))
eval_tidy(expr(.env$x + y), df, env(x=11:15))
## [1] 12 13 14 15 16
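
Both pronouns can appear in the same expression. A sketch reusing the data frame from this slide, where x exists both as a column and as a variable in the environment:

```r
library(rlang)

df <- data.frame(x = 1:5, y = rep(1, 5))

# .data$x is the column in df, .env$x is the variable in the environment
both <- eval_tidy(expr(.data$x + .env$x), df, env(x = 11:15))
both
## [1] 12 14 16 18 20
```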
16 / 19

Quosures

  • The example below shows a problem with environments. We would like a to take the value that is visible to the user (10), not the one internal to the function (1000).
with2 <- function(df, expr) {
  a <- 1000
  eval_tidy(enexpr(expr), df)
}
df <- data.frame(x = 1:3)
a <- 10
with2(df, x + a)
## [1] 1001 1002 1003
  • We can solve this by using a quosure, which bundles an expression with an environment.

  • The name is a portmanteau of quoting and closure, because a quosure both quotes the expression and encloses the environment.

  • A quosure solidifies the internal promise object into something that you can program with.
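
The two halves of a quosure can be inspected directly. A minimal sketch, assuming rlang is loaded; capture() is a hypothetical helper that only exists to show enquo() at work.

```r
library(rlang)

# Hypothetical helper: capture an argument together with its environment
capture <- function(arg) enquo(arg)

a <- 10
q <- capture(x + a)

quo_get_expr(q)   # the expression: x + a
quo_get_env(q)    # the environment where x + a was written

# a is resolved in the quosure's environment, x in the data mask
res <- eval_tidy(q, data.frame(x = 1:3))
res
## [1] 11 12 13
```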

17 / 19

Quosures Cont.

  • eval_tidy() is designed to work with quosures: pass a quosure in place of the expression and do not supply an environment in the call.
with2 <- function(df, expr) {
  a <- 1000
  eval_tidy(enquo(expr), df)
}
with2(df, x + a)
## [1] 11 12 13
  • When using a data mask it is best practice to use a quosure.
18 / 19

Summary

  • Metaprogramming lets us modify code after definition and before evaluation

  • Captured code = Expressions and they act like data (Chapter 18)

  • Quasiquotation allows us to programmatically generate code (Chapter 19)

  • Evaluation allows for customization (Chapter 20)

19 / 19
