Advanced R
Chapter 17: Metaprogramming - Big Picture
Josh Pohlkamp-Hartt
@JPohlkampHartt
2020-11-18

Outline

Metaprogramming Introduction
Code Is Data and also a 🌲?! 🤯
Evaluation of Code
Intro To Quosures

Prerequisites

We will use rlang to introduce the user-friendly/tidy versions of the main methods in metaprogramming.
The lobstr package is used to understand the structure of R code.

library(rlang)
library(lobstr)

What is Metaprogramming?

The idea is that code is data that can be inspected and modified programmatically.
Example: this is why we don't need to quote package names in library(), and why we can write formulas in glm() with +, *, and :.
We will focus on tidy evaluation (rlang) rather than the base R functions, to avoid some of the ambiguity in R's legacy code.
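
A minimal sketch of the formula case, using only base R. The names y, x1 and x2 are placeholders chosen for illustration, not columns from any data set in these slides. The point is that a formula is stored as unevaluated code, so it can be inspected even though none of those variables exist.

```r
# A formula is captured, not evaluated: y, x1 and x2 need not exist
fm <- y ~ x1 * x2

class(fm)       # "formula"
all.vars(fm)    # the variable names found in the captured code
fm[[1]]         # the `~` function at the root of the captured call
```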

Non-Standard Evaluation

What is Non-Standard Evaluation (NSE)? Very roughly, it means programmatically modifying an expression or its meaning after it is issued but before it is executed.
Example: how R knows to evaluate "hp > 250" inside our data frame at run time, rather than the traditional mtcars[mtcars$hp > 250, c("mpg", "hp")] method, where we have to link hp to mtcars ourselves.

# %>% is not attached by rlang or lobstr, so call dplyr::select() directly
subset(dplyr::select(mtcars, mpg, hp), hp > 250)

##                 mpg  hp
## Ford Pantera L 15.8 264
## Maserati Bora  15.0 335

Hadley opines that NSE is a sloppy definition for this behavior, so we will generally avoid this terminology in these chapters.
R is one of the few languages that allow this much flexibility and accessibility for NSE. Python is jealous.
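
A rough sketch of the mechanism, written with base R tools before the tidy versions are introduced. my_subset() is a hypothetical helper invented for this slide, not the actual source of subset(); it only shows the capture-then-evaluate idea.

```r
# Hypothetical helper: a stripped-down version of the idea behind subset()
my_subset <- function(data, condition) {
  cond_code <- substitute(condition)             # capture the code, unevaluated
  keep <- eval(cond_code, data, parent.frame())  # look up hp in the data frame first
  data[keep & !is.na(keep), , drop = FALSE]      # keep rows where the condition is TRUE
}

my_subset(mtcars[, c("mpg", "hp")], hp > 250)
```

The caller never writes mtcars$hp: the function decides where the symbol hp is looked up.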

Why use Metaprogramming?

The ability to modify the meaning of code given context or inputs is extremely powerful
This allows us to use many of the key features in the Tidyverse
R does it by design, so why not use its full power?

Code is Data: Expressions

We can capture code and compute on it like other types of data
Captured code is called an expression. We use rlang::expr() to capture it.

expr(mean(x, na.rm = TRUE))

## mean(x, na.rm = TRUE)

An expression can be one of 4 types of objects:
- calls (captured function calls)
- symbols (names of an object)
- scalar constants (length 1 atomic vectors)
- pairlists (legacy version of lists used to store the arguments of a function)
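
A quick way to see the four kinds, sketched with base typeof() on objects built with expr(). The example function function(a, b = 2) NULL is made up purely to produce a pairlist of arguments.

```r
library(rlang)

typeof(expr(mean(x, na.rm = TRUE)))        # "language": a call
typeof(expr(x))                            # "symbol"
typeof(expr(1))                            # "double": a scalar constant
typeof(formals(function(a, b = 2) NULL))   # "pairlist": function arguments
```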

Code is Data: Enriching!

Most often we want to capture code that a user passed to a function as an argument. expr() does not work for this, because it captures exactly what is typed inside it.

capture_it <- function(x) {
  expr(x)
}
capture_it(a + b + c)

## x

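To capture the code supplied to a function argument, we use the "enriched" version of expr(): enexpr(). This is possible because arguments are lazily evaluated, so the caller's code is still available inside the function.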

capture_it <- function(x) {
  enexpr(x)
}
capture_it(a + b + c)

## a + b + c
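
Another small sketch of the same idea. describe_arg() is a hypothetical helper that turns the captured code into text with base deparse(); the symbols a, b and c are never evaluated, so they do not need to exist.

```r
library(rlang)

# Hypothetical helper: report the code the caller typed for `x`
describe_arg <- function(x) {
  code <- enexpr(x)                    # the caller's expression, unevaluated
  paste("You typed:", deparse(code))   # turn the captured code into a string
}

describe_arg(a + b + c)
## [1] "You typed: a + b + c"
```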

Code is Data: Inspect or Modify

Once we have captured code, we can interact with it like most data objects
Inspection:

f <- expr(f(x = 1, y = 2))
f[[2]]

## [1] 1

f$x

## [1] 1

Modification:

f$z <- 3
f

## f(x = 1, y = 2, z = 3)
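
A few more list-like operations on a call, as a sketch. The object cl and the replacement function name g are illustrative only.

```r
library(rlang)

cl <- expr(f(x = 1, y = 2))

length(cl)    # 3: the function name plus two arguments
names(cl)     # "" "x" "y": the function position has no name
cl[[1]]       # the symbol f, i.e. the function being called

cl[[1]] <- sym("g")   # swap the function being called
cl
## g(x = 1, y = 2)
```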

Code is Data: And 🌲's!

Most programming languages represent code as a tree, often called the abstract syntax tree (AST).
To view this tree we use lobstr::ast(). Function calls form the branches of the tree, and the leaves are symbols and constants.

lobstr::ast(f(a, "b"))

## █─f 
## ├─a 
## └─"b"

This works for all kinds of function calls, including infix operators such as + and *, because every call in R can also be written in prefix form.

lobstr::ast(1 + 2 * 3)

## █─`+` 
## ├─1 
## └─█─`*` 
##   ├─2 
##   └─3
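
The same tree can be walked by hand with list-style subsetting, which may help connect the picture to the data structure. This is a sketch on a captured copy of the expression; x is just an illustrative name.

```r
library(rlang)

x <- expr(1 + 2 * 3)

x[[1]]        # `+`   : the function at the root of the tree
x[[2]]        # 1     : a leaf
x[[3]]        # 2 * 3 : a nested call, i.e. a sub-branch
x[[3]][[1]]   # `*`   : the function at the root of that sub-branch
```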

Code is Data: Code Generator

We can use code to create new trees. There are two main tools: rlang::call2() and unquoting.
call2() constructs a function call from its components: first the function to call, and then the arguments to call it with.

call2("+", 1, 2)

## 1 + 2

call2("+", 1, call2("*", 2, 3))

## 1 + 2 * 3
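
call2() also accepts symbols and named arguments, and its .ns argument adds a namespace prefix. A short sketch, with x standing in for whatever variable the generated code should refer to:

```r
library(rlang)

# Symbols and named arguments can be mixed freely
call2("mean", expr(x), na.rm = TRUE)
## mean(x, na.rm = TRUE)

# .ns qualifies the function with a package namespace
call2("sd", expr(x), .ns = "stats")
## stats::sd(x)
```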

Code is Data: !!

call2() can be cumbersome for complex operations. An alternative is to build complex code trees by combining simpler code trees with a template. expr() and enexpr() have built-in support for this idea via !!, the unquote operator.
Unquoting allows you to selectively evaluate parts of the expression that would otherwise be quoted, which effectively allows you to merge ASTs using a template AST.
Basically !!x inserts the code tree stored in x into the expression.

xx <- expr(x + x)
yy <- expr(y + y)
expr(!!xx / !!yy)

## (x + x)/(y + y)
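
!! is not limited to code trees: it can also inline an ordinary value into the template. A small sketch, where threshold is an illustrative variable and hp is left quoted:

```r
library(rlang)

threshold <- 250                  # illustrative value
cond <- expr(hp > !!threshold)    # 250 is inlined; hp stays an unevaluated symbol
cond
## hp > 250
```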

Code is Data: !! Cont.

Unquoting is even more useful when you wrap it up in a function.
First we use enexpr() to capture the user’s expression, then expr() and !! to create a new expression using a template.

cv <- function(var) {
  var <- enexpr(var)
  expr(sd(!!var) / mean(!!var))
}
cv(x+y)

## sd(x + y)/mean(x + y)

Capturing (also known as quoting) and unquoting together make up Quasiquotation.
Quasiquotation makes it easy to create functions that combine code written by the function’s author with code written by the function’s user.

Evaluation: How to Evaluate

We can create, modify, and inspect code... what if we want to run it?
We can do that too with base::eval().
Evaluation needs two things: an expression, and an environment that supplies the values bound to the symbols in that expression.

# No seed is set, so the result changes from run to run
eval(cv(x + y), env(x = rnorm(1000, 0, 1), y = rnorm(1000, 0, 1)))

## [1] 177.778

One advantage of evaluating code manually is that you can define the environment. There are two main reasons to do this:
- To temporarily override functions to implement a domain specific language.
- To add a data mask.

A data mask is an environment containing user-supplied objects. Objects in the mask have precedence over objects in the environment (i.e. they mask those objects).
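
A deterministic sketch of how the environment changes the result of the same code, and of what "masking" means. The environment names base_env and mask_env are made up for this example.

```r
library(rlang)

code <- expr(x + y)

eval(code, env(x = 1, y = 10))     # 11
eval(code, env(x = 100, y = 10))   # 110: same code, different bindings

# Masking: x in the child environment takes precedence over x in its parent
base_env <- env(x = 1, y = 10)
mask_env <- env(base_env, x = 1000)   # first unnamed argument is the parent
eval(code, mask_env)                  # 1010: x from mask_env, y from base_env
```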

Evaluation: Custom with Functions

In our last example we bound the names x and y to random vectors of length 1000. As we saw in Chapter 6, we can also bind names to functions.
This allows us to override the behaviour of existing functions.

string_math <- function(x) {
  e <- env(
    caller_env(),
    `+` = function(x, y) paste0(x, y),
    `-` = function(x, y) gsub(y, "", x),
    `*` = function(x, y) strrep(x, y)
  )
  eval(enexpr(x), e)
}
What <- "Power!"
string_math("More " + What)

## [1] "More Power!"

string_math(("xyz" - "y") * 3)

## [1] "xzxzxz"

Evaluation: Custom with Data

Another application is modifying evaluation to look for variables in a data frame instead of an environment. This idea powers ggplot2::aes() and dplyr::mutate().
It’s possible to use eval() for this, but it is less user-friendly and more restrictive, so we’ll switch to rlang::eval_tidy() instead.
The main difference is that eval_tidy() takes an expression, a data mask, and an environment as separate arguments, which reduces ambiguity about where each name is looked up.

df <- data.frame(x = 1:5, y = rep(1,5))
eval_tidy(expr(x + y), df, env(x=11:15))

## [1] 2 3 4 5 6

To be explicit about whether a name should come from the data mask or from the environment, eval_tidy() provides the pronouns .data and .env for use in our expressions.

df <- data.frame(x = 1:5, y = rep(1,5))
eval_tidy(expr(.env$x + y), df, env(x=11:15))

## [1] 12 13 14 15 16
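
Both pronouns can appear in one expression when the same name exists in the data and in the environment. A sketch, reusing the same df and an illustrative environment value of 100:

```r
library(rlang)

df <- data.frame(x = 1:5, y = rep(1, 5))

# Without pronouns, the data mask wins: x is the column
eval_tidy(expr(x + y), df, env(x = 100))
## [1] 2 3 4 5 6

# With pronouns, each x is looked up exactly where we say
eval_tidy(expr(.data$x + .env$x), df, env(x = 100))
## [1] 101 102 103 104 105
```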

Quosures

The example below shows a problem with environments. We would like a to take the value that is visible to the user (10), not the one internal to the function (1000).

with2 <- function(df, expr) {
  a <- 1000
  eval_tidy(enexpr(expr), df)
}
df <- data.frame(x = 1:3)
a <- 10
with2(df, x + a)

## [1] 1001 1002 1003

We can solve this by using a quosure, which bundles an expression with an environment.
The name is a portmanteau of quoting and closure, because a quosure both quotes the expression and encloses the environment.
Quosures solidify the internal promise object into something that you can program with.
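
A small sketch of what "bundles an expression with an environment" means in practice. make_quo() is a hypothetical helper; quo() creates a quosure directly, and quo_get_expr() extracts its expression.

```r
library(rlang)

# Hypothetical helper: returns a quosure created inside the function
make_quo <- function() {
  a <- 10
  quo(a * 2)         # records the code and this function's environment
}

q <- make_quo()
a <- 1000            # a different `a` in the place where we evaluate

quo_get_expr(q)      # a * 2
eval_tidy(q)         # 20: `a` is looked up in the quosure's own environment
```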

Quosures Cont.

eval_tidy() is designed to work with quosures: capture the argument with enquo() instead of enexpr(), and do not supply an environment in the call.

with2 <- function(df, expr) {
  a <- 1000
  eval_tidy(enquo(expr), df)
}
with2(df, x + a)

## [1] 11 12 13

When using a data mask it is best practice to use a quosure.
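
The same pattern as a reusable sketch. col_mean() and the variable lbs are hypothetical names for illustration: the column comes from the data mask, and lbs comes from the caller's environment carried by the quosure.

```r
library(rlang)

# Hypothetical helper: mean of an expression computed on a data frame
col_mean <- function(df, var) {
  var <- enquo(var)            # the expression plus the caller's environment
  mean(eval_tidy(var, df))     # df acts as the data mask
}

lbs <- 1000                    # lives in the caller's environment
col_mean(mtcars, wt * lbs)     # wt from mtcars, lbs from the caller
```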

Summary

Metaprogramming lets us modify code after definition and before evaluation
Captured code is called an expression, and expressions act like data (Chapter 18)
Quasiquotation allows us to interactively generate code (Chapter 19)
Evaluation allows for customization (Chapter 20)
