Expressions

To compute on the language, we first need to understand its structure.

Learning objectives:

  • Capture code expressions
  • Inspect expressions
  • Define expressions
  • Modify expressions
  • Generate/Automate expressions
library(rlang)
library(lobstr)

expr captures code without executing

  • Distinguishes the operation from the result
  • Capture code as an expression
z <- rlang::expr(x - 10)
z
#> x - 10
rlang::is_expression(z)
#> [1] TRUE
class(z)
#> [1] "call"
  • Without expr, R evaluates the code immediately (here it fails because x is not yet defined)
y <- x - 10
#> Error:
#> ! object 'x' not found

eval evaluates the code expression

z <- rlang::expr(x - 10)
x <- 10
eval(z)
#> [1] 0
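
Because evaluation is deferred, the same captured expression can be evaluated against different values. The following is a minimal sketch, assuming only that the envir argument of base::eval accepts a list as well as an environment; the names res_a and res_b are illustrative.

```r
z <- rlang::expr(x - 10)

# Evaluate the same captured expression with different bindings for x
res_a <- eval(z, list(x = 10))
res_b <- eval(z, list(x = 100))

res_a
#> [1] 0
res_b
#> [1] 90
```

Supplying the bindings explicitly keeps the example independent of whatever x happens to be in the global environment.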

Note

This chapter focuses on capturing and inspecting the operation. eval will be discussed more in chapters 19 and 20.

ast parses and identifies the parts of an expression

  • Expressions are also called abstract syntax trees (ASTs) because of the hierarchical structure and natural tree representation.
lobstr::ast(y <- x - 10)
#> █─`<-` 
#> ├─y 
#> └─█─`-` 
#>   ├─x 
#>   └─10
lobstr::ast(`<-`(y,`*`(x,10)))
#> █─`<-` 
#> ├─y 
#> └─█─`*` 
#>   ├─x 
#>   └─10
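
Infix and prefix forms of the same code produce the same tree. A short sketch of how this could be checked with identical (the variable names are illustrative):

```r
# The same code written in infix form and in prefix form
infix  <- rlang::expr(y <- x * 10)
prefix <- rlang::expr(`<-`(y, `*`(x, 10)))

# Both are parsed into the same call object
same <- identical(infix, prefix)
same
#> [1] TRUE
```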

The leaves, branches, and colors from ast identify expression data structures

Constants:

  • Also called scalars
  • Found in ast leaves
  • Leaves have black borders and square corners
  • Identify with rlang::is_syntactic_literal

Symbols:

  • Found in ast leaves
  • Leaves have purple borders and rounded corners
  • Identify with is.name or is.symbol
  • Convert strings to symbols with rlang::sym or rlang::syms
  • Convert a symbol to a string with rlang::as_string or as.character

Calls:

  • Found in ast branches
  • Branches are orange rectangles
  • First child is the function being called
  • Subsequent children are arguments to the function call
  • Calls behave like lists
  • typeof returns “language” for call objects, and str prints them with a “language” prefix
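
The three kinds of component can be told apart programmatically. The sketch below takes an arbitrary example call, f(x, "a", 1), splits it into its components, and tests each one; the names e, parts, is_const, and is_sym are illustrative only.

```r
e <- rlang::expr(f(x, "a", 1))
parts <- as.list(e)

# Constants: values that appear literally in the code ("a" and 1)
is_const <- vapply(parts, rlang::is_syntactic_literal, logical(1))
is_const
#> [1] FALSE FALSE  TRUE  TRUE

# Symbols: names of objects (f and x)
is_sym <- vapply(parts, is.symbol, logical(1))
is_sym
#> [1]  TRUE  TRUE FALSE FALSE

# The expression as a whole is a call
is.call(e)
#> [1] TRUE
typeof(e)
#> [1] "language"

# Strings and symbols convert in both directions
s <- rlang::sym("x")
rlang::as_string(s)
#> [1] "x"
```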

Expressions and lists have similar memory mapping

x <- rlang::expr(y <- x * 10)
length(x)
#> [1] 3
is.list(x)
#> [1] FALSE
lobstr::obj_addr(x)
#> [1] "0x1d950654c40"
lobstr::obj_addr(x[[1]])
#> [1] "0x1d94815bff0"
lobstr::obj_addr(x[[2]])
#> [1] "0x1d948b7b9e0"
lobstr::obj_addr(x[[3]])
#> [1] "0x1d950654ce8"
lst_x <- as.list(x)
# Will not work: lobstr::obj_addrs(x)
lobstr::obj_addrs(lst_x)
#> [1] "0x1d94815bff0" "0x1d948b7b9e0" "0x1d950654ce8"

Expressions can be modified using list subsetting

x <- rlang::expr(10*2)
str(x)
#>  language 10 * 2
rlang::is_expression(x)
#> [1] TRUE
purrr::map(as.list(x),~{.x})
#> [[1]]
#> `*`
#> 
#> [[2]]
#> [1] 10
#> 
#> [[3]]
#> [1] 2
tracemem(x)
#> [1] "<000001D95238AD28>"
x
#> 10 * 2
x[[2]] <- 4
#> tracemem[0x000001d95238ad28 -> 0x000001d9545baf90]: eval eval withVisible withCallingHandlers eval eval with_handlers doWithOneRestart withOneRestart withRestartList doWithOneRestart withOneRestart withRestartList withRestarts <Anonymous> evaluate in_dir in_input_dir eng_r block_exec call_block process_group withCallingHandlers with_options <Anonymous> process_file <Anonymous> <Anonymous> execute .main
x
#> 4 * 2
untracemem(x)
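
The same list-style subsetting can replace the function being called or add an argument. A minimal sketch, using an arbitrary call to mean as the starting point (cl is an illustrative name):

```r
cl <- rlang::expr(mean(y))

# Position 1 holds the function; replace it with another symbol
cl[[1]] <- rlang::sym("median")

# Assigning to a new name appends a named argument
cl$na.rm <- TRUE

cl
#> median(y, na.rm = TRUE)
```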

Simple expressions can be generated using call2 or parse_expr, and turned back into strings with expr_text

  • rlang::call2 builds a call from a function name and its arguments
  • Clunky when creating more complex expressions; see chapter 19 for more details
rlang::call2("mean", x = rlang::expr(x), na.rm = TRUE)
#> mean(x = x, na.rm = TRUE)
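
Calls built with call2 can be nested, and infix operators are printed in their usual form. A small sketch with arbitrary numbers (nested is an illustrative name):

```r
# Build 1 + 2 * 3 from the inside out
nested <- rlang::call2("+", 1, rlang::call2("*", 2, 3))
nested
#> 1 + 2 * 3

# The result is an ordinary call that can be evaluated
eval(nested)
#> [1] 7
```

Even this small case shows why the approach becomes clunky as the nesting grows.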
  • Parsing: string to expression
  • More details and safer usage in chapter 19
  • base::parse(text = ...) is the base counterpart of rlang::parse_exprs, but it returns an expression vector rather than a list
"5 - 5" |> 
  rlang::parse_expr()
#> 5 - 5
paste0("5 - ",1:10) |> 
  rlang::parse_exprs()
#> [[1]]
#> 5 - 1
#> 
#> [[2]]
#> 5 - 2
#> 
#> [[3]]
#> 5 - 3
#> 
#> [[4]]
#> 5 - 4
#> 
#> [[5]]
#> 5 - 5
#> 
#> [[6]]
#> 5 - 6
#> 
#> [[7]]
#> 5 - 7
#> 
#> [[8]]
#> 5 - 8
#> 
#> [[9]]
#> 5 - 9
#> 
#> [[10]]
#> 5 - 10
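
The difference between the base and rlang parsers shows up in the type of the result. A minimal sketch, assuming a string that contains two expressions separated by a semicolon (txt, base_res, and rl_res are illustrative names):

```r
txt <- "x <- 4; x"

# base::parse returns an expression vector
base_res <- parse(text = txt)
class(base_res)
#> [1] "expression"

# rlang::parse_exprs returns a regular list of expressions
rl_res <- rlang::parse_exprs(txt)
class(rl_res)
#> [1] "list"

# Both contain one element per parsed expression
length(base_res)
#> [1] 2
length(rl_res)
#> [1] 2
```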
  • Deparsing: expression to string
  • base::deparse returns a character vector with one element per line when the output spans multiple lines
  • The help page for rlang::expr_text marks its lifecycle as ‘questioning’, so use it with some caution
class(y ~ x)
#> [1] "formula"
rlang::is_expression(y ~ x)
#> [1] FALSE
rlang::expr_text(y ~ x)
#> [1] "y ~ x"
rlang::expr(x - 8) |> 
  rlang::expr_text()
#> [1] "x - 8"
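
Parsing and deparsing are roughly inverse operations for simple expressions. The sketch below round-trips one expression and then contrasts base::deparse with rlang::expr_text on a deliberately long call; the call to f with 50 arguments is made up purely for illustration.

```r
# Round trip: expression -> string -> expression
original <- rlang::expr(x - 8)
as_text <- rlang::expr_text(original)
round_trip <- rlang::parse_expr(as_text)
identical(original, round_trip)
#> [1] TRUE

# A long call: f(1L, 2L, ..., 50L)
long <- as.call(c(list(as.name("f")), as.list(1:50)))

# base::deparse splits long output into several strings
length(deparse(long)) > 1
#> [1] TRUE

# rlang::expr_text always returns a single string
length(rlang::expr_text(long))
#> [1] 1
```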

Complex expressions can be automated using expr and purrr::reduce

and_filters <- 
  purrr::map(
    letters[6:8], # f, g, h
    ~ rlang::expr(.data[["x"]] < !!.x)
  ) |> 
  purrr::reduce(
    .f = function(left, right) {
      rlang::expr(!!left & !!right)
    }
  )
# and_filters is a single call, so unquote it with !! rather than splicing with !!!
tibble::tibble(
  x = letters
) |> 
  dplyr::filter(!!and_filters)
#> # A tibble: 5 × 1
#>   x    
#>   <chr>
#> 1 a    
#> 2 b    
#> 3 c    
#> 4 d    
#> 5 e
or_filters <- 
  purrr::map(
    letters[6:8], # f, g, h
    ~ rlang::expr(.data[["x"]] < !!.x)
  ) |> 
  purrr::reduce(
    .f = function(left, right) {
      rlang::expr(!!left | !!right)
    }
  )
# or_filters is a single call, so unquote it with !! rather than splicing with !!!
tibble::tibble(
  x = letters
) |> 
  dplyr::filter(!!or_filters)
#> # A tibble: 7 × 1
#>   x    
#>   <chr>
#> 1 a    
#> 2 b    
#> 3 c    
#> 4 d    
#> 5 e    
#> 6 f    
#> 7 g
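
The same reduce pattern works for any binary operator. As a smaller sketch that does not depend on dplyr, the code below folds three symbols into a single sum and evaluates it; the symbols a, b, and c and their values are arbitrary.

```r
summands <- rlang::syms(c("a", "b", "c"))

# Fold the symbols pairwise into one nested call
total <- purrr::reduce(
  summands,
  function(left, right) rlang::expr(!!left + !!right)
)
total
#> a + b + c

# Supply values for the symbols at evaluation time
eval(total, list(a = 1, b = 2, c = 3))
#> [1] 6
```

Printing the reduced expression before using it is a cheap way to confirm that the generated code is what was intended.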

Some specialised data structures are worth knowing about, although they have mostly been replaced

Pairlists:

  • Only seen when working with calls to the function keyword, where the formal arguments are stored as a pairlist
  • But can be treated as a regular list
f <- rlang::expr(function(x, y = 10) x + y)
f
#> function(x, y = 10) x + y
f[[1]]
#> `function`
f[[2]]
#> $x
#> 
#> 
#> $y
#> [1] 10
typeof(f[[1]])
#> [1] "symbol"
typeof(f[[2]])
#> [1] "pairlist"
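
The pairlist of formal arguments can be inspected with the usual list tools, and the captured call can be evaluated to obtain a working function. A minimal sketch (args and fn are illustrative names):

```r
f <- rlang::expr(function(x, y = 10) x + y)
args <- f[[2]]

is.pairlist(args)
#> [1] TRUE

# List-style access works on a pairlist
names(args)
#> [1] "x" "y"
args$y
#> [1] 10

# Evaluating the call to `function` creates the closure
fn <- eval(f)
fn(1)
#> [1] 11
```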
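
Missing arguments:

  • The empty symbol represents a missing argument (not a missing value)
  • You only need to care about it if you’re programmatically creating functions with missing arguments
  • Create it with rlang::missing_arg() and test for it with rlang::is_missing()
  • The ... argument is always associated with an empty symbol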
f <- rlang::expr(function(x, y = 10) x + y)
rlang::is_missing(
  f[[2]][[1]]
)
#> [1] TRUE
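
One way to create a function with a missing argument programmatically is sketched below. It assumes rlang::new_function and the rlang::exprs(x = ) shorthand for an argument without a default; the function name add is illustrative.

```r
# The empty symbol can be created and detected directly
rlang::is_missing(rlang::missing_arg())
#> [1] TRUE

# x has no default (empty symbol); y defaults to 10
add <- rlang::new_function(
  rlang::exprs(x = , y = 10),
  rlang::expr(x + y)
)

add(1)
#> [1] 11

# The formal argument x is stored as the empty symbol
rlang::is_missing(formals(add)$x)
#> [1] TRUE
```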

Expression vectors:

  • Only generated by base::expression and base::parse
  • Their “advantage” is that base::eval works across the elements, but this is more confusing than evaluating across a list of expressions
ex <- expression(x <- 4, x)
class(ex) # just one expression???
#> [1] "expression"
length(ex) # so an expression can be one thing and more than one thing?? Just give me a list of expressions and define an expression as one thing
#> [1] 2
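
The two approaches can be compared side by side. In this sketch both are evaluated in a fresh environment so that nothing leaks into the global workspace; env, last_value, exs, and results are illustrative names.

```r
env <- new.env()

# Expression vector: eval walks the elements and returns the last value
ex <- expression(x <- 4, x)
last_value <- eval(ex, env)
last_value
#> [1] 4

# List of expressions: evaluate each element explicitly
exs <- rlang::exprs(x <- 5, x)
results <- lapply(exs, eval, envir = env)
results[[2]]
#> [1] 5
```

With a list, the iteration is explicit and every intermediate result is available, which is one reason lists of expressions are usually easier to reason about.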