18.3 Expressions

Collectively, the data structures present in the AST are called expressions.
These include:
1. Constants
2. Symbols
3. Calls
4. Pairlists

18.3.1 Constants

Scalar constants are the simplest component of the AST.
A constant is either NULL or a length-1 atomic vector (a scalar), e.g., TRUE, 1L, 2.5, "x", or "hello".
We can test for a constant with rlang::is_syntactic_literal().
Constants are self-quoting in the sense that the expression used to represent a constant is the same constant:

identical(expr(TRUE), TRUE)
#> [1] TRUE
identical(expr(1), 1)
#> [1] TRUE
identical(expr(2L), 2L)
#> [1] TRUE
identical(expr("x"), "x")
#> [1] TRUE
identical(expr("hello"), "hello")
#> [1] TRUE
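
The test mentioned above can be sketched as follows. This is a minimal illustration, assuming rlang is installed. The names in the checks vector are labels chosen here for readability.

```r
library(rlang)

# Each entry tests one kind of object; the names are illustrative labels only.
checks <- c(
  null   = is_syntactic_literal(NULL),      # NULL counts as a constant
  double = is_syntactic_literal(2.5),       # length-1 atomic vector
  vector = is_syntactic_literal(c(1, 2)),   # length 2, so not a scalar
  symbol = is_syntactic_literal(expr(x))    # a symbol, not a constant
)
checks
```

Only the first two checks should come back TRUE, since a longer vector or a symbol is not a scalar constant.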

18.3.2 Symbols

A symbol represents the name of an object.
- x
- mtcars
- mean
In base R, the terms symbol and name are used interchangeably (i.e., is.name() is identical to is.symbol()), but this book uses symbol consistently because “name” has many other meanings.
You can create a symbol in two ways:
1. By capturing code that references an object with expr().
2. By turning a string into a symbol with rlang::sym().

expr(x)
#> x

sym("x")
#> x

A symbol can be turned back into a string with as.character() or rlang::as_string().
as_string() has the advantage of clearly signalling that you’ll get a character vector of length 1.

as_string(expr(x))
#> [1] "x"

We can recognize a symbol because it is printed without quotes:

expr(x)
#> x

str() tells you that it is a symbol, and is.symbol() is TRUE:

str(expr(x))
#>  symbol x

is.symbol(expr(x))
#> [1] TRUE

The symbol type is not vectorised, i.e., a symbol is always length 1.
If you want multiple symbols, you’ll need to put them in a list, using rlang::syms().
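
As a short sketch of the list approach (assuming rlang is installed; the variable name vars is chosen here for illustration):

```r
library(rlang)

# syms() takes a character vector and returns a list with one symbol per string
vars <- syms(c("x", "y", "z"))
str(vars)

# Every element of the list is an individual length-1 symbol
vapply(vars, is.symbol, logical(1))
```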

Note that as_string() will not work on expressions which are not symbols.

as_string(expr(x+y))
#> Error in `as_string()`:
#> ! Can't convert a call to a string.

18.3.3 Calls

A call object represents a captured function call.
Call objects are a special type of list.
- The first component specifies the function to call (usually a symbol, i.e., the name of the function).
- The remaining elements are the arguments for that call.
Call objects create branches in the AST, because calls can be nested inside other calls.
You can identify a call object when printed because it looks just like a function call.
Confusingly, typeof() and str() report “language” for call objects (where we might expect “call”), but is.call() returns TRUE:

lobstr::ast(read.table("important.csv", row.names = FALSE))
#> █─read.table 
#> ├─"important.csv" 
#> └─row.names = FALSE

x <- expr(read.table("important.csv", row.names = FALSE))

typeof(x)
#> [1] "language"

is.call(x)
#> [1] TRUE
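
The example above shows typeof() but not str(). The sketch below (assuming rlang is installed) adds str() and, for comparison, class(), which does report "call" for an ordinary function call like this one.

```r
library(rlang)

x <- expr(read.table("important.csv", row.names = FALSE))

str(x)     # describes the object as "language", followed by the call itself
class(x)   # "call", unlike typeof()
```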

18.3.4 Subsetting

Calls generally behave like lists.
Since they are list-like, you can use standard subsetting tools.
The first element of the call object is the function to call, which is usually a symbol:

x[[1]]
#> read.table

is.symbol(x[[1]])
#> [1] TRUE

The remainder of the elements are the arguments:

is.symbol(x[-1])
#> [1] FALSE
as.list(x[-1])
#> [[1]]
#> [1] "important.csv"
#> 
#> $row.names
#> [1] FALSE

We can extract individual arguments with [[ or, if named, $:

x[[2]]
#> [1] "important.csv"

x$row.names
#> [1] FALSE

We can determine the number of arguments in a call object by subtracting 1 from its length:

length(x) - 1
#> [1] 2

Extracting specific arguments from calls is challenging because of R’s flexible rules for argument matching:
- An argument could be in any position, and could be supplied with its full name, an abbreviated name, or no name.
To work around this problem, you can use rlang::call_standardise(), which standardises all arguments to use the full name:

rlang::call_standardise(x)
#> Warning: `call_standardise()` is deprecated as of rlang 0.4.11
#> This warning is displayed once every 8 hours.
#> read.table(file = "important.csv", row.names = FALSE)
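
Given the deprecation warning above, a non-deprecated route may be preferable. The sketch below assumes rlang 1.0.0 or later, where call_match() does the same job but takes the function definition as an explicit second argument. Base R's match.call() is shown alongside as an alternative. The names std and std_base are illustrative.

```r
library(rlang)

x <- expr(read.table("important.csv", row.names = FALSE))

# rlang >= 1.0.0: match the arguments against the definition of read.table
std <- call_match(x, read.table)
std

# Base R alternative: definition first, then the call to standardise
std_base <- match.call(read.table, x)
std_base
```

In both results the first argument should carry its full name, file.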

But if the function uses ..., it’s not possible to standardise all arguments.
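
A small sketch of this limitation, using base R's match.call() and a hypothetical function f defined here only for illustration:

```r
# f is a hypothetical function; only its signature matters here
f <- function(x, ...) NULL

# The first argument is matched to x, but anything absorbed by ...
# keeps whatever name (or lack of name) it was given in the call
matched <- match.call(f, quote(f(1, 2, y = 3)))
matched
names(matched)
```

The unnamed 2 stays unnamed because there is no formal argument to match it against.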
Calls can be modified in the same way as lists:

x$header <- TRUE
x
#> read.table("important.csv", row.names = FALSE, header = TRUE)
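
List-style modification also covers replacing the function position and dropping an argument. A minimal sketch (assuming rlang is installed), starting again from the original call:

```r
library(rlang)

x <- expr(read.table("important.csv", row.names = FALSE))

x[[1]] <- expr(read.csv)   # replace the function to call
x$row.names <- NULL        # assigning NULL removes the argument
x
```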

18.3.5 Function position

The first element of the call object is the function position. This contains the function that will be called when the object is evaluated, and is usually a symbol.

lobstr::ast(foo())
#> █─foo

While R allows you to surround the name of the function with quotes, the parser converts it to a symbol:

lobstr::ast("foo"())
#> █─foo

However, sometimes the function doesn’t exist in the current environment and you need to do some computation to retrieve it:
- For example, the function may be in another package, be a method of an R6 object, or be created by a function factory. In this case, the function position will be occupied by another call:

lobstr::ast(pkg::foo(1))
#> █─█─`::` 
#> │ ├─pkg 
#> │ └─foo 
#> └─1

lobstr::ast(obj$foo(1))
#> █─█─`$` 
#> │ ├─obj 
#> │ └─foo 
#> └─1

lobstr::ast(foo(1)(2))
#> █─█─foo 
#> │ └─1 
#> └─2
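
The same structure can be checked with ordinary subsetting. A sketch, assuming rlang is installed. pkg and foo are placeholders that are captured but never evaluated.

```r
library(rlang)

x <- expr(pkg::foo(1))

fn <- x[[1]]      # the function position
is.call(fn)       # it holds a call rather than a symbol
fn[[1]]           # the function of that inner call is `::`
as.list(fn[-1])   # its arguments are the symbols pkg and foo
```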

18.3.6 Constructing

You can construct a call object from its components using rlang::call2().
The first argument is the name of the function to call (either as a string, a symbol, or another call).
The remaining arguments will be passed along to the call:

call2("mean", x = expr(x), na.rm = TRUE)
#> mean(x = x, na.rm = TRUE)

call2(expr(base::mean), x = expr(x), na.rm = TRUE)
#> base::mean(x = x, na.rm = TRUE)

Infix calls created in this way still print as usual.

call2("<-", expr(x), 10)
#> x <- 10
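
Because a call can itself be an argument, call2() can also build nested expressions. A brief sketch, assuming rlang is installed (the name nested is illustrative):

```r
library(rlang)

# The inner call is passed as an argument of the outer call
nested <- call2("+", 1, call2("*", 2, 3))
nested
#> 1 + 2 * 3

# The constructed call is an ordinary expression and can be evaluated
eval(nested)
#> [1] 7
```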