Environments

Learning objectives

  • Create, modify, and inspect environments

  • Recognize special environments

  • Understand how environments power lexical scoping and namespaces

7.2 Environment basics

Environments are similar to lists

Generally, an environment is similar to a named list, with four important exceptions:

  • Every name must be unique.

  • The names in an environment are not ordered.

  • An environment has a parent.

  • Environments are not copied when modified.
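The last exception is the easiest to see side by side. The following is a minimal sketch; the names l1, l2, env_a, and env_b are illustrative and not part of the original material. It contrasts a list, which is copied when modified, with an environment, which is modified in place.

```r
# A list has copy-on-modify semantics: changing l2 leaves l1 untouched
l1 <- list(a = 1)
l2 <- l1
l2$a <- 2
l1$a
#> [1] 1

# An environment has reference semantics: env_a and env_b are the same object
env_a <- rlang::env(a = 1)
env_b <- env_a
env_b$a <- 2
env_a$a
#> [1] 2
```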

Create a new environment with {rlang}

e1 <- rlang::env(
  rlang::global_env(),
  a = FALSE,
  b = "a",
  c = 2.3,
  d = 1:3,
)
e2 <- rlang::new_environment(
  data = list(
    a = FALSE,
    b = "a",
    c = 2.3,
    d = 1:3
  ),
  parent = rlang::global_env()
)

An environment associates, or binds, a set of names to a set of values

  • A bag of names with no implied order

  • Bindings live within the environment

  • Environments have reference semantics and thus can contain themselves

e1$d <- e1

Inspect environments with {rlang}

rlang::env_print(e1)
#> <environment: 0x55e71594cdd0>
#> Parent: <environment: global>
#> Bindings:
#> • a: <lgl>
#> • b: <chr>
#> • c: <dbl>
#> • d: <env>
rlang::env_names(e1)
#> [1] "a" "b" "c" "d"
rlang::env_has(e1, "a")
#>    a 
#> TRUE
rlang::env_get(e1, "a")
#> [1] FALSE
rlang::env_parent(e1)
#> <environment: R_GlobalEnv>

By default, the current environment is your global environment

  • The current environment is where code is currently executing

  • The global environment is your current environment when working interactively

rlang::current_env()
#> <environment: R_GlobalEnv>
rlang::global_env()
#> <environment: R_GlobalEnv>
base::identical(
  rlang::current_env(),
  rlang::global_env()
)
#> [1] TRUE

Every environment has a parent environment

  • Allows for lexical scoping

e2a <- rlang::env(d = 4, e = 5)

e2b <- rlang::env(e2a, a = 1, b = 2, c = 3)

rlang::env_parent(e2b)
#> <environment: 0x55e71992c5f0>
rlang::env_parents(e2b)
#> [[1]]   <env: 0x55e71992c5f0>
#> [[2]] $ <env: global>

Only the empty environment does not have a parent

e2c <- rlang::env(rlang::empty_env(), d = 4, e = 5)

e2d <- rlang::env(e2c, a = 1, b = 2, c = 3)

All environments eventually terminate with the empty environment

rlang::env_parents(e2b, last = rlang::empty_env())
#>  [[1]]   <env: 0x55e71992c5f0>
#>  [[2]] $ <env: global>
#>  [[3]] $ <env: package:stats>
#>  [[4]] $ <env: package:graphics>
#>  [[5]] $ <env: package:grDevices>
#>  [[6]] $ <env: package:utils>
#>  [[7]] $ <env: package:datasets>
#>  [[8]] $ <env: package:methods>
#>  [[9]] $ <env: Autoloads>
#> [[10]] $ <env: package:base>
#> [[11]] $ <env: empty>

Be wary of using <<-

  • Regular assignment (<-) always creates or modifies a variable in the current environment

  • Super assignment (<<-) never creates a variable in the current environment. Instead, it:

    1. modifies the variable if one with that name is found in a parent environment

    2. creates the variable in the global environment if no existing variable is found
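Both cases can be sketched in a few lines. The names below (make_counter, set_flag, hypothetical_flag) are made up for illustration. The second case is the reason for caution: a typo in a variable name silently creates a new global.

```r
# Case 1: `count` already exists in a parent (enclosing) environment
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # modifies `count` in make_counter()'s environment
    count
  }
}
counter <- make_counter()
counter()
#> [1] 1
counter()
#> [1] 2

# Case 2: no variable with this name exists in any parent,
# so `<<-` creates it in the global environment
set_flag <- function() {
  hypothetical_flag <<- TRUE
  invisible(NULL)
}
set_flag()
exists("hypothetical_flag", envir = globalenv(), inherits = FALSE)
#> [1] TRUE
```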

Retrieve environment variables with $, [[, or {rlang} functions

e3 <- rlang::env(x = 1, y = 2)

e3$x
#> [1] 1
e3[["x"]]
#> [1] 1
rlang::env_get(e3, "x")
#> [1] 1
e3[[1]]
#> Error in e3[[1]]: wrong arguments for subsetting an environment
e3["x"]
#> Error in e3["x"]: object of type 'environment' is not subsettable
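The three accessors also differ in how they treat a name that is not bound. A short sketch, using a fresh illustrative environment e4 and a made-up name missing_name:

```r
e4 <- rlang::env(x = 1)

# `$` and `[[` silently return NULL when the binding does not exist
e4$missing_name
#> NULL
e4[["missing_name"]]
#> NULL

# rlang::env_get() signals an error instead ...
result <- tryCatch(
  rlang::env_get(e4, "missing_name"),
  error = function(cnd) "error raised"
)
result
#> [1] "error raised"

# ... unless a default value is supplied
rlang::env_get(e4, "missing_name", default = NA)
#> [1] NA
```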

Add or remove bindings with $, [[, or {rlang} functions

e3$z <- 3

e3[["z"]] <- 3

rlang::env_poke(e3, "z", 3)

rlang::env_bind(e3, z = 3, b = 20)

rlang::env_unbind(e3, "z")

Special cases for binding environment variables

  • rlang::env_bind_lazy() creates delayed bindings

    • evaluated the first time they are accessed

  • rlang::env_bind_active() creates active bindings

    • re-computed every time they’re accessed
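The difference between the two kinds of binding shows up when counting how often the underlying code runs. The sketch below is illustrative only: e5, calls, lazy_value, and active_value are made-up names, and the counters live in a helper environment so that the example does not depend on where the delayed expression is evaluated.

```r
e5 <- rlang::env()
calls <- rlang::env(lazy = 0, active = 0)  # counts how often each piece of code runs

# Delayed binding: the expression runs once, on first access
rlang::env_bind_lazy(e5, lazy_value = {
  calls$lazy <- calls$lazy + 1
  42
})
calls$lazy   # nothing has been evaluated yet
#> [1] 0
e5$lazy_value
#> [1] 42
e5$lazy_value
#> [1] 42
calls$lazy   # evaluated exactly once
#> [1] 1

# Active binding: the function runs on every access
rlang::env_bind_active(e5, active_value = function(val) {
  calls$active <- calls$active + 1
  calls$active
})
e5$active_value
#> [1] 1
e5$active_value
#> [1] 2
```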

7.3 Recursing over environments

Explore environments recursively

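where <- function(name, env = rlang::caller_env()) {
  if (identical(env, rlang::empty_env())) {
    # Base case: reached the empty environment without finding the name
    stop("Can't find ", name, call. = FALSE)
  } else if (rlang::env_has(env, name)) {
    # Success case: the name is bound in this environment
    env
  } else {
    # Recursive case: look in the parent environment
    where(name, rlang::env_parent(env))
  }
}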
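A short usage sketch may help show the three branches in action. The function is repeated here with explicit rlang:: prefixes so that the block runs on its own; the variable x and the name no_such_name are illustrative.

```r
where <- function(name, env = rlang::caller_env()) {
  if (identical(env, rlang::empty_env())) {
    stop("Can't find ", name, call. = FALSE)
  } else if (rlang::env_has(env, name)) {
    env
  } else {
    where(name, rlang::env_parent(env))
  }
}

# Success case in the starting environment
x <- 5
found_x <- where("x")
rlang::env_has(found_x, "x")

# Recursive case: mean() is found only after walking up the search path
where("mean")
#> <environment: base>

# Base case: the name is not bound anywhere
tryCatch(where("no_such_name"), error = function(cnd) conditionMessage(cnd))
#> [1] "Can't find no_such_name"
```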

7.4 Special environments

Attaching packages changes the search path

  • The search path is the order in which R will look through environments for objects

  • Each attached package becomes a parent of the global environment

  • The immediate parent of the global environment is the last package attached

Attaching packages changes the search path

rlang::search_envs()
#> [[1]] $ <env: global>
#> [[2]] $ <env: package:stats>
#> [[3]] $ <env: package:graphics>
#> [[4]] $ <env: package:grDevices>
#> [[5]] $ <env: package:utils>
#> [[6]] $ <env: package:datasets>
#> [[7]] $ <env: package:methods>
#> [[8]] $ <env: Autoloads>
#> [[9]] $ <env: package:base>
library(rlang)

rlang::search_envs()
#>  [[1]] $ <env: global>
#>  [[2]] $ <env: package:rlang>
#>  [[3]] $ <env: package:stats>
#>  [[4]] $ <env: package:graphics>
#>  [[5]] $ <env: package:grDevices>
#>  [[6]] $ <env: package:utils>
#>  [[7]] $ <env: package:datasets>
#>  [[8]] $ <env: package:methods>
#>  [[9]] $ <env: Autoloads>
#> [[10]] $ <env: package:base>

Functions enclose their current environment

  • A function encloses the environment that is current at the time the function is created

y <- 1

f <- function(x) x + y

rlang::fn_env(f)
#> <environment: R_GlobalEnv>

Functions enclose their current environment

  • g() is bound in the environment e, but it encloses the global environment

  • The function environment is the global environment, but the binding environment is e

e <- rlang::env()

e$g <- function() 1

rlang::fn_env(e$g)
#> <environment: R_GlobalEnv>

Namespaces ensure package environment independence

  • Every package has an underlying namespace

  • Every function in a package is associated with a package environment and a namespace environment

  • Package environments contain exported objects

  • Namespace environments contain exported and internal objects
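The split between the two environments can be checked directly. The sketch below uses {stats} as an example and assumes it is attached, as it is in a default R session; plot.lm() is picked as an example of an object that {stats} defines but does not export.

```r
# Package environment: what users see on the search path (exported objects only)
pkg <- rlang::pkg_env("stats")
# Namespace environment: what the package's own functions see
ns <- rlang::ns_env("stats")

# sd() is exported, so it is bound in both environments
rlang::env_has(pkg, "sd")
rlang::env_has(ns, "sd")

# plot.lm() is internal to {stats}, so only the namespace binds it
rlang::env_has(pkg, "plot.lm")
rlang::env_has(ns, "plot.lm")

# The function environment of sd() is the namespace, not the package environment
identical(rlang::fn_env(stats::sd), ns)
#> [1] TRUE
```

Because sd() encloses the namespace, the helpers it calls are looked up there first, regardless of what the user has attached or defined in the global environment.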

Functions use ephemeral execution environments

  • Each call to a function creates a new execution environment

  • The execution environment is a child of the function environment

  • Execution environments are usually garbage collected on function exit, unless something still holds a reference to them
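These points can be observed by having a function return its own execution environment. In this sketch, show_env, first_call, and second_call are illustrative names. Returning the environment also keeps a reference to it, which is what prevents it from being garbage collected here.

```r
show_env <- function() {
  rlang::current_env()  # this call's execution environment
}

first_call <- show_env()
second_call <- show_env()

# Each call gets a fresh execution environment
identical(first_call, second_call)
#> [1] FALSE

# The parent of the execution environment is the function environment
identical(rlang::env_parent(first_call), rlang::fn_env(show_env))
#> [1] TRUE
```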

7.5 Call stacks

The caller environment informs the call stack

  • The caller environment is the environment from which the function was called

  • Accessed with rlang::caller_env()

  • The call stack is made up of frames, one for each function call that is still executing

f <- function(x) {
  g(x = 2)
}
g <- function(x) {
  h(x = 3)
}
h <- function(x) {
  lobstr::cst()
}
f(x = 1)
#>     ▆
#>  1. └─global f(x = 1)
#>  2.   └─global g(x = 2)
#>  3.     └─global h(x = 3)
#>  4.       └─lobstr::cst()

The caller environment informs the call stack

  • The call stack is more complicated with lazy evaluation

a <- function(x) b(x)
b <- function(x) d(x)
d <- function(x) x

a(f())
#>     ▆
#>  1. ├─global a(f())
#>  2. │ └─global b(x)
#>  3. │   └─global d(x)
#>  4. └─global f()
#>  5.   └─global g(x = 2)
#>  6.     └─global h(x = 3)
#>  7.       └─lobstr::cst()

R uses lexical scoping, not dynamic scoping

R uses lexical scoping: it looks up the values of names based on how a function is defined, not how it is called. “Lexical” here is not the English adjective that means relating to words or a vocabulary. It’s a technical CS term that tells us that the scoping rules use a parse-time, rather than a run-time structure. (Quoted from Chapter 6, Functions.)

  • Dynamic scoping, by contrast, would mean that functions look up variables in the calling environment
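The contrast can be sketched by emulating dynamic lookup with rlang::caller_env(). R does not do this on its own; the function dynamic() below only imitates it for comparison, and all names (lexical, dynamic, wrapper, res) are illustrative.

```r
x <- "defined at top level"

# Ordinary R function: x is looked up where the function was defined
lexical <- function() x

# Emulation of dynamic scoping: x is looked up in the caller's environment
dynamic <- function() get("x", envir = rlang::caller_env())

wrapper <- function() {
  x <- "defined in wrapper()"
  list(lexical = lexical(), dynamic = dynamic())
}

res <- wrapper()
res$lexical
#> [1] "defined at top level"
res$dynamic
#> [1] "defined in wrapper()"
```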

7.6 Data structures

Environments are useful data structures

  • Use cases include:

    1. Avoiding copies of large data

    2. Managing state within a package

    3. As a hashmap
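As a rough illustration of the hashmap use case, the sketch below counts words with an environment whose parent is the empty environment, so lookups never fall through to other scopes. The helper count_word and the sample data are hypothetical. Because environments are modified in place, the helper does not need to return the updated map.

```r
words <- c("apple", "pear", "apple", "fig", "apple", "pear")

# new_environment() defaults to the empty environment as parent
counts <- rlang::new_environment()

count_word <- function(map, word) {
  seen <- rlang::env_get(map, word, default = 0)  # 0 if the key is new
  rlang::env_poke(map, word, seen + 1)            # modifies `map` in place
  invisible(map)
}

for (w in words) count_word(counts, w)

counts$apple
#> [1] 3
sort(rlang::env_names(counts))
#> [1] "apple" "fig"   "pear"
```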