Advanced R Chapter 7: Environments

Daryn Ramsden

thisisdaryn at gmail dot com

last updated: 2020-07-26

1 / 36

Recall from Chapter 6: R has 4 primary scoping rules

  • Name masking

    • names defined inside a function mask names outside the function

    • if the name can't be found, R looks one level up

  • Functions versus variables

    • if you use a name in a function call, objects that are not functions get ignored in the search
  • A fresh start

    • every time a function is called, a new environment gets created.
  • Dynamic lookup

    • R looks for values only when it needs them (when the function is run)
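The four rules can be seen in a few lines of base R. This is a minimal sketch; all function and variable names here are hypothetical and chosen only for illustration.

```r
# Name masking: the local x_outer hides the outer one inside the function.
x_outer <- 10
mask_demo <- function() {
  x_outer <- 20
  x_outer
}

# Functions versus variables: in a call, non-function objects are skipped.
add_100 <- function(v) v + 100
call_demo <- function() {
  add_100 <- 1        # a non-function with the same name
  add_100(add_100)    # the call finds the function, the argument finds 1
}

# A fresh start: each call gets a new environment, so nothing accumulates.
fresh_demo <- function() {
  if (!exists("n", envir = environment(), inherits = FALSE)) n <- 0
  n + 1
}

# Dynamic lookup: `rate` is looked up when the function runs, not when it is defined.
rate <- 1
lookup_demo <- function() rate * 2
before <- lookup_demo()
rate <- 5
after <- lookup_demo()
```

Here mask_demo() returns the inner value while x_outer itself is unchanged, and lookup_demo() gives a different answer once rate changes.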
2 / 36

The most important things you need to know

An environment's job is to bind a set of names to a set of values

Environments are different from lists in the following ways

  • Every name must be unique.

  • The names in an environment are not ordered.

  • An environment has a parent.

  • Environments are not copied when modified i.e. environments use reference semantics
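To make the reference-semantics point concrete, here is a small base R sketch (function names are hypothetical): the same modification is applied to a list and to an environment from inside a function.

```r
# Lists are copied when modified; environments are modified in place.
modify_list <- function(l) {
  l$a <- 2          # changes only the function's local copy
  invisible(NULL)
}
modify_env <- function(e) {
  e$a <- 2          # changes the one shared environment
  invisible(NULL)
}

l <- list(a = 1)
e <- new.env()
e$a <- 1

modify_list(l)
modify_env(e)

l$a   # still 1
e$a   # now 2
```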

3 / 36

7.2.1 Basics

  • You can create environments using

    • rlang::env or

    • new.env

  • View an environment using:

    • rlang::env_print: descriptive info about environment elements

    • rlang::env_names: returns a character vector of the binding names

    • names: the base R equivalent, which also returns the current binding names
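A base-R-only sketch of creating and inspecting an environment, for readers without rlang installed. The environment name e1 and its contents are made up for illustration.

```r
# Create an environment and add two bindings.
e1 <- new.env()
e1$a <- 1
e1$b <- "two"

# names() returns the binding names in no guaranteed order, so sort for display.
sort(names(e1))

# ls() returns sorted names and, by default, omits names starting with a dot.
ls(e1)
```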

4 / 36

Reading in some data

brewing_materials = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/brewing_materials.csv")
beer_taxed = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/beer_taxed.csv")
brewer_size = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/brewer_size.csv")
beer_states = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/beer_states.csv")
beer_reviews = readr::read_csv(here::here("data/beer_reviews.csv.gz"))
5 / 36

Creating another environment

library(rlang)  # provides env(), env_print(), and the other env_* helpers used in these slides

beer_env <- env(
  brewing_materials = brewing_materials,
  beer_taxed = beer_taxed
)
env_print(beer_env)
## <environment: 0000000013491090>
## parent: <environment: global>
## bindings:
## * brewing_materials: <tibble>
## * beer_taxed: <tibble>
6 / 36

7.2.2 Important Environments

Two key environments:

  • current environment: where code is currently executing

  • global environment aka your workspace

These are often the same.

identical will tell you if two environments are the same

identical(global_env(), current_env())
## [1] TRUE
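The two environments differ as soon as code runs inside a function. A minimal base R sketch (in_function is a hypothetical name; environment() and globalenv() are the base counterparts of current_env() and global_env()):

```r
# Inside a function, the current environment is the function's execution
# environment, not the global environment.
in_function <- function() {
  identical(environment(), globalenv())
}
in_function()   # FALSE
```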
7 / 36

7.2.3 Parents

Every environment has a parent:

  • this is where R looks next to find names that are not bound in the current environment

  • the sequence of environments ends with the empty environment

  • the parent can be set at creation time (the unnamed first argument of rlang::env); otherwise the current environment is used

  • you can get an environment's parent using rlang::env_parent or parent.env, and the full sequence of ancestors using rlang::env_parents

8 / 36

Creating a new environment with specified parent

beer_env2 <- env(beer_env,
  size = brewer_size,
  states = beer_states)
env_parent(beer_env2)
## <environment: 0x0000000013491090>
parent.env(beer_env2)
## <environment: 0x0000000013491090>
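The role of the parent in name lookup can be sketched in base R as follows. The names parent_env and child_env are hypothetical; new.env(parent = ...) plays the role of the parent argument to rlang::env.

```r
parent_env <- new.env()
parent_env$x <- "from parent"

# Create a child whose parent is set explicitly.
child_env <- new.env(parent = parent_env)

identical(parent.env(child_env), parent_env)        # TRUE

# x is not bound in the child itself ...
exists("x", envir = child_env, inherits = FALSE)    # FALSE

# ... but a normal lookup continues into the parent and finds it there.
get("x", envir = child_env)                         # "from parent"
```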
9 / 36

Getting the sequence of parents for an environment

You can get the sequence of parents of an environment using rlang::env_parents

env_parents(beer_env2)
## [[1]] <env: 0000000013491090>
## [[2]] $ <env: global>
env_parents(beer_env2, last = empty_env())
## [[1]] <env: 0000000013491090>
## [[2]] $ <env: global>
## [[3]] $ <env: package:rlang>
## [[4]] $ <env: package:xaringanthemer>
## [[5]] $ <env: package:stats>
## [[6]] $ <env: package:graphics>
## [[7]] $ <env: package:grDevices>
## [[8]] $ <env: package:utils>
## [[9]] $ <env: package:datasets>
## [[10]] $ <env: package:methods>
## [[11]] $ <env: Autoloads>
## [[12]] $ <env: package:base>
## [[13]] $ <env: empty>
10 / 36

A note about the Empty environment

It is empty because it contains no bindings; it is also the only environment without a parent

My previous (incorrect) mental model: an environment is contained within its parent

My new (correct?) mental model: an environment remembers where its parent lives

11 / 36

7.2.4 Super-assignment

<<- never creates a variable in the current environment: it modifies an existing variable found in a parent environment. If no such variable exists, it creates one in the global environment.

x <- 0
f <- function() {
  x <<- 1
}
f()
x
## [1] 1
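A common use of <<- is a counter that keeps state between calls. The sketch below uses hypothetical names (make_counter, set_flag, flag_created_by_superassign) and also shows the fallback to the global environment when no existing binding is found.

```r
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # modifies `count` in the enclosing environment
    count
  }
}
counter <- make_counter()
counter()   # 1
counter()   # 2

# With no existing binding in any parent, <<- creates one in the global environment.
set_flag <- function() {
  flag_created_by_superassign <<- TRUE
  invisible(NULL)
}
set_flag()
exists("flag_created_by_superassign", envir = globalenv(), inherits = FALSE)   # TRUE
```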
12 / 36

7.2.5 Getting and setting

  • $ and [[ work much as they do with lists

  • [[ cannot be used with numeric indices

  • [ does not work with environments

  • $ and [[ return NULL if the binding does not exist

  • binding a name to NULL does not remove it
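These rules can be checked directly in base R. A short sketch (the environment e and its bindings are made up; tryCatch is used only so the two failing calls do not stop the script):

```r
e <- new.env()
e$x <- 1
e[["y"]] <- 2

e$x          # 1
e[["y"]]     # 2

# A missing binding gives NULL rather than an error.
is.null(e$missing)                          # TRUE

# Binding NULL keeps the name in the environment.
e$x <- NULL
exists("x", envir = e, inherits = FALSE)    # TRUE

# Numeric [[ and single [ are both errors for environments.
numeric_index  <- tryCatch(e[[1]],  error = function(err) "error")
single_bracket <- tryCatch(e["x"], error = function(err) "error")
```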

13 / 36

Other functions

  • env_poke: adds a binding using a string and a value

  • env_bind: binds multiple values to a specified environment

  • env_has: determines if an environment contains a name (string input)

  • env_unbind: unbinds a given name (string input)
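For comparison, roughly the same operations can be done with base R. This sketch pairs each base call with the rlang function it approximates; the pairing is my reading, not something stated in the slides.

```r
e <- new.env()

assign("a", 1, envir = e)                            # cf. env_poke(e, "a", 1)
invisible(list2env(list(b = 2, c = 3), envir = e))   # cf. env_bind(e, b = 2, c = 3)

exists("a", envir = e, inherits = FALSE)             # cf. env_has(e, "a"): TRUE

rm("a", envir = e)                                   # cf. env_unbind(e, "a")
exists("a", envir = e, inherits = FALSE)             # FALSE
```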

14 / 36

7.2.6 Advanced bindings

Two exotic variants of rlang::env_bind

  1. Delayed bindings: evaluated the first time they are accessed

  2. Active bindings: recomputed each time they are accessed

15 / 36

Delayed binding example

env_bind_lazy(current_env(), b = {Sys.sleep(1); 1})
system.time(print(b))
## [1] 1
## user system elapsed
## 0 0 1
system.time(print(b))
## [1] 1
## user system elapsed
## 0 0 0

Both calls print the same value, but only the first access evaluates the expression (about 1 second); the result is cached, so the second access is immediate

16 / 36

Active binding example

env_bind_active(current_env(), z1 = function(val) runif(1))
z1
## [1] 0.008461749
z1
## [1] 0.9789811

Each access to z1 triggers a new call to runif
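Base R offers delayedAssign and makeActiveBinding for the same two ideas. The sketch below counts evaluations instead of timing them; all variable names are hypothetical.

```r
e <- new.env()

# Delayed binding: the expression runs once, on first access.
n_evaluations <- 0
delayedAssign(
  "lazy_value",
  { n_evaluations <- n_evaluations + 1; 42 },
  eval.env = environment(),
  assign.env = e
)
first  <- get("lazy_value", envir = e)   # evaluates the expression
second <- get("lazy_value", envir = e)   # reuses the cached value

# Active binding: the function runs on every access.
n_calls <- 0
makeActiveBinding(
  "active_value",
  function() { n_calls <<- n_calls + 1; n_calls },
  env = e
)
a1 <- get("active_value", envir = e)
a2 <- get("active_value", envir = e)
```

After this runs, n_evaluations is 1 even though lazy_value was read twice, while a1 and a2 differ because the active function was called for each read.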

17 / 36

7.3 Recursing Over Environments

How do we find which environment contains a name?

Recursive implementation

where <- function(name, env = caller_env()) {
  if (identical(env, empty_env())) {
    # Base case
    stop("Can't find ", name, call. = FALSE)
  } else if (env_has(env, name)) {
    # Success case
    env
  } else {
    # Recursive case
    where(name, env_parent(env))
  }
}
18 / 36

7.3 Recursing Over Environments

How do we find which environment contains a name? Iterative implementation

where2 <- function(in_name, env = caller_env()) {
  while (!identical(env, empty_env())) {
    if (env_has(env, in_name)) {
      return(env)
    }
    # inspect parent
    env <- env_parent(env)
  }
  # reached the empty environment without finding the name
  stop("Can't find ", in_name, call. = FALSE)
}
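To try the idea without rlang, here is a base R transcription of the recursive version with a few calls. The name where_base is hypothetical; parent.frame(), emptyenv(), exists(inherits = FALSE) and parent.env() stand in for the rlang helpers.

```r
where_base <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # Base case: ran out of environments
    stop("Can't find ", name, call. = FALSE)
  } else if (exists(name, envir = env, inherits = FALSE)) {
    # Success case
    env
  } else {
    # Recursive case: look in the parent
    where_base(name, parent.env(env))
  }
}

# A name bound in a specific environment is found there.
e <- new.env()
e$zz <- 1
identical(where_base("zz", e), e)

# A name that exists nowhere produces an error.
not_found <- tryCatch(where_base("no_such_name_xyz"), error = function(err) "error")
```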
19 / 36

7.4.1 Package environments and the search path

  • Every attached package becomes an ancestor of the global environment

  • the most recently attached package becomes the immediate parent of the global environment and has the previously attached package as its own parent

  • the search path is the sequence of environments from the global environment through all attached packages; its last entry, the base environment, has the empty environment as its parent

  • the last two environments on the search path are always the Autoloads and base environments
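The search path can be inspected with base R alone. A small sketch that checks the last two entries and walks the parent chain from the global environment (variable names are hypothetical):

```r
search_path <- search()
tail(search_path, 2)    # "Autoloads" "package:base"

# The base environment's parent is the empty environment.
identical(parent.env(baseenv()), emptyenv())

# Walking parents from the global environment visits every entry of search().
env <- globalenv()
steps <- 0
while (!identical(env, emptyenv())) {
  env <- parent.env(env)
  steps <- steps + 1
}
steps == length(search_path)
```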

20 / 36

7.4.2 The function environment

  • A function binds the current environment when it is created i.e. this becomes the environment that the function sees
  • A name is typically bound to a function when the function is created (anonymous functions are the exception)
  • The environment in which a name is bound to a function is not necessarily the environment that the function binds.

Example of a function binding the global environment while being bound to another environment

e <- env()
e$g <- function() 1
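The difference matters for name lookup. In this base R sketch (new.env() stands in for env(); g2 and y are hypothetical additions), the function is bound in e but still looks names up where it was created:

```r
e <- new.env()
e$g <- function() 1

# The name g is bound in e ...
exists("g", envir = e, inherits = FALSE)    # TRUE
# ... but the function's own environment is where it was created, not e.
identical(environment(e$g), e)              # FALSE

# Consequence: a function bound in e does not see e's other bindings.
e$y <- "in e"
y <- "where g2 was created"
e$g2 <- function() y
e$g2()                                      # "where g2 was created"
```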
21 / 36

Accessing the environment of a function

We can use rlang::fn_env or environment to access the environment of a function:

y <- 1
f <- function(x) x + y
fn_env(f)
## <environment: R_GlobalEnv>
environment(f)
## <environment: R_GlobalEnv>
22 / 36

7.4.3 Namespaces

Question: How do we avoid ambiguities caused by varying the order of attaching packages?

Short answer: functions in a package look names up through a different sequence of parents than the search path

Longer answer: Each package has two environments:

  1. The package environment: accessible to the outside world

  2. A namespace environment: internal to the package

    • all bindings in the package environment are also found here

    • may have a few extra names (internal objects that are not exported)

Names are bound to a function in both the package and namespace environments, but the function's own environment is the namespace environment, so that is where it looks up names

23 / 36

Detailed example of namespaces and parents of namespaces

Consider sd:

User calls the function via the name in the package environment but the function uses the names defined in the namespace environment.
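This can be checked for sd in base R. The sketch below deliberately masks var in the calling environment (an illustrative choice, since sd is computed from var) to show that sd is unaffected:

```r
sd_env <- environment(stats::sd)
isNamespace(sd_env)         # TRUE
environmentName(sd_env)     # "stats"

# Mask var where the user works; sd still finds stats::var via its namespace.
var <- function(x, ...) stop("this masking definition should never be called by sd")
sd_result <- stats::sd(c(1, 2, 3))
sd_result
```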

24 / 36

What are the parents of the namespace environment?

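  • an imports environment: bindings for all the functions the package imports; this is the namespace's immediate parent

  • the base namespace: the parent of the imports environment

  • the global environment: the parent of the base namespace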
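The chain can be walked in base R, here for the stats namespace (chosen only as an example; variable names are hypothetical):

```r
ns <- asNamespace("stats")

# First parent: the imports environment.
imports <- parent.env(ns)
environmentName(imports)

# Second parent: the base namespace.
is_base_ns <- identical(parent.env(imports), .BaseNamespaceEnv)

# Third parent: the global environment, which then continues along the search path.
is_global <- identical(parent.env(.BaseNamespaceEnv), globalenv())
```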

25 / 36

7.4.4 Execution environments

We know about the function environment but there's also the ...

execution environment

  • created fresh when the function is called

  • its parent is the function environment

  • is ephemeral and will disappear unless you explicitly do something to save it

The following function will return the same result all the time even if called repeatedly:

g <- function(x) {
  if (!env_has(current_env(), "a")) {
    message("Defining a")
    a <- 1
  } else {
    a <- a + 1
  }
  a
}
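The first two properties can be verified in base R by capturing the execution environment of two separate calls (get_exec_env is a hypothetical helper):

```r
get_exec_env <- function() environment()

env_call_1 <- get_exec_env()
env_call_2 <- get_exec_env()

# Each call gets its own fresh execution environment ...
identical(env_call_1, env_call_2)                               # FALSE

# ... whose parent is the function environment.
identical(parent.env(env_call_1), environment(get_exec_env))    # TRUE
```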
26 / 36

Preserving the execution environment

A few ways to preserve the execution environment

  • return it from the function

h2 <- function(x) {
  a <- x * 2
  current_env()
}

  • return an object with a binding to it e.g. a function created in the function will have the execution environment as its own function environment

plus <- function(x) {
  function(y) x + y
}
plus_one <- plus(1)
plus_one
## function(y) x + y
## <environment: 0x0000000012f2f160>
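Because each call to plus creates a new execution environment, every function it returns keeps its own x. A short sketch; the force(x) call and the name plus_ten are my additions for illustration:

```r
plus <- function(x) {
  force(x)              # evaluate x now so the closure captures its value
  function(y) x + y
}

plus_one <- plus(1)
plus_ten <- plus(10)

plus_one(5)   # 6
plus_ten(5)   # 15

# The preserved execution environment still holds x.
get("x", envir = environment(plus_one))   # 1
```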
27 / 36

7.5 Call Stacks

Another important environment, the caller environment:

  • the environment the function was called from

  • also where function values will be returned to

As functions can call each other, there can be multiple functions whose evaluation is in progress.

The sequence of calls in progress, each with its own caller environment, is the call stack.

Call stacks can be:

  • simple (linear)

  • not simple: they may have multiple branches
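In base R the caller environment is available through parent.frame() (rlang's equivalent is caller_env()). A minimal sketch with hypothetical function names:

```r
# Returns the environment this function was called from.
who_called <- function() parent.frame()

outer_fn <- function() {
  caller <- who_called()
  # The caller environment is outer_fn's own execution environment.
  identical(caller, environment())
}

outer_fn()   # TRUE
```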

28 / 36

7.5.1 Simple Call Stacks

An example of a simple call stack

f <- function(x) {
  g(x = 2)
}
g <- function(x) {
  h(x = 3)
}
h <- function(x) {
  stop()
}
f(x = 1)
## Error in h(x = 3):
traceback()
## No traceback available
29 / 36

Another example of a simple call stack

h <- function(x) {
  lobstr::cst()
}
f(x = 1)
##     x
##  1. \-global::f(x = 1)
##  2.   \-global::g(x = 2)
##  3.     \-global::h(x = 3)
##  4.       \-lobstr::cst()
30 / 36

7.5.2 Lazy Evaluation

Lazy evaluation can lead to multiple branches of the call stack

Example:

a <- function(x) b(x)
b <- function(x) c(x)
c <- function(x) x
a(f())
##     x
##  1. +-global::a(f())
##  2. | \-global::b(x)
##  3. |   \-global::c(x)
##  4. \-global::f()
##  5.   \-global::g(x = 2)
##  6.     \-global::h(x = 3)
##  7.       \-lobstr::cst()
31 / 36

Summarizing call stack on previous slide

We can see from the call stack that

  • evaluation of f() was deferred all the way down until c actually needed the value of x

  • the argument f() was then evaluated in the global environment, where it was written, which starts a second branch of the stack

  • that branch had to run f, g, h, and lobstr::cst

  • only then could control return to c
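The order of events can be recorded without lobstr. In this base R sketch every name (events, note, slow_value, a1, b1, c1) is hypothetical; each function logs when it starts, and the argument logs when it is finally evaluated.

```r
events <- character()
note <- function(msg) events <<- c(events, msg)

slow_value <- function() {
  note("slow_value evaluated")
  1
}
a1 <- function(x) { note("a1 entered"); b1(x) }
b1 <- function(x) { note("b1 entered"); c1(x) }
c1 <- function(x) { note("c1 entered"); x }   # x is first needed here

result <- a1(slow_value())
events
```

The argument is evaluated last, after all three functions have been entered.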

32 / 36

7.5.3 Frames

Each tier of the call stack is called a frame i.e. each function in progress corresponds to a frame of the stack.

Each frame is characterized by:

  1. An expression, expr describing the function call

  2. An environment, env

    • usually the execution environment

    • the environment of the global frame is the global environment

    • using eval also generates frames, and there the environment can be anything

  3. A parent, the previous call in the stack
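Base R exposes these pieces through sys.call() and environment(). A rough sketch with hypothetical function names, capturing a frame's expression, its environment, and the call of its parent frame:

```r
inner_call <- function() {
  this_call   <- sys.call()     # expression for the current frame
  caller_call <- sys.call(-1)   # expression for the previous frame in the stack
  list(expr = this_call, env = environment(), caller_expr = caller_call)
}
outer_call <- function() inner_call()

info <- outer_call()
deparse(info$expr)          # "inner_call()"
deparse(info$caller_expr)   # "outer_call()"
```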

33 / 36

7.5.4 A brief mention of Dynamic Scoping

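  • R uses lexical scoping: names are looked up in the environment where a function was defined, not in the call stack

  • Looking names up in the call stack instead is called dynamic scoping

  • It is mainly useful for functions that aid interactive data analysis, which we'll see when we get to Chapter 20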

34 / 36

7.6 As Data Structure

Some situations in which environments are useful

  1. Avoiding copies of large data sets (because they use reference semantics)

  2. Managing state within a package

  3. As a hashmap: lookup by name takes constant, O(1), time, so use an environment if you don't want to implement your own hash table
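As an illustration of the hashmap use, here is a small base R word count keyed by name. The data and variable names are made up for the example.

```r
counts <- new.env(hash = TRUE)
words <- c("ale", "stout", "ale", "lager", "ale")

for (w in words) {
  current <- counts[[w]]                                   # NULL if the key is absent
  counts[[w]] <- if (is.null(current)) 1 else current + 1
}

counts[["ale"]]   # 3
ls(counts)        # keys, sorted
```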
35 / 36

Quiz Questions

  • List at least three ways that an environment differs from a list.

  • What is the parent of the global environment?

  • What is the only environment that doesn’t have a parent?

  • What is the enclosing environment of a function? Why is it important?

  • How do you determine the environment from which a function was called?

  • How are <- and <<- different?

36 / 36
