Name masking
names defined inside a function mask names outside the function
if the name can't be found, R looks one level up
Functions versus variables
A fresh start
Dynamic lookup
Every name must be unique.
The names in an environment are not ordered.
An environment has a parent.
Environments are not copied when modified i.e. environments use reference semantics
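A minimal sketch of what reference semantics means in practice, assuming rlang is installed. The helper set_value is a hypothetical name used only for illustration: it assigns into whatever object it receives, and only the environment keeps the change.

```r
library(rlang)

# Hypothetical helper: sets `value` to 100 in whatever object it is given
set_value <- function(obj) {
  obj$value <- 100
  invisible(obj)
}

e <- env(value = 1)   # an environment
l <- list(value = 1)  # a list with the same content

set_value(e)  # modifies `e` in place (no copy is made)
set_value(l)  # modifies a local copy; `l` itself is untouched

e$value  # 100
l$value  # 1
```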
You can create environments using rlang::env or new.env
View an environment using:
rlang::env_print: descriptive info about environment elements
rlang::env_names: gives the names of the bindings
names: gives the names of the current bindings using base R
brewing_materials = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/brewing_materials.csv")
beer_taxed = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/beer_taxed.csv")
brewer_size = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/brewer_size.csv")
beer_states = readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/beer_states.csv")
beer_reviews = readr::read_csv(here::here("data/beer_reviews.csv.gz"))
beer_env <- env(
  brewing_materials = brewing_materials,
  beer_taxed = beer_taxed
)
env_print(beer_env)
## <environment: 0000000013491090>
## parent: <environment: global>
## bindings:
##  * brewing_materials: <tibble>
##  * beer_taxed: <tibble>
Two key environments:
current environment: where code is currently executing
global environment aka your workspace
These are often the same.
identical will tell you if two environments are the same
identical(global_env(), current_env())
## [1] TRUE
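The two environments stop being the same as soon as code runs inside a function. A small sketch, assuming rlang is installed and the code is run at the top level; f_env is a hypothetical name for illustration.

```r
library(rlang)

# Hypothetical function that returns the environment it is executing in
f_env <- function() current_env()

# Remember the environment this script itself is running in
here <- current_env()

# Inside the function, the current environment is a fresh execution
# environment, not the global environment
identical(f_env(), global_env())  # FALSE

# Its parent is the environment where f_env was defined
identical(env_parent(f_env()), here)  # TRUE
```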
Every environment has a parent:
this is where R looks next to find names that are not bound in the current environment
the sequence of environments ends with the empty environment
the parent can be set at the time of creation (first unnamed argument of rlang::env); otherwise the current environment is used
You can get an environment's parent using rlang::env_parent or parent.env, or the sequence of its ancestors using rlang::env_parents
beer_env2 <- env(beer_env, size = brewer_size, states = beer_states)
env_parent(beer_env2)
## <environment: 0x0000000013491090>
parent.env(beer_env2)
## <environment: 0x0000000013491090>
You can get the sequence of parents of an environment using rlang::env_parents
env_parents(beer_env2)
## [[1]]   <env: 0000000013491090>
## [[2]] $ <env: global>
env_parents(beer_env2, last = empty_env())
##  [[1]]   <env: 0000000013491090>
##  [[2]] $ <env: global>
##  [[3]] $ <env: package:rlang>
##  [[4]] $ <env: package:xaringanthemer>
##  [[5]] $ <env: package:stats>
##  [[6]] $ <env: package:graphics>
##  [[7]] $ <env: package:grDevices>
##  [[8]] $ <env: package:utils>
##  [[9]] $ <env: package:datasets>
## [[10]] $ <env: package:methods>
## [[11]] $ <env: Autoloads>
## [[12]] $ <env: package:base>
## [[13]] $ <env: empty>
The empty environment is called empty because it has no names (it is also the only environment without a parent)
My previous (incorrect) mental model: an environment is contained within its parent
My new (correct?) mental model: an environment remembers where its parent lives
<<- never creates a variable in the current environment: it looks for an existing variable in a parent environment and modifies it. If no existing variable is found, it creates one in the global environment.
x <- 0
f <- function() {
  x <<- 1
}
f()
x
## [1] 1
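Two further sketches of super assignment in base R. The names make_counter, set_flag and demo_flag are hypothetical and chosen only for illustration. The first shows <<- modifying a binding in an enclosing function environment; the second shows what happens when no existing binding is found anywhere.

```r
# <<- modifies `count` in the enclosing environment of the inner function
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}

counter <- make_counter()
counter()  # 1
counter()  # 2

# No binding for `demo_flag` exists in any parent, so <<- ends up
# creating it in the global environment
set_flag <- function() {
  demo_flag <<- TRUE
  invisible(NULL)
}

set_flag()
exists("demo_flag", envir = globalenv(), inherits = FALSE)  # TRUE
```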
$ and [[ work much as they do with lists
[[ cannot be used with numeric indices
[ does not work with environments
$ and [[ return NULL if the binding does not exist
binding a name to NULL does not remove it
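The points above can be checked with a short sketch, assuming rlang is installed. The failing calls are wrapped in try so the block runs to the end.

```r
library(rlang)

e <- env(a = 1)

e$a        # 1
e[["a"]]   # 1
e$missing  # NULL: no error for a name that is not bound

# Binding a name to NULL keeps the binding (unlike a list element)
e$a <- NULL
env_has(e, "a")  # TRUE

# Numeric indices with [[ and any use of [ are errors for environments
num_index  <- try(e[[1]], silent = TRUE)
single_sub <- try(e["a"], silent = TRUE)
inherits(num_index, "try-error")   # TRUE
inherits(single_sub, "try-error")  # TRUE
```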
env_poke: adds a binding using a string and a value
env_bind: binds multiple values to a specified environment
env_has: determines if an environment contains a name (string input)
env_unbind: unbinds a given name (string input)
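A brief sketch of the four helpers used together, assuming rlang is installed; the environment e and the binding names are arbitrary.

```r
library(rlang)

e <- env()

env_poke(e, "a", 1)        # one binding, name given as a string
env_bind(e, b = 2, c = 3)  # several bindings at once

env_has(e, c("a", "b", "z"))  # a: TRUE, b: TRUE, z: FALSE

env_unbind(e, "a")  # really removes the binding
env_has(e, "a")     # FALSE

sort(env_names(e))  # "b" "c"
```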
Two exotic variants of rlang::env_bind
Delayed bindings: evaluated the first time they are accessed
Active bindings: recomputed each time they are accessed
env_bind_lazy(current_env(), b = {Sys.sleep(1); 1})
system.time(print(b))
## [1] 1
##    user  system elapsed
##       0       0       1
system.time(print(b))
## [1] 1
##    user  system elapsed
##       0       0       0
both calls print the same value, but only the first access takes a second: the expression is evaluated once and the result is cached
env_bind_active(current_env(), z1 = function(val) runif(1))
z1
## [1] 0.008461749
z1
## [1] 0.9789811
Each access to z1 triggers a new call to runif
How do we find which environment contains a name?
Recursive implementation
where <- function(name, env = caller_env()) {
  if (identical(env, empty_env())) {
    # Base case
    stop("Can't find ", name, call. = FALSE)
  } else if (env_has(env, name)) {
    # Success case
    env
  } else {
    # Recursive case
    where(name, env_parent(env))
  }
}
How do we find which environment contains a name? Iterative implementation
where2 <- function(in_name, env = caller_env()) {
  while (!identical(env, empty_env())) {
    if (env_has(env, in_name)) {
      return(env)
    }
    # inspect parent
    env <- env_parent(env)
  }
  # not found: env is now the empty environment
  env
}
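A usage sketch for the recursive version, assuming rlang is installed and a default session in which the stats package is attached. The definition is repeated so the block runs on its own, and no_such_name_12345 is just a made-up name that should not exist.

```r
library(rlang)

# Recursive lookup, repeated from above so this block is self-contained
where <- function(name, env = caller_env()) {
  if (identical(env, empty_env())) {
    stop("Can't find ", name, call. = FALSE)
  } else if (env_has(env, name)) {
    env
  } else {
    where(name, env_parent(env))
  }
}

# Starting from the global environment, sd is found in a package environment
found <- where("sd", global_env())
environmentName(found)  # "package:stats" in a default session

# A name that is bound nowhere reaches the base case and raises an error
result <- tryCatch(
  where("no_such_name_12345", global_env()),
  error = function(e) conditionMessage(e)
)
result  # "Can't find no_such_name_12345"
```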
Every attached package becomes a parent of the global environment
the most recently-attached package becomes the immediate parent of the global environment and links to the previous parent as its own parent
the search path is the sequence of environments containing all attached packages and continuing to the empty environment
the last two environments on the search path are always the Autoloads and base environments
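The search path can be inspected with base R alone. The sketch below prints it with search() and then rebuilds it by walking parent links from the global environment; the variable names env and path are arbitrary.

```r
# Names of the environments on the search path
search()

# Rebuild the same sequence by following parents from the global environment
env <- globalenv()
path <- character()
while (!identical(env, emptyenv())) {
  path <- c(path, environmentName(env))
  env <- parent.env(env)
}
path

# The walk always ends with Autoloads and then the base environment
tail(path, 2)  # "Autoloads" "base"
```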
e <- env()
e$g <- function() 1
We can use rlang::fn_env or environment to access the environment of a function:
y <- 1
f <- function(x) x + y
fn_env(f)
## <environment: R_GlobalEnv>
environment(f)
## <environment: R_GlobalEnv>
Question: How do we avoid ambiguities caused by varying the order of attaching packages?
Short answer: packages have a different sequence of parents
Longer answer: Each package has two environments:
The package environment: accessible to the outside world
A namespace environment: internal to the package
all bindings in the package environment are also found here
may have a few extra names
names are bound to the function in both the package and namespace environments but the function specifically sees the namespace environment
Consider sd:
User calls the function via the name in the package environment but the function uses the names defined in the namespace environment.
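A small base R sketch of the sd case. It deliberately defines a misleading var in the global environment (a throwaway definition for illustration only) to show that sd still uses the var from its own namespace.

```r
# sd's function environment is the stats namespace, not the package environment
environment(sd)  # <environment: namespace:stats>

# A deliberately wrong `var`, defined in the global environment for illustration
var <- function(x, na.rm = FALSE) 100

# sd looks up var through its namespace, so the result is unaffected
sd_result <- sd(c(1, 2, 3))
sd_result  # 1

# Clean up the throwaway definition
rm(var)
```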
Every namespace environment has the same sequence of ancestors:
an imports environment: all the functions the package needs
the base namespace
the global environment
We know about the function environment but there's also the ...
execution environment
created fresh when the function is called
its parent is the function environment
is ephemeral and will disappear unless you explicitly do something to save it
The following function returns the same result every time it is called, because each call starts with a fresh execution environment in which a is not yet defined:
g <- function(x) {
  if (!env_has(current_env(), "a")) {
    message("Defining a")
    a <- 1
  } else {
    a <- a + 1
  }
  a
}
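Calling the function twice makes the point concrete. This sketch assumes rlang is installed and repeats the definition so it runs on its own.

```r
library(rlang)

# Same function as above, repeated so this block is self-contained
g <- function(x) {
  if (!env_has(current_env(), "a")) {
    message("Defining a")
    a <- 1
  } else {
    a <- a + 1
  }
  a
}

first  <- g(10)  # prints "Defining a"
second <- g(10)  # prints "Defining a" again: nothing survived the first call

first   # 1
second  # 1
```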
A few ways to preserve the execution environment
h2 <- function(x) {
  a <- x * 2
  current_env()
}
plus <- function(x) {
  function(y) x + y
}
plus_one <- plus(1)
plus_one
## function(y) x + y
## <environment: 0x0000000012f2f160>
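Both approaches keep the execution environment alive, which can be verified by looking inside it afterwards. A sketch assuming rlang is installed, with the two definitions repeated so the block is self-contained:

```r
library(rlang)

# Approach 1: return the execution environment explicitly
h2 <- function(x) {
  a <- x * 2
  current_env()
}

e <- h2(x = 10)
e$a  # 20
e$x  # 10

# Approach 2: return a function, which captures the execution environment
plus <- function(x) {
  function(y) x + y
}

plus_one <- plus(1)
fn_env(plus_one)$x  # 1
plus_one(2)         # 3
```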
Another important environment, the caller environment:
the environment the function was called from
also where function values will be returned to
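A minimal sketch of the caller environment, assuming rlang is installed. The function names who_called and outer_fn are hypothetical: who_called returns the environment it was called from, and outer_fn reads one of its own local variables back out of that environment.

```r
library(rlang)

# Hypothetical function that returns the environment it was called from
who_called <- function() caller_env()

outer_fn <- function() {
  marker <- "inside outer_fn"
  env <- who_called()  # the caller is outer_fn's execution environment
  env$marker
}

outer_fn()  # "inside outer_fn"
```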
As functions can call each other, there can be multiple functions whose evaluation is in progress.
The collection of these caller environments is the call stack.
Call stacks can be:
simple (linear)
not simple: they may have multiple branches
An example of a simple call stack
f <- function(x) {
  g(x = 2)
}
g <- function(x) {
  h(x = 3)
}
h <- function(x) {
  stop()
}
f(x = 1)
## Error in h(x = 3):
traceback()
## No traceback available
h <- function(x) {
  lobstr::cst()
}
f(x = 1)
##     x
##  1. \-global::f(x = 1)
##  2.   \-global::g(x = 2)
##  3.     \-global::h(x = 3)
##  4.       \-lobstr::cst()
Lazy evaluation can lead to multiple branches of the call stack
Example:
a <- function(x) b(x)
b <- function(x) c(x)
c <- function(x) x
a(f())
##     x
##  1. +-global::a(f())
##  2. | \-global::b(x)
##  3. |   \-global::c(x)
##  4. \-global::f()
##  5.   \-global::g(x = 2)
##  6.     \-global::h(x = 3)
##  7.       \-lobstr::cst()
We can see from the traceback that
the function calls avoided evaluating f() all the way down until c really needed the value
R then came back to the global environment to evaluate f()
to do so it had to evaluate g, h, and lobstr::cst first
it could then go back to c
Each tier of the call stack is called a frame i.e. each function in progress corresponds to a frame of the stack.
Each frame is characterized by:
An expression, expr, describing the function call
An environment, env:
usually the execution environment
the environment of the global frame is the global environment
using eval generates frames where the environment can be anything
A parent, the previous call in the stack
R does not use dynamic scoping (looking up variables in the call stack rather than in the enclosing environment)
But maybe it does somewhere
We'll see when we get to Chapter 20
Some situations where environments are useful
Avoiding copies of large data sets (because they use reference semantics)
Managing state within a package
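One common shape for package-level state, sketched in base R. The names pkg_state, get_a and set_a are hypothetical: the environment is created once, and small accessor functions read and update it without the state ever being copied.

```r
# A private environment holding the state; the empty environment as parent
# keeps lookups from leaking into other environments
pkg_state <- new.env(parent = emptyenv())
pkg_state$a <- 1

# Hypothetical accessor: read the current value
get_a <- function() {
  pkg_state$a
}

# Hypothetical setter: update the value and invisibly return the old one
set_a <- function(value) {
  old <- pkg_state$a
  pkg_state$a <- value
  invisible(old)
}

get_a()              # 1
previous <- set_a(5)
previous             # 1
get_a()              # 5
```

Returning the old value invisibly makes it easy to restore the previous state later, for example with on.exit.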
List at least three ways that an environment differs from a list.
What is the parent of the global environment?
What is the only environment that doesn’t have a parent?
What is the enclosing environment of a function? Why is it important?
How do you determine the environment from which a function was called?
How are <- and <<- different?