Advanced R by Hadley Wickham

Chapter 2: Names and Values

Asmae Toumi

@asmae_toumi

2020-07-26

What's in Chapter 2:

  • Section 2.2: distinction between names and values

  • Section 2.3: describes when R makes a copy

  • Section 2.4: explores how much memory an object occupies

  • Section 2.5: describes the two important exceptions to copy-on-modify

  • Section 2.6: concludes the chapter with a discussion of the garbage collector

Prerequisites

To see how R represents objects, we'll use the lobstr package (install it once with install.packages("lobstr")):

library(lobstr)
# We also install emo because it's fun!
# devtools::install_github("hadley/emo")

Binding basics

How would you read the following?

x <- c(1, 2, 3)
  • Create an object named ‘x’, containing the values 1, 2, and 3: 💩

  • It creates an object, a vector of values (1, 2, 3), and binds that object to a name, x: 😄

Copy-on-modify

x <- c(1, 2, 3)
x
## [1] 1 2 3

How can we see what's happening under the hood?

You can call obj_addr() to see this object's identifier (its memory address):

obj_addr(x)
## [1] "0x125d01a8"
y <- x
obj_addr(y)
## [1] "0x125d01a8"

What happens to x when you modify y?

y[[3]] <- 4
x
## [1] 1 2 3
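  • Changing y did not modify x.
  • This is due to a behavior called copy-on-modify: R copied the vector before changing it, so y is now bound to a new object while x is still bound to the original.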
obj_addr(x)
## [1] "0x125d01a8"
obj_addr(y)
## [1] "0x13293038"

What about functions?

The same copy-on-modify behavior applies to function arguments.

We can use tracemem() to track when an object gets copied. It returns the object's current address, and from then on, every time that object is copied, R prints a message showing the old and new addresses.

f <- function(a) {
  a  # return the argument unchanged
}
x <- c(1, 2, 3)
cat(tracemem(x), "\n")  # start tracing x and print its address
## <000000001E0C8120>
z <- f(x)

We got no message here, which means no new copy was generated.

If f() modified its argument a, R would have to copy the object first, and tracemem() would print a message.
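
As a rough sketch of that last point, the hypothetical function g() below (not from the slides) modifies its argument and uses lobstr::obj_addr() to compare the address before and after the change. It assumes lobstr is installed.

```r
library(lobstr)

# g() is a hypothetical helper for illustration: it changes its argument
# and reports whether that change forced R to make a copy.
g <- function(a) {
  before <- obj_addr(a)  # address of the object that was passed in
  a[[1]] <- 10           # modifying a triggers copy-on-modify
  after <- obj_addr(a)   # address of the modified object
  list(value = a, copied = before != after)
}

x <- c(1, 2, 3)
res <- g(x)

res$copied  # TRUE: a was copied inside g()
x           # x is unchanged: 1 2 3
```

The copy happens inside g(), so the caller's x never sees the modification.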

Lists

Like vectors, lists also use copy-on-modify behaviour.

list_1 <- list(1, 2, 3)
list_2 <- list_1
obj_addr(list_1)
## [1] "0x121f6fc0"
obj_addr(list_2)
## [1] "0x121f6fc0"
list_2[[3]] <- 4
obj_addr(list_2)
## [1] "0x1e482598"

Lists (continued)

We can use lobstr::ref() to print the memory address of each object along with a local ID so that we can easily cross-reference shared components.

ref(list_1, list_2)
## o [1:0x121f6fc0] <list>
## +-[2:0x1db8e278] <dbl>
## +-[3:0x1db8e240] <dbl>
## \-[4:0x1db8e208] <dbl>
##
## o [5:0x1e482598] <list>
## +-[2:0x1db8e278]
## +-[3:0x1db8e240]
## \-[6:0x1e289378] <dbl>

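This shows that list_1 and list_2 have shared components: the references with local IDs 2 and 3, which are the first and second elements of both lists. Only the third element differs, because modifying a list makes a shallow copy: the list itself is copied, but the values it points to are not.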
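
The same sharing can be checked programmatically. This is a minimal sketch, assuming lobstr is installed; it uses lobstr::obj_addrs(), which returns the address of each element of a list, and the variable name shared is only illustrative.

```r
library(lobstr)

list_1 <- list(1, 2, 3)
list_2 <- list_1
list_2[[3]] <- 4

# Compare the element addresses of the two lists position by position.
shared <- obj_addrs(list_1) == obj_addrs(list_2)
shared  # TRUE TRUE FALSE
```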

Data Frames

Data frames are lists of vectors.

  • If you modify a column:

    • only that column needs to be modified

    • the other columns still point to their original references

  • If you modify a row:

    • every column is modified

    • so every column must be copied
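
A small sketch of both cases, assuming lobstr is installed. Because a data frame is a list of columns, lobstr::obj_addrs() gives one address per column; the names d1, d2, d3, col_shared, and row_shared are made up for this example.

```r
library(lobstr)

d1 <- data.frame(x = c(1, 5, 6), y = c(2, 4, 3))

# Modify one column: only that column should get a new address.
d2 <- d1
d2[, 2] <- d2[, 2] * 2
col_shared <- obj_addrs(d1) == obj_addrs(d2)
col_shared  # TRUE FALSE: x is still shared, y was replaced

# Modify one row: every column is touched, so every column is copied.
d3 <- d1
d3[1, ] <- d3[1, ] * 3
row_shared <- obj_addrs(d1) == obj_addrs(d3)
row_shared  # FALSE FALSE
```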

Character Vectors

Consider this character vector:

x <- c("marco", "polo", "marco", "polo")
ref(x, character = TRUE)
## o [1:0x13428ae0] <chr>
## +-[2:0x137efa98] <string: "marco">
## +-[3:0x137ef9f0] <string: "polo">
## +-[2:0x137efa98]
## \-[3:0x137ef9f0]

R stores strings in a global string pool: each element of a character vector is a pointer to a unique string in the pool, so repeated strings such as "marco" are stored only once. This has implications for how much memory a character vector uses. To find out, use lobstr::obj_size().
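
To make the memory implication concrete, here is a short sketch, assuming lobstr is installed. The "banana" vector and the names small and big are illustrative; exact byte counts vary by platform, so the code only compares sizes.

```r
library(lobstr)

x <- c("marco", "polo", "marco", "polo")

# Four elements, but only two distinct strings in the pool.
n_unique <- length(unique(obj_addrs(x)))
n_unique  # 2

# Repeating a string 100 times costs far less than 100 separate strings,
# because every element points to the same pooled string.
small <- obj_size("banana")
big <- obj_size(rep("banana", 100))
as.numeric(big) < 100 * as.numeric(small)  # TRUE
```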

Modify-in-place (1)

Modifying an R object usually creates a copy. Exceptions:

  • Objects with a single binding are modified in place as a performance optimisation
  • Environments, a special type of object, are always modified in place (more on this in Chapter 7)
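
The environment case is easy to see with base R alone. In this sketch the names e1 and e2 are illustrative: both are bound to the same environment, so a change made through one name is visible through the other.

```r
e1 <- new.env()
e1$a <- 1

e2 <- e1      # a second name bound to the same environment
e2$a <- 2     # modified in place, no copy is made

e1$a               # 2: the change is visible through e1 as well
identical(e1, e2)  # TRUE: still one and the same environment
```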

Modify-in-place (2)

Implication: because environments are modified in place, you can create functions that “remember” their previous state (more on this in Chapter 16)
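
One common way to sketch this idea is a counter built from a closure. The names make_counter and counter are hypothetical and not from the slides; the state lives in the environment created by each call to make_counter(), which is updated in place with <<-.

```r
# make_counter() returns a function that keeps its own running count.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # update count in the enclosing environment
    count
  }
}

counter <- make_counter()
first <- counter()
second <- counter()

first   # 1
second  # 2
```

Each call to make_counter() creates a fresh environment, so separate counters do not interfere with each other.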

Unbinding / Garbage collector

  • Objects that are no longer bound to any name get deleted by the garbage collector (GC)
  • GC frees up memory by deleting R objects that are no longer used
  • GC runs automatically whenever R needs more memory to create a new object
  • There is no reason to call gc() yourself unless you want to:
    • ask R to return memory to your operating system so other programs can use it, or
    • find out how much memory is currently being used
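
A minimal base R sketch of unbinding followed by an explicit collection. The vector size and the name usage are arbitrary choices for illustration; in normal code the gc() call is unnecessary.

```r
x <- numeric(1e6)  # allocate a vector of one million doubles
rm(x)              # unbind the name; the vector is now unreachable

# Force a collection. gc() returns a matrix of memory statistics
# with one row for Ncells and one for Vcells.
usage <- gc()
rownames(usage)    # "Ncells" "Vcells"
exists("x")        # FALSE: the name x is gone
```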