R6

Learning objectives:

  • Discuss how to construct an R6 class.
  • Review the different mechanisms of an R6 class (e.g. initialization, printing, and public, private, and active fields and methods).
  • Observe various examples using R6’s mechanisms to create R6 classes, objects, fields, and methods.
  • Observe the consequences of R6’s reference semantics.
  • Review the book’s arguments on the use of R6 over reference classes.

A review of OOP

  • A PIE: Abstraction, Polymorphism, Inheritance, Encapsulation.

Introducing R6

  • R6 classes are not built into base R.
    • R6 is a separate package.
    • You have to install and attach it before use.
    • If a package uses R6 objects, R6 needs to be listed as a dependency in the DESCRIPTION file.
install.packages("R6")
library(R6)
  • R6 classes have two special properties:
    1. They use an encapsulated OOP paradigm.
      • Methods belong to objects, not generics.
      • Methods are called as object$method() and fields are accessed as object$field.
    2. R6 objects are mutable.
      • They are modified in place.
      • They follow reference semantics.
  • R6 is similar to OOP in other languages.
  • However, its use can lead to non-idiomatic R code.
    • Tradeoff: it follows a familiar OOP paradigm but sacrifices what R users are used to.
    • Microsoft365R.

Constructing an R6 class, the basics

  • Really simple to do, just use the R6::R6Class() function.
Accumulator <- R6Class("Accumulator", list(
  sum = 0,
  add = function(x = 1) {
    self$sum <- self$sum + x
    invisible(self)
  }
))
  • Two important arguments:
    1. classname - A string used to name the class (not needed but suggested)
    2. public - A list of methods (functions) and fields (anything else)
  • Suggested style conventions to follow:
    • Class name should follow UpperCamelCase.
    • Methods and fields should use snake_case.
    • Always assign the result of R6Class() to a variable with the same name as the class.
  • You can use self$ to access methods and fields of the current object.

Constructing an R6 object

  • Just use $new()
x <- Accumulator$new()
x$add(4)
x$sum
#> [1] 4

R6 objects and method chaining

  • All side-effect R6 methods should return self invisibly.
  • This allows for method chaining.
x$add(10)$add(10)$sum
#> [1] 24
  • To improve readability:
# Method chaining
x$
  add(10)$
  add(10)$
  sum
#> [1] 44

R6 useful methods

  • $print() - Overrides the default printing behaviour.
    • $print() should always return invisible(self).
  • $initialize() - Overrides the default behaviour of $new().
    • It also provides a place to validate inputs.

Constructing a bank account class

BankAccount <- R6Class("BankAccount", list(
  owner = NULL,
  type = NULL,
  balance = 0,
  initialize = function(owner, type) {
    stopifnot(is.character(owner), length(owner) == 1)
    stopifnot(is.character(type), length(type) == 1)

    # Store the validated inputs on the object
    self$owner <- owner
    self$type <- type
  },
  deposit = function(amount) {
    self$balance <- self$balance + amount
    invisible(self)
  },
  withdraw = function(amount) {
    self$balance <- self$balance - amount
    invisible(self)
  }
))

Simple transactions

collinsavings <- BankAccount$new("Collin", type = "Savings")
collinsavings$deposit(10)
collinsavings
#> <BankAccount>
#>   Public:
#>     balance: 10
#>     clone: function (deep = FALSE) 
#>     deposit: function (amount) 
#>     initialize: function (owner, type) 
#>     owner: Collin
#>     type: Savings
#>     withdraw: function (amount)
collinsavings$withdraw(10)
collinsavings
#> <BankAccount>
#>   Public:
#>     balance: 0
#>     clone: function (deep = FALSE) 
#>     deposit: function (amount) 
#>     initialize: function (owner, type) 
#>     owner: Collin
#>     type: Savings
#>     withdraw: function (amount)

Modifying the $print() method

BankAccount <- R6Class("BankAccount", list(
  owner = NULL,
  type = NULL,
  balance = 0,
  initialize = function(owner, type) {
    stopifnot(is.character(owner), length(owner) == 1)
    stopifnot(is.character(type), length(type) == 1)

    self$owner <- owner
    self$type <- type
  },
  deposit = function(amount) {
    self$balance <- self$balance + amount
    invisible(self)
  },
  withdraw = function(amount) {
    self$balance <- self$balance - amount
    invisible(self)
  },
  print = function(...) {
    cat("Account owner: ", self$owner, "\n", sep = "")
    cat("Account type: ", self$type, "\n", sep = "")
    cat("  Balance: ", self$balance, "\n", sep = "")
    invisible(self)
  }
))
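  • Important point: methods are bound to individual objects when the objects are created.
    • An object created from the earlier class definition keeps the default print method; only new objects pick up the custom $print().
    • Reference semantics vs. copy-on-modify.
# Created before the class was redefined: still prints with the default method
collinsavings

# Created after the class was redefined: prints with the custom method
hadleychecking <- BankAccount$new("Hadley", type = "Checking")

hadleychecking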

How does this work?

Adding methods after class creation

  • Use $set() to add fields and methods to a class after it has been created.
  • Keep in mind that elements added with $set() are only available to objects created afterwards.
Accumulator <- R6Class("Accumulator")
Accumulator$set("public", "sum", 0)
Accumulator$set("public", "add", function(x = 1) {
  self$sum <- self$sum + x
  invisible(self)
})
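  • A minimal sketch of the "new objects only" point, assuming a hypothetical Counter class (not part of the notes above):

```r
library(R6)

# Hypothetical class used only for this illustration
Counter <- R6Class("Counter", list(count = 0))

before <- Counter$new()  # created before the method exists

Counter$set("public", "increment", function() {
  self$count <- self$count + 1
  invisible(self)
})

after <- Counter$new()   # created after $set()

"increment" %in% names(before)
#> [1] FALSE
"increment" %in% names(after)
#> [1] TRUE
after$increment()$count
#> [1] 1
```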

Inheritance

  • To inherit behaviour from an existing class, provide the class object via the inherit argument.
  • The class below also serves as a running example for debugging an R6 class.
BankAccountOverDraft <- R6Class("BankAccountOverDraft",
  inherit = BankAccount,
  public = list(
    withdraw = function(amount) {
      if ((self$balance - amount) < 0) {
        stop("Overdraft")
      }
      # Once the check passes, update the balance
      # (delegating with super$withdraw(amount) would also work)
      self$balance <- self$balance - amount
      invisible(self)
    }
  )
)
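  • A hedged sketch of the same idea that delegates to the parent method through super$, assuming hypothetical Account and GuardedAccount classes rather than the full BankAccount above:

```r
library(R6)

# Hypothetical minimal parent class
Account <- R6Class("Account", list(
  balance = 0,
  withdraw = function(amount) {
    self$balance <- self$balance - amount
    invisible(self)
  }
))

# Child class: adds a check, then reuses the parent's withdraw()
GuardedAccount <- R6Class("GuardedAccount",
  inherit = Account,
  public = list(
    withdraw = function(amount) {
      if (self$balance - amount < 0) {
        stop("Overdraft", call. = FALSE)
      }
      super$withdraw(amount)
    }
  )
)

acct <- GuardedAccount$new()
acct$balance <- 20
acct$withdraw(15)$balance
#> [1] 5

# The guard fires before the parent method is reached
msg <- tryCatch(acct$withdraw(10), error = function(e) conditionMessage(e))
msg
#> [1] "Overdraft"
```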

Future instances debugging

BankAccountOverDraft$debug("withdraw")
x <- BankAccountOverDraft$new("x", type = "Savings")
x$withdraw(20)

# Turn debugging off
BankAccountOverDraft$undebug("withdraw")

Individual object debugging

  • Use the debug() function.
x <- BankAccountOverDraft$new("x", type = "Savings")
# Turn on debugging
debug(x$withdraw)
x$withdraw(10)

# Turn off debugging
undebug(x$withdraw)
x$withdraw(5)

Test out our debugged class

collinsavings <- BankAccountOverDraft$new("Collin", type = "Savings")
collinsavings
# Errors with "Overdraft": the starting balance is 0
collinsavings$withdraw(10)
collinsavings
collinsavings$deposit(5)
collinsavings
# Succeeds: the balance goes from 5 back to 0
collinsavings$withdraw(5)

Introspection

  • Every R6 object has an S3 class that reflects its hierarchy of R6 classes.
  • Use the class() function to determine class (and all classes it inherits from).
class(collinsavings)
  • You can also list all methods and fields of an R6 object with names().
names(collinsavings)
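  • A self-contained sketch of the same two calls, assuming hypothetical Animal and Dog classes:

```r
library(R6)

# Hypothetical two-level hierarchy
Animal <- R6Class("Animal", list(name = NULL))
Dog <- R6Class("Dog",
  inherit = Animal,
  public = list(speak = function() "Woof")
)

d <- Dog$new()

# The S3 class vector lists the class, its parents, then "R6"
identical(class(d), c("Dog", "Animal", "R6"))
#> [1] TRUE
inherits(d, "Animal")
#> [1] TRUE

# names() includes inherited fields as well as the object's own methods
c("name", "speak") %in% names(d)
#> [1] TRUE TRUE
```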

Controlling access

  • R6Class() has two other arguments for controlling access:
    • private - create fields and methods only available from within the class.
    • active - allows you to use accessor functions to define dynamic or active fields.

Privacy

  • Private fields and methods - elements that can only be accessed from within the class, not from the outside.
  • We need to know two things to use private elements:
    1. private’s interface is just like public’s interface.
      • List of methods (functions) and fields (everything else).
    2. Inside methods, you refer to private elements with private$ instead of self$.
      • You cannot access private fields or methods from outside the class.
  • Why might you want to keep your methods and fields private?
    • You’ll want to be clear what is ok for others to access, especially if you have a complex system of classes.
    • It’s easier to refactor private fields and methods, as you know others are not relying on them.
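  • A minimal sketch of a private field, assuming a hypothetical Vault class; the PIN is reachable through private$ inside methods but not from outside:

```r
library(R6)

# Hypothetical class for illustration only
Vault <- R6Class("Vault",
  public = list(
    initialize = function(pin) {
      private$pin <- pin
    },
    check = function(attempt) {
      identical(attempt, private$pin)
    }
  ),
  private = list(
    pin = NULL
  )
)

v <- Vault$new("1234")
v$check("1234")
#> [1] TRUE

# From the outside, the private field is simply not there
v$pin
#> NULL
"pin" %in% names(v)
#> [1] FALSE
```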

Active fields

  • Active fields allow you to define components that look like fields from the outside, but are defined with functions, like methods.
  • Implemented using active bindings.
  • Each active binding is a function that takes a single argument, value. If value is missing, the field is being read; otherwise it is being assigned.
  • Great when used in conjunction with private fields.
    • This allows for additional checks.
    • For example, we can use them to make a read-only field and to validate inputs.
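  • A hedged sketch of both uses, assuming a hypothetical Person class with private fields named .name and .id:

```r
library(R6)

# Hypothetical class for illustration only
Person <- R6Class("Person",
  public = list(
    initialize = function(name, id) {
      private$.name <- name
      private$.id <- id
    }
  ),
  private = list(.name = NULL, .id = NULL),
  active = list(
    # Read-only field: any assignment is rejected
    id = function(value) {
      if (missing(value)) {
        private$.id
      } else {
        stop("`$id` is read only", call. = FALSE)
      }
    },
    # Validated field: the new value is checked before it is stored
    name = function(value) {
      if (missing(value)) {
        private$.name
      } else {
        stopifnot(is.character(value), length(value) == 1)
        private$.name <- value
        invisible(self)
      }
    }
  )
)

p <- Person$new("Collin", id = 1L)
p$name <- "Hadley"
p$name
#> [1] "Hadley"

# Assigning to the read-only field raises an error
msg <- tryCatch({ p$id <- 2L; "assigned" }, error = function(e) conditionMessage(e))
msg
#> [1] "`$id` is read only"
```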

Adding a read-only bank account number

BankAccount <- R6Class("BankAccount", public = list(
  owner = NULL,
  type = NULL,
  balance = 0,
  initialize = function(owner, type, acct_num = NULL) {
    private$acct_num <- acct_num
    self$owner <- owner
    self$type <- type
  },
  deposit = function(amount) {
    self$balance <- self$balance + amount
    invisible(self)
  },
  withdraw = function(amount) {
    self$balance <- self$balance - amount
    invisible(self)
  },
  print = function(...) {
    cat("Account owner: ", self$owner, "\n", sep = "")
    cat("Account type: ", self$type, "\n", sep = "")
    cat("Account #: ", private$acct_num, "\n", sep = "")
    cat("  Balance: ", self$balance, "\n", sep = "")
    invisible(self)
  }
  ),
  private = list(
    acct_num = NULL
  ),
  active = list(
    create_acct_num = function(value) {
      if (is.null(private$acct_num)) {
        private$acct_num <- ids::uuid()
      } else {
        stop("`$acct_num` already assigned")
      }
    }
  )
)
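# Note: ids::uuid() requires the ids package to be installed
collinsavings <- BankAccount$new("Collin", type = "Savings")
# Accessing the active field generates and stores the account number
collinsavings$create_acct_num
# Stops because the account number is already assigned
# (the error is raised by the access itself, before the trailing () is evaluated)
collinsavings$create_acct_num()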
collinsavings$print()

How does an active field work?

  • Not sold on this, as I don’t know if active gets its own environment.
    • Any ideas?
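  • One way to build intuition, offered as a sketch rather than a description of R6 internals: base R's makeActiveBinding() attaches a function to a name in an environment, and that function runs on every read or write. The names e and stored below are made up for the illustration, and this does not by itself answer the environment question for R6.

```r
# Base R only; `e` and `stored` are hypothetical names for this sketch
e <- new.env()
stored <- NULL

makeActiveBinding("acct_num", function(value) {
  if (missing(value)) {
    stored            # called with no argument on read: e$acct_num
  } else {
    stored <<- value  # called with one argument on write: e$acct_num <- ...
  }
}, e)

e$acct_num <- "abc-123"
e$acct_num
#> [1] "abc-123"
bindingIsActive("acct_num", e)
#> [1] TRUE
```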

Reference semantics

  • Big difference to note about R6 objects in relation to other objects:
    • R6 objects have reference semantics.
  • The primary consequence of reference semantics is that objects are not copied when modified.
  • If you want to copy an R6 object, you need to use $clone().
  • There are some other less obvious consequences:
    • It’s harder to reason about code that uses R6 objects, as you need more context.
    • It makes sense to think about when an R6 object is deleted; you can use $finalize() to clean up after yourself.
    • If one of the fields is an R6 object, you must create it inside $initialize(), not R6Class().
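  • A minimal sketch of assignment versus $clone(), assuming a hypothetical Counter class:

```r
library(R6)

# Hypothetical class for illustration only
Counter <- R6Class("Counter", list(
  count = 0,
  add = function(n = 1) {
    self$count <- self$count + n
    invisible(self)
  }
))

original <- Counter$new()
alias <- original          # a second name for the same object
copy <- original$clone()   # an independent copy

original$add(5)

alias$count
#> [1] 5
copy$count
#> [1] 0
```

  • For objects whose fields are themselves R6 objects, $clone(deep = TRUE) is needed to copy the nested objects as well.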

R6 makes it harder to reason about code

  • Reference semantics makes code harder to reason about.
x <- list(a = 1)
y <- list(b = 2)

# Here we know the final line only modifies z
z <- f(x, y)

# vs.

x <- List$new(a = 1)
y <- List$new(b = 2)

# If f() calls methods of x or y, it may modify them
# as well as z.
z <- f(x, y)
  • I understand the basics, but not necessarily the tradeoffs.
    • Anyone care to fill me in?
    • Is this a limitation of abstraction?
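  • A hedged sketch of the tradeoff, assuming a hypothetical Box class and reset() function: the same function leaves a list argument untouched but silently changes an R6 argument, so the call site alone does not tell you what was modified.

```r
library(R6)

# Hypothetical class for illustration only
Box <- R6Class("Box", list(
  value = NULL,
  initialize = function(value) {
    self$value <- value
  }
))

# The same function body is applied to a list and to an R6 object
reset <- function(x) {
  x$value <- 0
  x
}

lst <- list(value = 1)
box <- Box$new(1)

z1 <- reset(lst)
z2 <- reset(box)

lst$value  # copy-on-modify: the caller's list is unchanged
#> [1] 1
box$value  # reference semantics: the caller's object was modified
#> [1] 0
```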

Better sense of what’s going on by looking at a finalizer

  • Since R6 objects are not copied on modify, each object is deleted only once.
  • We can use this characteristic to complement $initialize() with a $finalize() method.
    • i.e., to clean up after an R6 object is deleted.
    • For example, this could be a way to close a database connection.
TemporaryFile <- R6Class("TemporaryFile", list(
  path = NULL,
  initialize = function() {
    self$path <- tempfile()
  },
  finalize = function() {
    message("Cleaning up ", self$path)
    unlink(self$path)
  }
))
tf <- TemporaryFile$new()
# rm() only removes the binding; the finalizer runs once the
# object is garbage collected.
rm(tf)

Consequences of R6 fields

  • If you use an R6 object as the default value of a field, it will be shared across all instances of the class.
TemporaryDatabase <- R6Class("TemporaryDatabase", list(
  con = NULL,
  file = TemporaryFile$new(),
  initialize = function() {
    # Use self$file (a bare `file` refers to base::file) and RSQLite's dbname argument
    self$con <- DBI::dbConnect(RSQLite::SQLite(), dbname = self$file$path)
  },
  finalize = function() {
    DBI::dbDisconnect(self$con)
  }
))

db_a <- TemporaryDatabase$new()
db_b <- TemporaryDatabase$new()

db_a$file$path == db_b$file$path
#> [1] TRUE
  • To fix this, we need to move the TemporaryFile$new() call into $initialize().
TemporaryDatabase <- R6Class("TemporaryDatabase", list(
  con = NULL,
  file = NULL,
  initialize = function() {
    self$file <- TemporaryFile$new()
    # Use self$file (a bare `file` refers to base::file) and RSQLite's dbname argument
    self$con <- DBI::dbConnect(RSQLite::SQLite(), dbname = self$file$path)
  },
  finalize = function() {
    DBI::dbDisconnect(self$con)
  }
))

db_a <- TemporaryDatabase$new()
db_b <- TemporaryDatabase$new()

db_a$file$path == db_b$file$path
#> [1] FALSE

Why use R6?

  • The book mentions that R6 is similar to the built-in reference classes (RC).
  • Then why use R6?
  • R6 is simpler.
    • RC is built on S4, so fully understanding RC requires understanding S4; R6 builds on S3.
  • R6 has comprehensive documentation.
  • Cross-package subclassing is simpler and just works.
  • R6 keeps fields in a separate environment, accessed with the self$ prefix; RC mixes variables and fields in the same stack of environments.
  • R6 is faster.
  • RC is tied to R, so RC bug fixes only arrive with a newer version of R.
    • This is especially important if you’re writing packages that need to work with multiple R versions.
  • R6 and RC are similar, so if you do need RC, learning it after R6 takes only a small amount of additional effort.