Trade-offs

Learning objectives:

  • Understand the trade-offs between S3, R6 and S4

  • Brief intro to S7 (the object system formerly known as R7)

Introduction

  • We have three OOP systems introduced so far (S3, S4, R6)

  • At the current time (pre-S7?), Hadley recommends using S3 by default: it’s simple and widely used throughout base R and CRAN.

  • If you have experience in other languages, resist the temptation to use R6 even though it will feel more familiar!

S4 versus S3

Which functional object system to use, S3 or S4?

  • S3 is a simple and flexible system.

    • Good for small teams who need flexibility and immediate payoffs.

    • Commonly used throughout base R and CRAN

    • Flexibility can cause problems; more complex systems might require formal conventions.

  • S4 is a more formal, strict system.

    • Good for large projects and large teams

    • Used by Bioconductor project

    • Requires significant up front investment in design, but payoff is a robust system that enforces conventions.

    • S4 documentation is challenging to use.
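The difference in strictness can be seen in a small sketch. The class names account and Account and the balance field are hypothetical, chosen only for illustration: S3 accepts any object that carries the class attribute, while S4 checks slot types when the object is created.

```r
library(methods)  # ships with R; provides the S4 machinery

# S3: a class is just an attribute, so an ill-formed object is accepted silently.
bad_s3 <- structure(list(balance = "lots"), class = "account")
inherits(bad_s3, "account")
#> [1] TRUE

# S4: slot types are declared up front and checked when the object is created.
setClass("Account", slots = c(balance = "numeric"))
ok <- new("Account", balance = 100)

# A character balance violates the declared slot type, so new() throws an error.
result <- tryCatch(
  new("Account", balance = "lots"),
  error = function(e) "rejected"
)
result
#> [1] "rejected"
```

In S3 the same guarantee has to come from a constructor or validator function that the team agrees to always use.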

R6 versus S3

R6 is built on encapsulated objects, rather than generic functions.

Big differences: general trade-offs

  • A generic is a regular function so it lives in the global namespace. An R6 method belongs to an object so it lives in a local namespace. This influences how we think about naming.

  • R6’s reference semantics allow methods to simultaneously return a value and modify an object. This solves a painful problem called “threading state”.

  • You invoke an R6 method using $, which is an infix operator. If you set up your methods correctly you can use chains of method calls as an alternative to the pipe.

Namespacing

Where are methods found?

  • In S3, generic functions are global and live in the global namespace

    • Advantage: Uniform API: summary, print, predict etc.

    • Disadvantage: You must be careful when creating new methods! Avoid homonyms: don’t define plot(bank_heist), because “plotting” a heist has nothing to do with drawing a graphic.

  • In R6, encapsulated methods are local: they are scoped to the object

    • Advantage: No problems with homonyms: meaning of bank_heist$plot() is clear and unambiguous.

    • Disadvantage: Lack of a uniform API, except by convention.
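A minimal sketch of the two lookup styles, assuming the R6 package is installed. The heist class, its crew and loot fields, and both method bodies are hypothetical and exist only to illustrate the point.

```r
# S3: the method plugs into the existing global generic summary(),
# so it should mean what summary() means everywhere else.
heist <- structure(list(crew = c("Ann", "Bo"), loot = 500), class = "heist")

summary.heist <- function(object, ...) {
  paste0(length(object$crew), " crew members, loot: ", object$loot)
}

summary(heist)
#> [1] "2 crew members, loot: 500"

# R6: plot() belongs to the object, so it cannot be confused with graphics::plot().
Heist <- R6::R6Class("Heist", public = list(
  crew = NULL,
  initialize = function(crew) {
    self$crew <- crew
  },
  plot = function() {
    paste("The plan involves", length(self$crew), "people")
  }
))

h <- Heist$new(c("Ann", "Bo"))
h$plot()
#> [1] "The plan involves 2 people"
```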

Threading state

In S3, the challenge is to return a value and modify the object at the same time.

new_stack <- function(items = list()) {
  structure(list(items = items), class = "stack")
}

push <- function(x, y) {
  x$items <- c(x$items, list(y))
  x
}

No problem with that, but what about when we want to pop a value? We need to return two things.

pop <- function(x) {
  n <- length(x$items)
  
  item <- x$items[[n]]
  x$items <- x$items[-n]
  
  list(item = item, x = x)
}

The usage is a bit awkward:

s <- new_stack()
s <- push(s, 10)
s <- push(s, 20)

out <- pop(s)
# Update state:
s <- out$x

print(out$item)
#> [1] 20

In Python and other languages, structured binding (unpacking) makes this less awkward. R has the {zeallot} package. For more, see this vignette:

vignette('unpacking-assignment')
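As a rough sketch of what unpacking looks like, assuming {zeallot} is installed, its %<-% operator can assign both elements returned by pop() in one step. The stack helpers are repeated here (in condensed form) only so that the block runs on its own.

```r
library(zeallot)

new_stack <- function(items = list()) {
  structure(list(items = items), class = "stack")
}
push <- function(x, y) {
  x$items <- c(x$items, list(y))
  x
}
pop <- function(x) {
  n <- length(x$items)
  item <- x$items[[n]]
  x$items <- x$items[-n]
  list(item = item, x = x)
}

s <- push(push(new_stack(), 10), 20)

# Unpack the two-element list: the popped item and the updated stack.
c(item, s) %<-% pop(s)

item
#> [1] 20
length(s$items)
#> [1] 1
```

The state is still threaded by hand; unpacking only shortens the bookkeeping.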

However, this is all easier in R6 due to the reference semantics!

Stack <- R6::R6Class("Stack", list(
  items = list(),
  push = function(x) {
    # Wrap in list() so that a vector is pushed as a single item, as in the S3 version
    self$items <- c(self$items, list(x))
    invisible(self)
  },
  pop = function() {
    item <- self$items[[self$length()]]
    self$items <- self$items[-self$length()]
    item
  },
  length = function() {
    length(self$items)
  }
))

s <- Stack$new()
s$push(10)
s$push(20)
s$pop()
#> [1] 20

Method chaining

Useful to compose functions from left-to-right.

Use of the operators:

  • S3: |> or %>%

  • R6: $

s$push(44)$push(32)$pop()
#> [1] 32
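For comparison, a sketch of the S3 equivalent using the native pipe (this assumes R 4.1 or later). The stack helpers are repeated so the block is self-contained.

```r
new_stack <- function(items = list()) {
  structure(list(items = items), class = "stack")
}
push <- function(x, y) {
  x$items <- c(x$items, list(y))
  x
}

# Each push() returns a modified copy, which the pipe passes on to the next call.
s <- new_stack() |> push(44) |> push(32)

length(s$items)
#> [1] 2
s$items[[2]]
#> [1] 32
```

Note that the S3 pop() cannot end such a chain as neatly as the R6 version, because it has to return both the item and the updated stack.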

Umm… what about S7?

https://xkcd.com/927/


S7 briefly

  • S7 is a ‘better’ version of S3 with some of the ‘strictness’ of S4.

    • “A little bit more complex than S3, with almost all of the features, all of the payoff of S4” (Hadley Wickham, rstudio::conf 2022)

  • S3 + S4 = S7

  • Compatible with S3: S7 objects are S3 objects! Can even extend an S3 object with S7

  • Somewhat compatible with S4, see the compatibility vignette for details.

  • Helpful error messages!

  • Note that it was previously called R7, but it was changed to “S7” to better reflect that it is functional not encapsulated!

Abbreviated introduction based on the vignette

To install (it’s now on CRAN):

install.packages("S7")
library(S7)
dog <- new_class("dog", properties = list(
  name = class_character,
  age = class_numeric
))
dog


#> <S7_class>
#> @ name  :  dog
#> @ parent: <S7_object>
#> @ properties:
#>  $ name: <character>          
#>  $ age : <integer> or <double>

Note class_character and class_numeric: these are S7 classes corresponding to the base types.

Now to use it to create an object of class dog:

lola <- dog(name = "Lola", age = 11)
lola

#> <dog>
#>  @ name: chr "Lola"
#>  @ age : num 11

Properties can be set/read with @, with automatic validation (‘safety rails’) based on the type!

lola@age <- 12
lola@age

#> [1] 12

lola@age <- "twelve"

#> Error: <dog>@age must be <integer> or <double>, not <character>

Note the helpful error message!

Like S3 (and S4), S7 has generics. Create a generic with new_generic() and register a method for a particular class with method():

speak <- new_generic("speak", "x")

method(speak, dog) <- function(x) {
  "Woof"
}
  
speak(lola)

#> [1] "Woof"

If we have another class, we can implement the generic for that too:

cat <- new_class("cat", properties = list(
  name = class_character,
  age = class_double
))
method(speak, cat) <- function(x) {
  "Meow"
}

fluffy <- cat(name = "Fluffy", age = 5)
speak(fluffy)

#> [1] "Meow"

Helpful messages:

speak

#> <S7_generic> speak(x, ...) with 2 methods:
#> 1: method(speak, cat)
#> 2: method(speak, dog)

“most usage of S7 with S3 will just work”

# print() is an S3 generic with the signature print(x, ...)
method(print, cat) <- function(x, ...) {
  print("I am a cat.")
}

print(fluffy)
#> [1] "I am a cat."

For validators, inheritance, dynamic properties and more, see the vignette!
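As a small taste of validators, here is a hedged sketch that assumes the S7 package is installed. The pet class and its rule are hypothetical; the validator argument of new_class() takes a function that returns NULL when the object is valid and a character vector describing the problem otherwise.

```r
library(S7)

pet <- new_class("pet",
  properties = list(
    name = class_character,
    age = class_numeric
  ),
  # Hypothetical rule: age must be a single non-negative number.
  validator = function(self) {
    if (length(self@age) != 1 || self@age < 0) {
      "@age must be a single non-negative number"
    }
  }
)

rex <- pet(name = "Rex", age = 3)

# The validator runs at construction time...
at_creation <- tryCatch(
  pet(name = "Rex", age = -1),
  error = function(e) "rejected"
)
at_creation
#> [1] "rejected"

# ...and again whenever a property is modified with @.
at_update <- tryCatch(
  {
    rex@age <- -5
    "accepted"
  },
  error = function(e) "rejected"
)
at_update
#> [1] "rejected"
```

Type checks (as with lola@age <- "twelve" earlier) come from the property definitions; the validator is for rules that types alone cannot express.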

So… switch to S7 ?

\[ \huge \textbf{Soon}^{tm} \]

  • Not yet… still in development! Lifecycle: experimental

  • But consider trying it out:

    • To stay ahead of the curve… S7 will be integrated into base R someday!

    • To contribute feedback to the S7 team!

    • To get “almost all” of the benefits of S4 without the complexity!

  • In particular, if you have a new project that might require the complexity of S4, consider S7 instead!

OOP system comparison

Characteristic         S3                       S4                         S7                       R6
Package                base R                   base R                     S7                       R6
Programming type       Functional               Functional                 Functional               Encapsulated
Complexity             Low                      High                       Medium                   High
Payoff                 Low                      High                       High                     High
Team size              Small                    Small-large                Large                    ?
Namespace              Global                   Global?                    Global?                  Local
Modify in place        No                       No                         No                       Yes
Method chaining        |>                       |>?                        |>?                      $
Get/set component      $                        @                          @                        $
Create class           class() or structure()   setClass()                 new_class()              R6Class()
                       with class argument
Create validator       function()               setValidity() or validity  validator argument in    $validate()
                                                argument in setClass()     new_class()
Create generic         UseMethod()              setGeneric()               new_generic()            NA
Create method          function() assigned to   setMethod()                method()                 R6Class()
                       generic.method
Create object          class() or structure()   new()                      Call the class object    $new()
                       with class argument, or                             returned by new_class()
                       constructor function
Additional components  attributes               slots                      properties