Chapter 15: S4

Pavitra Chakravarty

R4DS Reading Group

1 / 25

What are S4 classes?

S4 provides a formal approach to functional OOP. An important new component of S4 is the slot, a named component of the object that is accessed using the specialised subsetting operator @.

2 / 25

The salient features of S4 are as follows.

  • methods Package: All functions related to S4 live in the methods package. This package is always available when you're running R interactively.

    • methods::new(), methods::setClass(), methods::setGeneric(), methods::setMethod()
  • Accessor Functions: Enable you to safely get/set slot values

    • methods::setGeneric(), methods::getGeneric()
  • Class Definition: Definition of a class with setClass(), using three arguments

    • class name
    • Named character vector with names and classes of slots: c(name = "character", age = "numeric")
    • prototype with list of default values for each slot
3 / 25

Basics of S4

Create a class, create objects of that class, determine an object's class, and access its slots

setClass("Person",
  slots = c(
    name = "character",
    age = "numeric"
  )
)
nostradamus <- new("Person", name = "Nostradamus", age = NA_real_)

Necessity of accessor functions

is(nostradamus)
[1] "Person"
nostradamus@name
[1] "Nostradamus"
slot(nostradamus, "age")
[1] NA
4 / 25

setGeneric and setMethod

Creating setter/getter functions by creating generics with setGeneric():

setGeneric("age", function(x) standardGeneric("age"))
setGeneric("age<-", function(x, value) standardGeneric("age<-"))
[1] "age"
[1] "age<-"

And then defining methods with setMethod():

setMethod("age", "Person", function(x) x@age)
setMethod("age<-", "Person", function(x, value) {
  x@age <- value
  x
})
age(nostradamus) <- 50
age(nostradamus)
[1] 50
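
The same pattern extends to the other slot. The following is a minimal sketch, assuming a hypothetical name()/name<-() accessor pair that does not appear in the slides; the class is redefined here only so the block runs on its own.

```r
library(methods)

setClass("Person", slots = c(name = "character", age = "numeric"))

# Hypothetical accessor pair for @name; the slides only define `age`.
setGeneric("name", function(x) standardGeneric("name"))
setGeneric("name<-", function(x, value) standardGeneric("name<-"))

setMethod("name", "Person", function(x) x@name)
setMethod("name<-", "Person", function(x, value) {
  x@name <- value
  x  # a replacement method must return the modified object
})

p <- new("Person", name = "Nostradamus", age = NA_real_)
name(p) <- "Michel de Nostredame"
name(p)
```

Users of the class then never need to touch @ directly, which leaves the developer free to change the internal slot layout later.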
5 / 25

Classes

To define a class, call setClass() with three arguments:

The class name

A named character vector that describes the names and classes of the slots (fields)

A prototype, a list of default values for each slot. Technically optional, but you should always provide it

setClass("Person",
  slots = c(
    name = "character",
    age = "numeric"
  ),
  prototype = list(
    name = NA_character_,
    age = NA_real_
  )
)
me <- new("Person", name = "Nostradamus")
str(me)
Formal class 'Person' [package ".GlobalEnv"] with 2 slots
  ..@ name: chr "Nostradamus"
  ..@ age : num NA
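
To see why the prototype matters, here is a small sketch comparing two otherwise identical classes. The class names PersonNoProto and PersonWithProto are hypothetical and exist only for this comparison.

```r
library(methods)

# Hypothetical classes, used only to compare default slot values.
setClass("PersonNoProto",
  slots = c(name = "character", age = "numeric")
)
setClass("PersonWithProto",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)

no_proto <- new("PersonNoProto")
with_proto <- new("PersonWithProto")

length(no_proto@age)    # slot defaults to a zero-length numeric vector
length(with_proto@age)  # slot holds a single NA from the prototype
is.na(with_proto@name)
```

Without a prototype, unspecified slots fall back to empty vectors of the slot's class, which is rarely the "missing value" a user would expect.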
6 / 25

Inheritance

There is one other important argument to setClass(): contains. This specifies a class (or classes) to inherit slots and behaviour from. For example, we can create an Employee class that inherits from the Person class, adding an extra slot that describes their boss.

setClass("Employee",
  contains = "Person",
  slots = c(
    boss = "Person"
  ),
  prototype = list(
    boss = new("Person")
  )
)
str(new("Employee"))
Formal class 'Employee' [package ".GlobalEnv"] with 3 slots
..@ boss:Formal class 'Person' [package ".GlobalEnv"] with 2 slots
.. .. ..@ name: chr NA
.. .. ..@ age : num NA
..@ name: chr NA
..@ age : num NA

To determine what classes an object inherits from, use is():

is(new("Person"))
is(new("Employee"))
[1] "Person"
[1] "Employee" "Person"

To test if an object inherits from a specific class, use the second argument of is():

is(nostradamus, "Person")
[1] TRUE
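
Inheriting behaviour means that a method written for the parent class also works for the child. A minimal sketch, assuming a hypothetical greet() generic that is not part of the slides:

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)
setClass("Employee",
  contains = "Person",
  slots = c(boss = "Person"),
  prototype = list(boss = new("Person"))
)

# Hypothetical generic with a method defined for Person only.
setGeneric("greet", function(x) standardGeneric("greet"))
setMethod("greet", "Person", function(x) paste("Hello,", x@name))

worker <- new("Employee", name = "Ada")
greet(worker)         # no Employee method exists, so the Person method is used
is(worker, "Person")  # an Employee is also a Person
```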
7 / 25

Helper and Validator

8 / 25

Helper

new() is a low-level constructor for use by the developer. User-facing classes should always be paired with a user-friendly helper. A helper should always:

Have the same name as the class, e.g. Person().

Finish by calling methods::new().

Person <- function(name, age = NA) {
  age <- as.double(age)
  new("Person", name = name, age = age)
}
Person("Nostradamus")
An object of class "Person"
Slot "name":
[1] "Nostradamus"
Slot "age":
[1] NA
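
A helper is also a natural place for friendlier error messages. The sketch below extends the helper with an input check; the check and its message are assumptions added for illustration, not part of the slides.

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)

Person <- function(name, age = NA) {
  # Illustrative check: fail early with a message aimed at the user.
  if (!is.character(name)) {
    stop("`name` must be a character vector.", call. = FALSE)
  }
  age <- as.double(age)  # accept integers and logical NA as well
  new("Person", name = name, age = age)
}

p <- Person("Nostradamus", age = 62L)
is.double(p@age)

# Capture the error message instead of stopping the script.
msg <- tryCatch(Person(42), error = function(e) conditionMessage(e))
msg
```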
9 / 25

Validator

new() automatically checks that each slot has the correct class, but it checks nothing else. Because a Person object is meant to store information about multiple people, we also want @name and @age to have the same length, yet the following call succeeds:

Person("Nostradamus", age = c(30, 37))
An object of class "Person"
Slot "name":
[1] "Nostradamus"
Slot "age":
[1] 30 37
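
The two behaviours can be compared side by side. This is a small sketch that redefines the class without a validator so it runs on its own; the error is captured with tryCatch() rather than printed.

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)

# Wrong slot class: new() refuses to create the object.
type_msg <- tryCatch(
  new("Person", name = "Nostradamus", age = "thirty"),
  error = function(e) conditionMessage(e)
)
type_msg

# Mismatched lengths: nothing stops this without a validator.
mismatched <- new("Person", name = "Nostradamus", age = c(30, 37))
length(mismatched@name)
length(mismatched@age)
```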
10 / 25

To enforce these additional constraints we write a validator with setValidity(). It takes a class and a function that returns TRUE if the input is valid, and otherwise returns a character vector describing the problem(s):

setValidity("Person", function(object) {
  if (length(object@name) != length(object@age)) {
    "@name and @age must be same length"
  } else {
    TRUE
  }
})
Class "Person" [in ".GlobalEnv"]
Slots:
Name:       name       age
Class: character   numeric
Known Subclasses: "Employee"
Person("Nostradamus", age = c(30, 37))
Error in validObject(.Object) : invalid class “Person” object: @name and @age must be same length
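
One caveat worth a sketch: the validator runs when new() creates the object, not when a slot is later modified with @. The example below, which redefines the class and validator so it is self-contained, shows that an explicit validObject() call is needed to catch the problem.

```r
library(methods)

setClass("Person",
  slots = c(name = "character", age = "numeric"),
  prototype = list(name = NA_character_, age = NA_real_)
)
setValidity("Person", function(object) {
  if (length(object@name) != length(object@age)) {
    "@name and @age must be same length"
  } else {
    TRUE
  }
})

p <- new("Person", name = "Nostradamus", age = 62)

# Direct slot assignment checks only the slot's class, not validity.
p@age <- c(30, 37)

# The object is now invalid; validObject() reports it.
check <- tryCatch(validObject(p), error = function(e) conditionMessage(e))
check
```

This is one reason to route modifications through accessor methods, which can call validObject() before returning the object.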
11 / 25

Generics and Methods

The job of a generic is to perform method dispatch, i.e. find the specific implementation for the combination of classes passed to the generic. To create a new S4 generic, call setGeneric() with a function that calls standardGeneric():

setGeneric("myGeneric", function(x) standardGeneric("myGeneric"))
[1] "myGeneric"
12 / 25

Signature

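The signature argument of setGeneric() controls which arguments are used for method dispatch. If signature is not supplied, all arguments (apart from ...) are used. Removing an argument from dispatch is occasionally useful, for example so that every method can accept verbose = TRUE without dispatching on it.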

setGeneric("myGeneric",
  function(x, ..., verbose = TRUE) standardGeneric("myGeneric"),
  signature = "x"
)
[1] "myGeneric"
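
A short sketch of what this buys you, using a hypothetical describe() generic (not from the slides): the method is chosen from the class of x alone, and verbose is just an ordinary argument passed through to the method.

```r
library(methods)

# Hypothetical generic: dispatch on `x` only.
setGeneric("describe",
  function(x, ..., verbose = TRUE) standardGeneric("describe"),
  signature = "x"
)

setMethod("describe", "numeric", function(x, ..., verbose = TRUE) {
  out <- paste("numeric of length", length(x))
  if (verbose) {
    out <- paste0(out, " (mean ", mean(x), ")")
  }
  out
})

describe(c(1, 2, 3))
describe(c(1, 2, 3), verbose = FALSE)
```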
13 / 25

Methods

A generic isn't useful without some methods, and in S4 you define methods with setMethod(). There are three important arguments: the name of the generic, the name of the class, and the method itself.

setMethod("myGeneric", "Person", function(x) {
  # method implementation
})

More formally, the second argument to setMethod() is called the signature. In S4, unlike S3, the signature can include multiple arguments. This makes method dispatch in S4 substantially more complicated.

14 / 25

To list all the methods that belong to a generic, or that are associated with a class, use methods("generic") or methods(class = "class"). To find the implementation of a specific method, use selectMethod("generic", "class"). To see which arguments a method must accept, look at the args() of the generic.

methods("myGeneric")
[1] myGeneric,Person-method
see '?methods' for accessing help and source code
methods(class = "Person")
[1] age       age<-     myGeneric
see '?methods' for accessing help and source code
selectMethod("myGeneric", "Person")
Method Definition:
function (x, ..., verbose = TRUE)
{
    .local <- function (x)
    {
    }
    .local(x, ...)
}
Signatures:
        x
target  "Person"
defined "Person"
args(getGeneric("myGeneric"))
function (x, ..., verbose = TRUE)
NULL
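
The introspection tools can also be used programmatically. The sketch below assumes a hypothetical label() generic and a slot-less Employee subclass, both introduced only for this illustration. It contrasts existsMethod(), which looks only for directly defined methods, with selectMethod(), which follows inheritance.

```r
library(methods)

setClass("Person", slots = c(name = "character", age = "numeric"))
setClass("Employee", contains = "Person")

# Hypothetical generic with a single method, defined for Person.
setGeneric("label", function(x) standardGeneric("label"))
setMethod("label", "Person", function(x) x@name)

isGeneric("label")                 # an S4 generic with this name exists
existsMethod("label", "Person")    # defined directly for Person
existsMethod("label", "Employee")  # nothing defined directly for Employee

# selectMethod() follows inheritance and returns the Person method.
inherited <- selectMethod("label", "Employee")
inherited(new("Employee", name = "Ada"))
```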
15 / 25

A method must use the same argument names as its generic. The show generic has a single argument, object, so the show method for the Person class needs that single argument too:

setMethod("show", "Person", function(object) {
  cat(is(object)[[1]], "\n",
    " Name: ", object@name, "\n",
    " Age: ", object@age, "\n",
    sep = ""
  )
})
nostradamus
16 / 25

Method dispatch

S4 dispatch is complicated because S4 has two important features:

  • Multiple inheritance, i.e. a class can have multiple parents,
  • Multiple dispatch, i.e. a generic can use multiple arguments to pick a method.

These features make S4 very powerful, but can also make it hard to understand which method will get selected for a given combination of inputs. In practice, keep method dispatch as simple as possible by avoiding multiple inheritance, and reserving multiple dispatch only for where it is absolutely necessary.

Hadley uses a cool concept to illustrate this - an imaginary class graph based on emoji:

17 / 25

Single dispatch

Let's start with the simplest case: a generic function that dispatches on a single class with a single parent. The method dispatch here is simple so it's a good place to define the graphical conventions we'll use for the more complex cases.

There are two parts to this diagram:

  • The top part, f(...), defines the scope of the diagram. Here we have a generic with one argument, that has a class hierarchy that is three levels deep.

  • The bottom part is the method graph and displays all the possible methods that could be defined. Methods that exist, i.e. that have been defined with setMethod(), have a grey background.

18 / 25

Multiple Inheritance

19 / 25

Things get more complicated when the class has multiple parents.

The basic process remains the same: you start from the actual class supplied to the generic, then follow the arrows until you find a defined method. The wrinkle is that now there are multiple arrows to follow, so you might find multiple methods. If that happens, you pick the method that is closest, i.e. requires travelling the fewest arrows.
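
The "closest method wins" rule can be tried out with a small hierarchy. Everything in this sketch (the classes Base, Mid, Other and Leaf, and the speak() generic) is hypothetical and chosen only to make the distances easy to count: Leaf reaches Other in one step, but Base only in two steps via Mid.

```r
library(methods)

# Hypothetical hierarchy: Leaf has two parents, Mid and Other.
setClass("Base", slots = c(id = "numeric"))
setClass("Mid", contains = "Base")
setClass("Other", slots = c(tag = "character"))
setClass("Leaf", contains = c("Mid", "Other"))

setGeneric("speak", function(x) standardGeneric("speak"))
setMethod("speak", "Base", function(x) "Base method")    # two steps from Leaf
setMethod("speak", "Other", function(x) "Other method")  # one step from Leaf

is(new("Leaf"))     # all classes that Leaf inherits from
speak(new("Leaf"))  # the closer method, defined for Other, is selected
```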

20 / 25

In the diagrams, a method that cannot be found is highlighted with a red double outline. What happens if two methods are the same distance away? The method is then ambiguous, which is illustrated with a thick dotted border.

21 / 25

With multiple inheritance, it is hard to simultaneously prevent ambiguity, ensure that every terminal method has an implementation, and minimise the number of defined methods (in order to benefit from OOP). For example, of the six ways to define only two methods for this call, only one is free from problems.

22 / 25

Multiple dispatch

After multiple inheritance, understanding multiple dispatch is straightforward. You follow multiple arrows in the same way as previously, but now each method is specified by two classes (separated by a comma).
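
In code, a method for two classes is registered by passing a two-element signature to setMethod(). A minimal sketch, assuming a hypothetical combine() generic that is not part of the slides, with an ANY/ANY method as the fallback:

```r
library(methods)

# Hypothetical generic that dispatches on both arguments.
setGeneric("combine", function(x, y) standardGeneric("combine"))

setMethod("combine", signature("numeric", "numeric"),
  function(x, y) x + y
)
setMethod("combine", signature("character", "character"),
  function(x, y) paste(x, y)
)
# Fallback for every other combination of classes.
setMethod("combine", signature("ANY", "ANY"),
  function(x, y) list(x, y)
)

combine(1, 2)      # numeric, numeric
combine("a", "b")  # character, character
combine(1, "b")    # no specific method, so the ANY, ANY fallback is used
```

Defining the fallback first and adding specific pairs only where needed keeps the method graph small, in line with the advice to reserve multiple dispatch for cases that really need it.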

23 / 25

The main difference between multiple inheritance and multiple dispatch is that there are many more arrows to follow. The following diagram shows four defined methods which produce two ambiguous cases:

Multiple dispatch tends to be less tricky to work with than multiple inheritance because there are usually fewer terminal class combinations. In this example, there's only one. That means, at a minimum, you can define a single method and have default behaviour for all inputs.

24 / 25
25 / 25
