S4 provides a formal approach to functional OOP. An important new component of S4 is the slot, a named component of the object that is accessed using the specialised subsetting operator @.
The salient features of S4 are as follows.
methods Package: All functions related to S4 live in the methods package. This package is always available when you're running R interactively.
methods::new(), methods::setClass(), methods::setGeneric(), methods::setMethod()
Accessor Functions: Enable you to safely get/set slot values
methods::setGeneric(), methods::getGeneric()
Class Definition: A class is defined with setClass() using three arguments
class name
Named character vector with names and classes of slots: c(name = "character", age = "numeric")
prototype with list of default values for each slot
Create a class, create objects of that class, determine the class of an object, and access slots
setClass("Person",
  slots = c(
    name = "character",
    age = "numeric"
  )
)
nostradamus <- new("Person", name = "Nostradamus", age = NA_real_)
Necessity of accessor functions
is(nostradamus)
nostradamus@name
slot(nostradamus, "age")
[1] "Person"
[1] "Nostradamus"
[1] NA
Creating setter/getter functions by creating generics with setGeneric():
setGeneric("age", function(x) standardGeneric("age"))
setGeneric("age<-", function(x, value) standardGeneric("age<-"))
[1] "age"
[1] "age<-"
And then defining methods with setMethod():
setMethod("age", "Person", function(x) x@age)
setMethod("age<-", "Person", function(x, value) {
  x@age <- value
  x
})
age(nostradamus) <- 50
age(nostradamus)
[1] 50
The class name
A named character vector that describes the names and classes of the slots (fields)
A prototype, a list of default values for each slot. It is optional, but you should always provide one.
setClass("Person",
  slots = c(
    name = "character",
    age = "numeric"
  ),
  prototype = list(
    name = NA_character_,
    age = NA_real_
  )
)
me <- new("Person", name = "Nostradamus")
str(me)
Formal class 'Person' [package ".GlobalEnv"] with 2 slots
  ..@ name: chr "Nostradamus"
  ..@ age : num NA
There is one other important argument to setClass(): contains. This specifies a class (or classes) to inherit slots and behaviour from. For example, we can create an Employee class that inherits from the Person class, adding an extra slot that describes their boss.
setClass("Employee",
  contains = "Person",
  slots = c(
    boss = "Person"
  ),
  prototype = list(
    boss = new("Person")
  )
)
str(new("Employee"))
Formal class 'Employee' [package ".GlobalEnv"] with 3 slots
  ..@ boss:Formal class 'Person' [package ".GlobalEnv"] with 2 slots
  .. .. ..@ name: chr NA
  .. .. ..@ age : num NA
  ..@ name: chr NA
  ..@ age : num NA
To determine what classes an object inherits from, use is():
is(new("Person"))
is(new("Employee"))
[1] "Person"
[1] "Employee" "Person"
To test if an object inherits from a specific class, use the second argument of is(). Class names are case-sensitive, so the name must be spelled exactly as it was in setClass():
is(nostradamus, "Person")
[1] TRUE
new() is a low-level constructor for use by the developer. User-facing classes should always be paired with a user-friendly helper. A helper should always:
Have the same name as the class, e.g. Person().
Finish by calling methods::new().
Person <- function(name, age = NA) {
  age <- as.double(age)
  new("Person", name = name, age = age)
}
Person("Nostradamus")
An object of class "Person"
Slot "name":
[1] "Nostradamus"
Slot "age":
[1] NA
The constructor automatically checks that the slots have the correct classes, but it checks nothing else. For example, if we want to store information about multiple people, all slots should have the same length, yet the following call succeeds:
Person("Nostradamus", age = c(30, 37))
An object of class "Person"
Slot "name":
[1] "Nostradamus"
Slot "age":
[1] 30 37
To enforce these additional constraints we write a validator with setValidity(). It takes a class and a function that returns TRUE if the input is valid, and otherwise returns a character vector describing the problem(s):
setValidity("Person", function(object) {
  if (length(object@name) != length(object@age)) {
    "@name and @age must be same length"
  } else {
    TRUE
  }
})
Person("Nostradamus", age = c(30, 37))
Class "Person" [in ".GlobalEnv"]
Slots:
Name:       name       age
Class: character   numeric
Known Subclasses: "Employee"
Error in validObject(.Object) :
  invalid class “Person” object: @name and @age must be same length
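The validator runs when an object is created with new(), but assigning to a slot with @ afterwards does not re-run it. The sketch below illustrates this with a hypothetical Pair class (not part of the slides) and shows that calling validObject() explicitly catches the problem:

```r
library(methods)

# Hypothetical class used only for this sketch
setClass("Pair", slots = c(left = "numeric", right = "numeric"))

setValidity("Pair", function(object) {
  if (length(object@left) != length(object@right)) {
    "@left and @right must be same length"
  } else {
    TRUE
  }
})

p <- new("Pair", left = 1, right = 2)

# Direct slot assignment checks the class of the value,
# but it does not run the validity function
p@right <- c(2, 3)

# validObject() re-runs the validity function and signals an error
result <- tryCatch(
  validObject(p),
  error = function(e) conditionMessage(e)
)
result
```

One common pattern is to call validObject() at the end of every setter method, so that objects modified through accessors stay valid.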
The job of a generic is to perform method dispatch, i.e. find the specific implementation for the combination of classes passed to the generic. To create a new S4 generic, call setGeneric() with a function that calls standardGeneric():
setGeneric("myGeneric", function(x) standardGeneric("myGeneric"))
[1] "myGeneric"
signature allows you to control the arguments that are used for method dispatch. If signature is not supplied, all arguments (apart from ...) are used.
setGeneric("myGeneric",
  function(x, ..., verbose = TRUE) standardGeneric("myGeneric"),
  signature = "x"
)
[1] "myGeneric"
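As a small illustration of the signature argument, the sketch below defines a hypothetical generic describeObj() (the name is an assumption, not from the slides) that dispatches on x only, so verbose can be passed without affecting which method is chosen:

```r
library(methods)

# Hypothetical generic: only `x` takes part in method dispatch
setGeneric("describeObj",
  function(x, ..., verbose = TRUE) standardGeneric("describeObj"),
  signature = "x"
)

# The method keeps the same formal arguments as the generic
setMethod("describeObj", "numeric", function(x, ..., verbose = TRUE) {
  if (verbose) "numeric (verbose)" else "numeric"
})

loud <- describeObj(1)
quiet <- describeObj(1, verbose = FALSE)

# The dispatch arguments are recorded on the generic itself
dispatch_args <- getGeneric("describeObj")@signature
```

Excluding arguments such as verbose from the signature keeps the set of possible methods small, because methods only need to be defined for classes of x.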
A generic isn't useful without some methods, and in S4 you define methods with setMethod(). There are three important arguments: the name of the generic, the name of the class, and the method itself.
setMethod("myGeneric", "Person", function(x) {
  # method implementation
})
More formally, the second argument to setMethod() is called the signature. In S4, unlike S3, the signature can include multiple arguments. This makes method dispatch in S4 substantially more complicated.
To list all the methods that belong to a generic, or that are associated with a class, use methods("generic") or methods(class = "class"); to find the implementation of a specific method, use selectMethod("generic", "class"). You can get the arguments by looking at the args() of the generic:
methods("myGeneric")
methods(class = "Person")
selectMethod("myGeneric", "Person")
args(getGeneric("myGeneric"))
[1] myGeneric,Person-method
see '?methods' for accessing help and source code
[1] age       age<-     myGeneric
see '?methods' for accessing help and source code
Method Definition:
function (x, ..., verbose = TRUE)
{
    .local <- function (x)
    {
    }
    .local(x, ...)
}
Signatures:
        x
target  "Person"
defined "Person"
function (x, ..., verbose = TRUE)
NULL
The show() generic controls how an object is printed. Its only argument is named object, so a show method for the Person class must also take a single argument called object:
setMethod("show", "Person", function(object) {
  cat(is(object)[[1]], "\n",
      "  Name: ", object@name, "\n",
      "  Age:  ", object@age, "\n",
      sep = ""
  )
})
nostradamus
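To check what a show method prints without relying on auto-printing, the output can be captured as a character vector. The sketch below uses a hypothetical Point class (an assumption for illustration) rather than the Person class from the slides:

```r
library(methods)

# Hypothetical class used only for this sketch
setClass("Point", slots = c(x = "numeric", y = "numeric"))

# The method's argument must be called `object`, matching the show() generic
setMethod("show", "Point", function(object) {
  cat("<Point> (", object@x, ", ", object@y, ")\n", sep = "")
})

pt <- new("Point", x = 1, y = 2)

# capture.output() collects what show() writes to the console
printed <- capture.output(show(pt))
printed
```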
S4 dispatch is complicated because S4 has two important features:
Multiple inheritance, i.e. a class can have multiple parents.
Multiple dispatch, i.e. a generic can use multiple arguments to pick a method.
These features make S4 very powerful, but can also make it hard to understand which method will get selected for a given combination of inputs. In practice, keep method dispatch as simple as possible by avoiding multiple inheritance, and reserving multiple dispatch only for where it is absolutely necessary.
Hadley uses a cool concept to illustrate this - an imaginary class graph based on emoji:
Let's start with the simplest case: a generic function that dispatches on a single class with a single parent. The method dispatch here is simple so it's a good place to define the graphical conventions we'll use for the more complex cases.
There are two parts to this diagram:
The top part, f(...), defines the scope of the diagram. Here we have a generic with one argument, that has a class hierarchy that is three levels deep.
The bottom part is the method graph and displays all the possible methods that could be defined. Methods that exist, i.e. that have been defined with setMethod(), have a grey background.
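The single-inheritance case can be sketched in code. The class and generic names below (Base, Middle, Leaf, whoAmI) are hypothetical stand-ins for the emoji classes in the diagram; the hierarchy is three levels deep and dispatch picks the closest defined method:

```r
library(methods)

# Hypothetical three-level hierarchy: Leaf inherits from Middle,
# which inherits from Base
setClass("Base", slots = c(id = "numeric"))
setClass("Middle", contains = "Base")
setClass("Leaf", contains = "Middle")

setGeneric("whoAmI", function(x) standardGeneric("whoAmI"))

# At first only the top-level class has a method
setMethod("whoAmI", "Base", function(x) "Base method")
first <- whoAmI(new("Leaf", id = 1))

# Defining a method closer to Leaf changes which one is selected
setMethod("whoAmI", "Middle", function(x) "Middle method")
second <- whoAmI(new("Leaf", id = 1))
```

Here first is found by walking from Leaf up to Base, while second stops at Middle because that method is now the nearest one.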
Things get more complicated when the class has multiple parents.
The basic process remains the same: you start from the actual class supplied to the generic, then follow the arrows until you find a defined method. The wrinkle is that now there are multiple arrows to follow, so you might find multiple methods. If that happens, you pick the method that is closest, i.e. requires travelling the fewest arrows.
If no method can be found, the diagram highlights it with a red double outline. If two methods are the same distance away, the choice is ambiguous; the diagram shows an ambiguous method with a thick dotted border.
With multiple inheritance it is hard to simultaneously prevent ambiguity, ensure that every terminal method has an implementation, and minimise the number of defined methods (in order to benefit from OOP). For example, of the six ways to define only two methods for this call, only one is free from problems.
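One way to sidestep ambiguity is to define a method for the child class itself, since that method is at distance zero. The sketch below assumes hypothetical classes Swimmer, Flyer and Duck and a hypothetical generic travel(); none of these come from the slides:

```r
library(methods)

# Hypothetical classes: Duck has two parents
setClass("Swimmer", slots = c(depth = "numeric"))
setClass("Flyer", slots = c(altitude = "numeric"))
setClass("Duck", contains = c("Swimmer", "Flyer"))

setGeneric("travel", function(x) standardGeneric("travel"))
setMethod("travel", "Swimmer", function(x) "swims")
setMethod("travel", "Flyer", function(x) "flies")

# Both parent methods are one step away from Duck, so relying on
# inheritance alone would be ambiguous. A method for Duck itself
# is closer than either and is the one selected.
setMethod("travel", "Duck", function(x) "swims and flies")

donald <- new("Duck", depth = 1, altitude = 10)
parents <- is(donald)
how <- travel(donald)
```

The cost is one extra method per class with multiple parents, which is part of why the slides recommend avoiding multiple inheritance where possible.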
After multiple inheritance, understanding multiple dispatch is straightforward. You follow multiple arrows in the same way as previously, but now each method is specified by two classes (separated by a comma).
The main difference between multiple inheritance and multiple dispatch is that there are many more arrows to follow. The following diagram shows four defined methods which produce two ambiguous cases:
Multiple dispatch tends to be less tricky to work with than multiple inheritance because there are usually fewer terminal class combinations. In this example, there's only one. That means, at a minimum, you can define a single method and have default behaviour for all inputs.
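A minimal sketch of multiple dispatch with a single fallback method follows. The generic pairUp() is a hypothetical name; the "ANY" class is the built-in S4 class that every object inherits from, so a method on ("ANY", "ANY") gives default behaviour for all inputs:

```r
library(methods)

# Hypothetical generic that dispatches on both arguments
setGeneric("pairUp", function(x, y) standardGeneric("pairUp"))

# Fallback that matches any combination of classes
setMethod("pairUp", signature("ANY", "ANY"), function(x, y) "default")

# More specific method for two numeric inputs
setMethod("pairUp", signature("numeric", "numeric"),
  function(x, y) "numeric pair"
)

specific <- pairUp(1, 2)
fallback <- pairUp(1, "a")
```

The call with two numbers finds the exact match, while the mixed call has no closer method and falls back to the ("ANY", "ANY") definition.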