Chapter 15 - S4 part1

15.1 Introduction

  • similar to S3
  • stricter implementation
  • specialised functions for:
    • creating classes: setClass()
    • generics: setGeneric()
    • methods: setMethod()
  • multiple inheritance (i.e. a class can have multiple parents)
  • multiple dispatch (i.e. method dispatch can use the class of multiple arguments)

S4 has a special component: slot. It’s accessed using the subsetting operator @

15.2 Basics

Definition of a class:

  • Name of the class
  • definition of slots
  • names & classes of class data
setClass("Person", 
  slots = c(
    name = "character", 
    age = "numeric"
  )
)

Construct new objects with new()

Bob <- new("Person", name = "Bob Marley", age = NA_real_)

Given an S4 object you can see its class with is() and access slots with @ (equivalent to $) and slot() (equivalent to [[):

is(Bob)
## [1] "Person"
Bob@name
## [1] "Bob Marley"
slot(Bob, "age")
## [1] NA

Accessor functions:

  1. Create generics with setGeneric()
setGeneric("age", function(x) standardGeneric("age"))
## [1] "age"
setGeneric("age<-", function(x, value) standardGeneric("age<-"))
## [1] "age<-"
  1. Define methods with setMethod()
setMethod("age", "Person", function(x) x@age)
setMethod("age<-", "Person", function(x, value) {
  x@age <- value
  x
})

age(Bob) <- 36
age(Bob)
## [1] 36

To identify objects and functions we can use sloop

sloop::otype(Bob)
## [1] "S4"
sloop::ftype(age)
## [1] "S4"      "generic"

Exercises

  1. lubridate::period() returns an S4 class. What slots does it have? What class is each slot? What accessors does it provide?

  1. lubridate::period() returns an S4 class. What slots does it have? What class is each slot? What accessors does it provide?
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
example <- lubridate::period()
class?Period

is(example)
## [1] "Period"   "Timespan" "numeric"  "vector"

What slots does it have? What class is each slot?

Period class objects have six slots.

  1. .Data, a numeric object. The apparent amount of seconds to add to the period.
  2. minute, a numeric object. The apparent amount of minutes to add to the period.
  3. hour, a numeric object. The apparent amount of hours to add to the period.
  4. day, a numeric object. The apparent amount of days to add to the period.
  5. month, a numeric object. The apparent amount of months to add to the period.
  6. year, a numeric object. The apparent amount of years to add to the period.
getClass("Period")
## Class "Period" [package "lubridate"]
## 
## Slots:
##                                                       
## Name:    .Data    year   month     day    hour  minute
## Class: numeric numeric numeric numeric numeric numeric
## 
## Extends: 
## Class "Timespan", directly
## Class "numeric", from data part
## Class "vector", by class "numeric", distance 2

What accessors does it provide?

year(example)
## numeric(0)
month(example)
## numeric(0)
day(example)
## numeric(0)
hour(example)
## numeric(0)
minute(example)
## numeric(0)
second(example)
## numeric(0)

  1. What other ways can you find help for a method? Read ?“?” and summarise the details.

  1. What other ways can you find help for a method? Read ?“?” and summarise the details.
## define a S4 generic function and some methods
combo <- function(x, y) c(x, y)
setGeneric("combo")
setMethod("combo", c("numeric", "numeric"), function(x, y) x+y)

## assume we have written some documentation
## for combo, and its methods ....

?combo  # produces the function documentation

methods?combo  # looks for the overall methods documentation

method?combo("numeric", "numeric")  # documentation for the method above

?combo(1:10, rnorm(10))  # ... the same method, selected according to
                         # the arguments (one integer, the other numeric)

?combo(1:10, letters)    # documentation for the default method

15.3 Classes

To define an S4 class, call setClass() with three arguments:

  • class name (CamelCase)
  • named character vector for the slots
  • prototype (list of default values for each slot)
setClass("Person", 
  slots = c(
    name = "character", 
    age = "numeric"
  ), 
  prototype = list(
    name = NA_character_,
    age = NA_real_
  )
)

me <- new("Person", name = "Anne")
str(me)
## Formal class 'Person' [package ".GlobalEnv"] with 2 slots
##   ..@ name: chr "Anne"
##   ..@ age : num NA

15.3.1 Inheritance

contains can be used as an argument to setClass() to specify if any behaviour is inherited from another class.

setClass("Employee", 
  contains = "Person", 
  slots = c(
    boss = "Person"
  ),
  prototype = list(
    boss = new("Person")
  )
)

str(new("Employee"))
## Formal class 'Employee' [package ".GlobalEnv"] with 3 slots
##   ..@ boss:Formal class 'Person' [package ".GlobalEnv"] with 2 slots
##   .. .. ..@ name: chr NA
##   .. .. ..@ age : num NA
##   ..@ name: chr NA
##   ..@ age : num NA

15.3.2 Introspection

To check inheritance of objects, use is()

is(new("Person"))
## [1] "Person"
is(new("Employee"))
## [1] "Employee" "Person"
is(me, "Person")
## [1] TRUE

15.3.3 Redefinition

setClass("A", slots = c(x = "numeric"))
a <- new("A", x = 10)

setClass("A", slots = c(a_different_slot = "numeric"))
a
## An object of class "A"
## Slot "a_different_slot":
## Error in slot(object, what): no slot of name "a_different_slot" for this object of class "A"

This can cause confusion during interactive creation of new classes.

15.3.4 Helper

A helper should always:

  • Have the same name as the class, e.g. myclass()
  • Have a thoughtfully crafted user interface with carefully chosen default values and useful conversions
  • Create carefully crafted error messages tailored towards an end-user
  • Finish by calling methods::new()
Person <- function(name, age = NA) {
  age <- as.double(age)
  
  new("Person", name = name, age = age)
}

Person("Anne")
## An object of class "Person"
## Slot "name":
## [1] "Anne"
## 
## Slot "age":
## [1] NA

15.3.5 Validator

The constructor automatically checks that the slots have correct classes:

Person(name=3, age=T)
## Error in validObject(.Object): invalid class "Person" object: invalid object for slot "name" in class "Person": got class "numeric", should be or extend class "character"

For checks like length of slots, one will have to write a validator with setValidity():

Person("Anne", age = c(32, 25))
## An object of class "Person"
## Slot "name":
## [1] "Anne"
## 
## Slot "age":
## [1] 32 25
setValidity("Person", function(object) {
  if (length(object@name) != length(object@age)) {
    "@name and @age must be same length"
  } else {
    TRUE
  }
})
## Class "Person" [in ".GlobalEnv"]
## 
## Slots:
##                           
## Name:       name       age
## Class: character   numeric
## 
## Known Subclasses: "Employee"
Person("Anne", age = c(32, 25))
## Error in validObject(.Object): invalid class "Person" object: @name and @age must be same length

The validity check is only performed when you create a new object. You can still modify it after creation, and generate invalid input.

alex <- Person("Alex", age = 30)
alex@age <- 1:10
alex
## An object of class "Person"
## Slot "name":
## [1] "Alex"
## 
## Slot "age":
##  [1]  1  2  3  4  5  6  7  8  9 10
validObject(alex)
## Error in validObject(alex): invalid class "Person" object: @name and @age must be same length

15.3.6 Exercises

  1. Extend the Person class with fields to match utils::person(). Think about what slots you will need, what class each slot should have, and what you’ll need to check in your validity method.

  1. Extend the Person class with fields to match utils::person(). Think about what slots you will need, what class each slot should have, and what you’ll need to check in your validity method.

utils::person(given = NULL, family = NULL, middle = NULL, email = NULL, role = NULL, comment = NULL, first = NULL, last = NULL)

setClass("Person", 
  slots = c(
    given = "character", 
    family = "character",
    email = "character",
    role = "character",
    comment = "character"
  ), 
  prototype = list(
    given = NA_character_,
    family = NA_character_,
    email = NA_character_,
    role = NA_character_,
    comment = NA_character_
  )
)

setValidity("Person", function(object) {
  if (length(object@given) != length(object@family)) {
    "@given and @family must be same length"
  } else {
    TRUE
  }
  if(object@role %in% c("aut", "com", "cph", "cre", "ctb", "ctr", "dtc", "fnd", "rev", "ths", "trl", NA_character_)){
    TRUE
  }else{
    "@role must be one of \"aut\", \"com\", \"cph\", \"cre\", \"ctb\", \"ctr\", \"dtc\", \"fnd\", \"rev\", \"ths\", \"trl\""
  }
})

Person <- function(given, family, email = NA_character_, role = NA_character_, comment = NA_character_) {
  new("Person", given = given, family = family, email = email, role = role, comment = comment)
}

Person("Anne", "Hoffrichter")
Person("Anne")

  1. What happens if you define a new S4 class that doesn’t have any slots? (Hint: read about virtual classes in ?setClass.)

  1. What happens if you define a new S4 class that doesn’t have any slots? (Hint: read about virtual classes in ?setClass.)
?setClass

If the class is virtual, an attempt to generate an object from either the generator or new() will result in an error.

Classes exist for which no actual objects can be created, the virtual classes.

The most common and useful form of virtual class is the class union, a virtual class that is defined in a call to setClassUnion() rather than a call to setClass(). This call lists the members of the union—subclasses that extend the new class. Methods that are written with the class union in the signature are eligible for use with objects from any of the member classes. Class unions can include as members classes whose definition is otherwise sealed, including basic R data types. (A class may be defined as the union of other classes; that is, as a virtual class defined as a superclass of several other classes. Class unions are useful in method signatures or as slots in other classes, when we want to allow one of several classes to be supplied.)

Calls to setClass() will also create a virtual class, either when only the Class argument is supplied (no slots or superclasses) or when the contains= argument includes the special class name "VIRTUAL".

In the latter case, a virtual class may include slots to provide some common behavior without fully defining the object—see the class traceable for an example. Note that "VIRTUAL" does not carry over to subclasses; a class that contains a virtual class is not itself automatically virtual.

setClass("A", representation("VIRTUAL")) 
showClass("A")
## Virtual Class "A" [in ".GlobalEnv"]
## 
## No Slots, prototype of class "S4"
suppressPackageStartupMessages(library(GenomicRanges))
showClass("GenomicRanges")
## Virtual Class "GenomicRanges" [package "GenomicRanges"]
## 
## Slots:
##                                                       
## Name:  elementMetadata     elementType        metadata
## Class:       DataFrame       character            list
## 
## Extends: 
## Class "Ranges", directly
## Class "GenomicRanges_OR_missing", directly
## Class "GenomicRanges_OR_GenomicRangesList", directly
## Class "GenomicRanges_OR_GRangesList", directly
## Class "List", by class "Ranges", distance 2
## Class "Vector", by class "Ranges", distance 3
## Class "list_OR_List", by class "Ranges", distance 3
## Class "Annotated", by class "Ranges", distance 4
## Class "vector_OR_Vector", by class "Ranges", distance 4
## 
## Known Subclasses: 
## Class "GenomicPos", directly
## Class "GRanges", directly
## Class "DelegatingGenomicRanges", directly
## Class "GNCList", directly
## Class "GPos", by class "GenomicPos", distance 2
## Class "GPos", by class "GRanges", distance 2
## Class "UnstitchedGPos", by class "GenomicPos", distance 3
## Class "StitchedGPos", by class "GenomicPos", distance 3
## Class "UnstitchedGPos", by class "GRanges", distance 3
## Class "StitchedGPos", by class "GRanges", distance 3
showClass("GenomicRanges_OR_GRangesList")
## Virtual Class "GenomicRanges_OR_GRangesList" [package "GenomicRanges"]
## 
## No Slots, prototype of class "S4"
## 
## Known Subclasses: 
## Class "GenomicRanges", directly
## Class "GRangesList", directly
## Class "GenomicPos", by class "GenomicRanges", distance 2
## Class "GRanges", by class "GenomicRanges", distance 2
## Class "DelegatingGenomicRanges", by class "GenomicRanges", distance 2
## Class "GNCList", by class "GenomicRanges", distance 2
## Class "SimpleGRangesList", by class "GRangesList", distance 2
## Class "CompressedGRangesList", by class "GRangesList", distance 2
## Class "GPos", by class "GenomicRanges", distance 3
## Class "GPos", by class "GenomicRanges", distance 3
## Class "UnstitchedGPos", by class "GenomicRanges", distance 4
## Class "StitchedGPos", by class "GenomicRanges", distance 4
## Class "UnstitchedGPos", by class "GenomicRanges", distance 4
## Class "StitchedGPos", by class "GenomicRanges", distance 4

  1. Imagine you were going to reimplement factors, dates, and data frames in S4. Sketch out the setClass() calls that you would use to define the classes. Think about appropriate slots and prototype.

  1. Imagine you were going to reimplement factors, dates, and data frames in S4. Sketch out the setClass() calls that you would use to define the classes. Think about appropriate slots and prototype.
factor(x = character(), levels, labels = levels, 
  exclude = NA, ordered = is.ordered(x), nmax = NA)
setClass("NewFactor",
         slots=c(x="character",
                 levels="integer"
           ),
         prototype = list(
           x=NA_character_,
           levels=seq_along(x)
         )
         )