S3

The S3 class is R’s first and simplest functional OOP system

Learning Objectives

  1. Recognize S3 objects and classes.
  2. Assign S3 classes to objects.
  3. Create S3 generics and methods.
  4. Implement advanced features of S3.

What is an S3 object?

S3 objects have a class attribute

An S3 object has an attribute called “class” with at least one value.

f <- factor(c("a", "b", "c"))
attributes(f)
#> $levels
#> [1] "a" "b" "c"
#> 
#> $class
#> [1] "factor"

S3 generics provide an interface to a particular behavior

  • Generic function (or “generic”): Function that can be customized for different object classes.
  • print() is an S3 generic for displaying objects depending on their class attribute.

print.factor() is a print() method for factors

Method: A function customized for a particular class.

print.factor # or sloop::s3_get_method("print.factor")
#> function (x, quote = FALSE, max.levels = NULL, width = getOption("width"), 
#>     ...) 
#> {
#>     ord <- is.ordered(x)
#>     if (length(x) == 0L) 
#>         cat(if (ord) 
#>             "ordered"
#>         else "factor", "()\n", sep = "")
#>     else {
#>         xx <- character(length(x))
#>         xx[] <- as.character(x)
#>         keepAttrs <- setdiff(names(attributes(x)), c("levels", 
#>             "class"))
#>         attributes(xx)[keepAttrs] <- attributes(x)[keepAttrs]
#>         print(xx, quote = quote, ...)
#>     }
#>     maxl <- max.levels %||% TRUE
#>     if (maxl) {
#>         n <- length(lev <- encodeString(levels(x), quote = ifelse(quote, 
#>             "\"", "")))
#>         colsep <- if (ord) 
#>             " < "
#>         else " "
#>         T0 <- "Levels: "
#>         if (is.logical(maxl)) 
#>             maxl <- {
#>                 width <- width - (nchar(T0, "w") + 3L + 1L + 
#>                   3L)
#>                 lenl <- cumsum(nchar(lev, "w") + nchar(colsep, 
#>                   "w"))
#>                 if (n <= 1L || lenl[n] <= width) 
#>                   n
#>                 else max(1L, which.max(lenl > width) - 1L)
#>             }
#>         drop <- n > maxl
#>         cat(if (drop) 
#>             paste(format(n), ""), T0, paste(if (drop) 
#>             c(lev[1L:max(1, maxl - 1)], "...", if (maxl > 1) lev[n])
#>         else lev, collapse = colsep), "\n", sep = "")
#>     }
#>     if (!isTRUE(val <- .valid.factor(x))) 
#>         warning(val)
#>     invisible(x)
#> }
#> <bytecode: 0x0000025824b39540>
#> <environment: namespace:base>

How does one create an S3 class?

Assign an S3 class to a new object with structure() or class()<-

x <- structure(list(), class = "my_class")

x <- list()
class(x) <- "my_class"
attributes(x)
#> $class
#> [1] "my_class"

But currently, this is pretty useless…

Create new S3 classes with 3 functions

  1. A low-level constructor, new_myclass(), that efficiently creates new objects with the correct structure.

  2. A validator, validate_myclass(), that performs more computationally expensive checks to ensure that the object has correct values.

  3. A user-friendly helper, myclass(), that provides a convenient way for others to create objects of your class.

1. new_factor() creates the new S3 object

The developer-facing function applies the correct structure for the object.

new_factor <- function(x = integer(), levels = character()) {
  stopifnot(is.integer(x))
  stopifnot(is.character(levels))

  structure(
    x,
    levels = levels,
    class = "factor"
  )
}

new_factor(1:3, c("a","b","c"))
#> [1] a b c
#> Levels: a b c

2. validate_factor() provides assurances on correctness

Though computationally expensive, the checks ensure correct values before creating the S3 object.

validate_factor <- function(x) {
  values <- unclass(x)
  levels <- attr(x, "levels")

  if (!all(!is.na(values) & values > 0)) {
    stop(
      "All `x` values must be non-missing and greater than zero",
      call. = FALSE
    )
  }

  if (length(levels) < max(values)) {
    stop(
      "There must be at least as many `levels` as possible values in `x`",
      call. = FALSE
    )
  }

  x
}

validate_factor(new_factor(1:5, "a"))
#> Error: There must be at least as many `levels` as possible values in `x`
validate_factor(new_factor(0:1, "a"))
#> Error: All `x` values must be non-missing and greater than zero

3. factor() provides a safe approach for users to create objects

The user-facing validates the values before creating the S3 object.

factor <- function(x = character(), levels = unique(x)) {
  ind <- match(x, levels)
  new_factor(ind, levels) |> 
    validate_factor()
}

factor(c("a", "a", "b"))
#> [1] a a b
#> Levels: a b

How do generics perform dispatching?

summary() is an S3 generic that behaves based on an object’s class

sloop::is_s3_generic("summary")
#> [1] TRUE
summary
#> function (object, ...) 
#> UseMethod("summary")
#> <bytecode: 0x0000025822139b98>
#> <environment: namespace:base>
sloop::s3_methods_generic(
  "summary"
) |> 
  knitr::kable()
generic class visible source
summary aov TRUE stats
summary aovlist FALSE registered S3method
summary aspell FALSE registered S3method
summary check_packages_in_dir FALSE registered S3method
summary connection TRUE base
summary data.frame TRUE base
summary Date TRUE base
summary default TRUE base
summary difftime TRUE base
summary ecdf FALSE registered S3method
summary factor TRUE base
summary glm TRUE stats
summary infl FALSE registered S3method
summary lm TRUE stats
summary loess FALSE registered S3method
summary manova TRUE stats
summary matrix TRUE base
summary mlm FALSE registered S3method
summary nls FALSE registered S3method
summary packageStatus FALSE registered S3method
summary POSIXct TRUE base
summary POSIXlt TRUE base
summary ppr FALSE registered S3method
summary prcomp FALSE registered S3method
summary princomp FALSE registered S3method
summary proc_time TRUE base
summary rlang:::list_of_conditions FALSE registered S3method
summary rlang_error FALSE registered S3method
summary rlang_message FALSE registered S3method
summary rlang_trace FALSE registered S3method
summary rlang_warning FALSE registered S3method
summary srcfile TRUE base
summary srcref TRUE base
summary stepfun TRUE stats
summary stl FALSE registered S3method
summary table TRUE base
summary tukeysmooth FALSE registered S3method
summary vctrs_sclr FALSE registered S3method
summary vctrs_vctr FALSE registered S3method
summary warnings TRUE base

summary() is an interface to different methods

  • Polymorphism: a single interface to different behaviors.

  • However, summary() does not have a consistent output class.

a <- iris$Sepal.Length
class(a)
#> [1] "numeric"
(a_summary <- summary(a))
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   4.300   5.100   5.800   5.843   6.400   7.900
sloop::s3_class(a_summary)
#> [1] "summaryDefault" "table"
b <- iris$Species
class(b)
#> [1] "factor"
(b_summary <- summary(b))
#>     setosa versicolor  virginica 
#>         50         50         50
sloop::s3_class(b_summary)
#> [1] "integer" "numeric"

The custom summarizer() S3 generic outputs a consistent object

  • The goal is to summarize characters, factors, and numeric objects
  • The output should always be a length one character object

Create a new S3 generic following two rules

  1. Only write a method if you own the generic. Otherwise, it is considdered bad manners.
  2. The method must have the same arguments as its generic–with one important exception: ...

summarizer() includes UseMethod for dispatching

summarizer <- function(x, probs = c(0.25, 0.5, 0.75)) {
  UseMethod("summarizer")
}

Caution

You don’t pass any of the arguments of the generic to UseMethod(); it uses deep magic to pass to the method automatically. The precise process is complicated and frequently surprising, so you should avoid doing any computation in a generic. See ?UseMethod for details.

summarizer() outputs consistent summary strings

Encapsulation: Bundle data and methods into a single object or “unit”.

summarizer.character <- function(x, probs = c(0.25, 0.5, 0.75)) {
  tab <- sort(table(x), decreasing = TRUE)
  # Format as "value (n)"
  pairs <- paste0(names(tab), " (", tab, ")")
  paste(pairs, collapse = ", ")
}
summarizer.numeric <- function(x, probs = c(0.25, 0.5, 0.75)) {
  # Remove NAs to avoid warnings
  x_no_na <- x[!is.na(x)]
  
  qs <- quantile(x_no_na, probs = probs, names = FALSE)
  names(qs) <- paste0("Q", seq_along(qs))

  stats <- c(
    Min = min(x_no_na),
    qs,
    Max = max(x_no_na)
  )
  
  paste0(
    paste(names(stats), format(stats), sep = ": ", collapse = ", ")
  )
}
summarizer.factor <- function(x, probs = c(0.25, 0.5, 0.75)) {
  # Just delegate to character method on the labels
  summarizer(as.character(x), probs = probs)
}
summarizer.default <- function(x, probs = c(0.25, 0.5, 0.75)) {
  cls <- paste(class(x), collapse = ", ")
  paste0("No summarizer method for class: ", cls)
}
summarizer(c("apple", "apple", "pear", "pear", "pear", "orange"))
#> [1] "pear (3), apple (2), orange (1)"
summarizer(factor(c("a","b","b","a","c","c","c")))
#> [1] "c (3), a (2), b (2)"
summarizer(c(1,2,3,4,5,6,10))
#> [1] "Min:  1.0, Q1:  2.5, Q2:  4.0, Q3:  5.5, Max: 10.0"
summarizer(c(1,50,100,250), probs = c(.1, .9))
#> [1] "Min:   1.0, Q1:  15.7, Q2: 205.0, Max: 250.0"
iris |> 
  vapply(
    summarizer,
    character(1)
  )
#>                                         Sepal.Length 
#>      "Min: 4.3, Q1: 5.1, Q2: 5.8, Q3: 6.4, Max: 7.9" 
#>                                          Sepal.Width 
#>      "Min: 2.0, Q1: 2.8, Q2: 3.0, Q3: 3.3, Max: 4.4" 
#>                                         Petal.Length 
#> "Min: 1.00, Q1: 1.60, Q2: 4.35, Q3: 5.10, Max: 6.90" 
#>                                          Petal.Width 
#>      "Min: 0.1, Q1: 0.3, Q2: 1.3, Q3: 1.8, Max: 2.5" 
#>                                              Species 
#>       "setosa (50), versicolor (50), virginica (50)"
survival::lung |> 
  vapply(
    summarizer,
    character(1)
  )
#>                                                                inst 
#>                          "Min:  1, Q1:  3, Q2: 11, Q3: 16, Max: 33" 
#>                                                                time 
#> "Min:    5.00, Q1:  166.75, Q2:  255.50, Q3:  396.50, Max: 1022.00" 
#>                                                              status 
#>                               "Min: 1, Q1: 1, Q2: 2, Q3: 2, Max: 2" 
#>                                                                 age 
#>                          "Min: 39, Q1: 56, Q2: 63, Q3: 69, Max: 82" 
#>                                                                 sex 
#>                               "Min: 1, Q1: 1, Q2: 1, Q3: 2, Max: 2" 
#>                                                             ph.ecog 
#>                               "Min: 0, Q1: 0, Q2: 1, Q3: 1, Max: 3" 
#>                                                            ph.karno 
#>                     "Min:  50, Q1:  75, Q2:  80, Q3:  90, Max: 100" 
#>                                                           pat.karno 
#>                     "Min:  30, Q1:  70, Q2:  80, Q3:  90, Max: 100" 
#>                                                            meal.cal 
#>                "Min:   96, Q1:  635, Q2:  975, Q3: 1150, Max: 2600" 
#>                                                             wt.loss 
#>      "Min: -24.00, Q1:   0.00, Q2:   7.00, Q3:  15.75, Max:  68.00"

The next method is used for missing summarizer() generics

Inheritance: method dispatching through the “class” attribute.

a <- iris$Species
summarizer(a)
#> [1] "setosa (50), versicolor (50), virginica (50)"
(class(a) <- c("my_class",class(a)))
#> [1] "my_class" "factor"
summarizer(a)
#> [1] "setosa (50), versicolor (50), virginica (50)"

What are advanced implementations of S3 classes?

S3 classes can conceal data elements from length() for concise representation

  • Record style objects use a list of equal-length vectors to represent individual components of the object.
  • The best example of this is POSIXlt, which underneath the hood is a list of 11 date-time components like year, month, and day.
  • Record style classes override length() and subsetting methods to conceal this implementation detail.
x <- as.POSIXlt(ISOdatetime(2020, 1, 1, 0, 0, 1:3))
x
#> [1] "2020-01-01 00:00:01 CST" "2020-01-01 00:00:02 CST"
#> [3] "2020-01-01 00:00:03 CST"
length(x)
#> [1] 3
length(unclass(x))
#> [1] 11
x[[1]] # the first date time
#> [1] "2020-01-01 00:00:01 CST"
unclass(x)[[1]] # the first component, the number of seconds
#> [1] 1 2 3

NextMethod()delegates dispatch behavior for classes without explicit generics

secret() masks each character of the input with x in the output

new_secret <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "secret")
}

print.secret <- function(x, ...) {
  print(strrep("x", nchar(x)))
  invisible(x)
}

y <- new_secret(c(15, 1, 456))
y
#> [1] "xx"  "x"   "xxx"

There is unexpected behavior without a [ generic for the secret class

  • The [ method is problematic in that it does not preserve the secret class.
  • y[1] returns 15 as the first element instead of xx.
sloop::s3_dispatch(y[1])
#>    [.secret
#>    [.default
#> => [ (internal)
y[1]
#> [1] 15

Fix this with a [.secret method:

NextMethod() is an efficient way to subset a secret class

`[.secret` <- function(x, i) {
  # first, dispatch to `[`
  # then, coerce subset value to `secret` class
  new_secret(NextMethod())
}
  • [.secret is selected, but delegates to internal [.
  • Makes secret object for that element alone
sloop::s3_dispatch(y[1])
#> => [.secret
#>    [.default
#> -> [ (internal)
y[1]
#> [1] "xx"
  • Includes copying and infinite loop
# not run
`[.secret` <- function(x, i) {
  x <- unclass(x)
  new_secret(x[i])
}

S3 classes can inherit generics from the superclass

  • Note: You should ‘own’ the superclass constructor
new_secret <- function(x, ..., class = character()) {
  stopifnot(is.double(x))

  structure(
    x,
    ...,
    class = c(class, "secret")
  )
}
  • Simply invoke the superclass constructor inside of the subclass constructor:
new_supersecret <- function(x) {
  new_secret(x, class = "supersecret")
}

print.supersecret <- function(x, ...) {
  print(rep("xxxxx", length(x)))
  invisible(x)
}

x2 <- new_supersecret(c(15, 1, 456))
x2
#> [1] "xxxxx" "xxxxx" "xxxxx"
  • No way in base R to properly delegate methods in inheritance chain
  • vctrs::vec_restore provides proper method chaining
vec_restore.secret <- function(x, to, ...) new_secret(x)
vec_restore.supersecret <- function(x, to, ...) new_supersecret(x)
`[.secret` <- function(x, ...) {
  vctrs::vec_restore(NextMethod(), x)
}
x2[1:3]
#> [1] "xxxxx" "xxxxx" "xxxxx"