Chapter 13 S3
13.2 Basics
What’s a real use case for unclass
?
I wanted to pass a paletteer palette to leaflet::colorNumeric
which ends up calling the S3 method leaflet:::toPaletteFunc
but that only has methods defined for character function and matrix . The paletteer output is a character vector with class color and so colorNumeric(palette, domain)
failed. Simple solution: colorNumeric(unclass(palette), domain)
13.3 Classes
Can we come up with a simple example for a constructor
, validator
, and helper
?
IPA Constructor
Create the custom class
IPA Validator
Function to make sure you can use your custom class
IPA Helper
Calls both the validator and the constructor
13.3.4.3 Exercises
Can we go over the factor source code, commenting it line for line?
# using the example:
x <- c("one", "two", "three", "one")
factor <- function (x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA) {
# if character vector is null set to character()
if (is.null(x))
x <- character()
# nx - names of x
# in our case NULL
nx <- names(x)
# we didn't supply levels so it's missing
if (missing(levels)) {
# set y to the unique names found in the vector
# nmax by default is NA but this is an argument
# for unique setting the max number of unique values
y <- unique(x, nmax = nmax)
# get the order of the unique values
# defaults to alphabetical
ind <- order(y)
# put our vector in order using only unique values
levels <- unique(as.character(y)[ind])
}
# because the function arg is a promise
# we need to eval now to use it? I think?
force(ordered) # FALSE
# if x isn't a character string make it one
if (!is.character(x))
x <- as.character(x)
# match the number of levels with NAs
# then if they are NA it will resolve to TRUE
# only return the TRUE levels
# this is how we get rid of NAs!
levels <- levels[is.na(match(levels, exclude))]
# get the ordered levels
f <- match(x, levels)
# if the names aren't null give f those names
if (!is.null(nx))
names(f) <- nx
# if missing levels make levels back into a character string?
# why?
if (missing(labels)) {
levels(f) <- as.character(levels)
}
# otherwise get the legenth of your labels
# if its the same as the levels
else {
nlab <- length(labels)
if (nlab == length(levels)) {
# xlevls is labels as character
# then we get the unique values of that
# and set that to xlevs
nlevs <- unique(xlevs <- as.character(labels))
# this is null
at <- attributes(f)
# attr levels
at$levels <- nlevs
# match the levels with ordered levels
# and put the levels back in numeric order?
# I think?
f <- match(xlevs, nlevs)[f]
# now set the attributes to the numeric levels
attributes(f) <- at
}
# if you only have one lavel
else if (nlab == 1L)
# add numerics to the end of the label?
levels(f) <- paste0(labels, seq_along(levels))
# otherwise invalid length
else stop(gettextf("invalid 'labels'; length %d should be 1 or %d",
nlab, length(levels)), domain = NA)
}
# add ordered and factor to f's class list
class(f) <- c(if (ordered) "ordered", "factor")
f
}
factor(x)
13.3.4.4 Exercises
Now that we’ve gone through the source code, I don’t see where the contrasts attibute comes into play? We can look at the C()
function but where is this used within factor()
?
It is a little misleading to say that factors “have an … attribute”, where I really think he means that “some functions might check if that attribute has been assigned”.
Make a factor:
## [1] 1 2
## Levels: 1 2
Set contrasts attribute:
## [1] 1 2
## attr(,"contrasts")
## unordered
## contr.treatment
## Levels: 1 2
## [1] 1 2
## Levels: 1 2
dang’t where did those contrasts go!
13.4.1 Generics
In the example:
The book states: “If you wonder why we have to repeat my_new_generic twice, think back to Section 6.2.3”
Can we go over why?
I think he’s referring to the fact that you have to use a string representation inside the UseMethod call.
I’m guessing part of the answer to this is baked into this warning on the UseMethod function docs:
Warning: Prior to R 2.1.0 UseMethod accepted a call with no arguments and tried to deduce the generic from the context.
Basically, you’re specifying the name of the function and then telling R to go find this generic function’s methods. You can’t self-reference the name of the generic after it’s been assigned to itself?
Maybe the name doesn’t exist!
test.default <- function() {
"test"
}
# not sure why you would want to do this, but you can
(function() {
UseMethod("test")
})()
## [1] "test"
Also because you might have assigned your generic to a new name, how would it know to use the true “generic” name.
superuglygenericfunction <- function() {
UseMethod("superuglygenericfunction")
}
superuglygenericfunction.default <- function() {
"ugh"
}
nice <- function() {
UseMethod("superuglygenericfunction")
}
nice()
## [1] "ugh"
13.4.4.5 Exercises
Can we explain what’s happening here in our own words I was a little confused with the solutions manual? g(x)
uses 1 from the .default
method but 10
from the generic? Why?
g <- function(x) {
x <- 10
y <- 10
UseMethod("g")
}
g.default <- function(x) c(x = x, y = y)
x <- 1
y <- 1
g(x)
x y
1 10
UseMethod creates a new function call with arguments matched as they came in to the generic. Any local variables defined before the call to UseMethod
are retained (unlike S).
When you call g(x)
even though you appear to be ignoring it in your definition of g
and assigning x <- 10
, when you call UseMethod
it will basically match the x
in the g.default
formals to the x
that was passed in. The y
isn’t matched to any argument and is retained.
Tyler Grant Smith:m: 13 minutes ago r g <- function(x) { x <- 10 y <- 10 UseMethod(“g”) } g.default <- function(x = NULL) c(x = x, y = y) x <- 1 y <- 1 # this is interesting behavior g() #> y #> 10 g(x) #> x y #> 1 10
13.5 Object styles
Unfortunately, describing the appropriate use of each of these object styles is beyond the scope of this book.
Can we use the vctrs
package in a concrete example to address some of these fringe cases and reasons why length(x)
is causing an issue?
An object style is a pattern for structuring your class. You can use any structure you want but these 4 patterns get used very often. Date and factor are both vector style for example: the core data for objects of both types are stored in a single vector. And there are some other context-appropriate things denoted by attributes in each case.
I don’t think length is an issue but when you move away from the vanilla vector style object then the meaning changes. For example, under the hood, a data frame is really a list of vectors. Each of the vectors (columns) has the same length. So if you take the length of a data frame you get the length of the list i.e. the number of columns.
## [1] 11
The result of calling length
on a data.frame isn’t necessarily intuitive: most people would probably say that the number of rows is a more natural value for the length of a data frame.
This is the same for dates: which are actually of length 3
when classes and 9
when unclassed.
## [1] 3
## [1] 11
13.6 Inheritance
Can we make our own example of a method with inheritance?
Creating an IBU calculator for beer, we can use the .ipa
method for a beer of class .wipa
:
library(sloop)
# IBUs are calculated as onces x alpha acid x boil time / constant
# we're going to use a different constant for IPAs and Stouts
# generic
IBU_calculator <- function(x) {
UseMethod("IBU_calculator")
}
# default
IBU_calculator.default <- function(x){
stop("Can only calculate IBU for beers, not ",class(x))
}
# ipa
IBU_calculator.ipa <- function(x) {
1 * x * 30 / 7.25
}
# stout
IBU_calculator.stout <- function(x) {
1.5 * x * 30 / 6
}
#### some beers ####
ipa_1 <- structure(
15, class = c("ipa","beer")
)
# define a wine to see default dispatch
wine_1 <- structure(3,class = c("pinot","wine"))
# wipa "inherits" class ipa so we can use that method!
wipa_1 <- structure(20, class = c("wipa","ipa","beer"))
s3_dispatch(IBU_calculator(ipa_1))
## => IBU_calculator.ipa
## IBU_calculator.beer
## * IBU_calculator.default
## IBU_calculator.wipa
## => IBU_calculator.ipa
## IBU_calculator.beer
## * IBU_calculator.default
## IBU_calculator.pinot
## IBU_calculator.wine
## => IBU_calculator.default
13.4.3.3 Exercises
Can we go over why this doesn’t return a1
?
generic2 <- function(x) UseMethod("generic2")
generic2.a1 <- function(x) "a1"
generic2.a2 <- function(x) "a2"
generic2.b <- function(x) {
class(x) <- "a1"
NextMethod()
}
generic2(structure(list(), class = c("b", "a2")))
[1] "a2"
generic2 <- function(x) UseMethod("generic2")
generic2.a1 <- function(x) { cat("generic2.a1\n"); cat(class(x)) }
generic2.a2 <- function(x) { cat("generic2.a2\n"); cat(class(x)) }
generic2.b <- function(x) {
cat("generic2.b\n")
cat("Regardless of what we do, NextMethod will search for the following",
"classes:\n", paste0(.Class, collapse = ", "), "\n")
class(x) <- "a1"
NextMethod()
}
x <- structure(1:1e8, class = c("b", "a2"))
generic2(x)
## generic2.b
## Regardless of what we do, NextMethod will search for the following classes:
## b, a2
## generic2.a2
## a1
13.7.1 S3 and Base Types
WHY isn’t the dispatch of mean(x1) determined by its type???
[1] "integer"
mean.integer
mean.numeric
=> mean.default
## [1] mean.Date mean.default mean.difftime mean.POSIXct mean.POSIXlt
## [6] mean.quosure*
## see '?methods' for accessing help and source code
mean doesn’t actually have a specific method defined in base R for int or numeric. the default method handles all base types, type checking along the way:
## function (x, trim = 0, na.rm = FALSE, ...)
## {
## if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
## warning("argument is not numeric or logical: returning NA")
## return(NA_real_)
## }
## if (na.rm)
## x <- x[!is.na(x)]
## if (!is.numeric(trim) || length(trim) != 1L)
## stop("'trim' must be numeric of length one")
## n <- length(x)
## if (trim > 0 && n) {
## if (is.complex(x))
## stop("trimmed means are not defined for complex data")
## if (anyNA(x))
## return(NA_real_)
## if (trim >= 0.5)
## return(stats::median(x, na.rm = FALSE))
## lo <- floor(n * trim) + 1
## hi <- n + 1 - lo
## x <- sort.int(x, partial = unique(c(lo, hi)))[lo:hi]
## }
## .Internal(mean(x))
## }
## <bytecode: 0x7fd5257a7438>
## <environment: namespace:base>
13.7.3 Internal Generics
Defining a single group generic for your class overrides the default behaviour for all of the members of the group.
Does this mean that if I define abs for my class it overrides the behaviour of sign, sqrt and all the others in the Math group?
Essentially, these are all (loosely) math-related functions, so R understands how to do these for anything it can understand as a numeric (somehow). If you’re adding a class that isn’t handled in one of these numeric-ish ways, you’ll need to tell R how to handle the mathematics. You can’t make a group like “Math”, “Ops”, “Summary”, or “Complex” and expect R to know what it’s doing.
If you define abs
for your class it won’t know how to do the other things. Suppose you defined your own min.POSIXct
then when you call min(Sys.time())
it will encounter your function, evaluate, and then that is the end. Otherwise, min would call the min Primitive which is implemented in C as found here and if you trace through that you will see that is where the GroupDispatch
is called.
13.7.4 Double Dispatch
What is happening here:
Ops.foo <- function(...) print("Called twice for some reason?")
x <- structure(2, class = "foo")
x > 0
[1] "Called twice for some reason?"
[1] "Called twice for some reason?"
Double dispatch – where it’s calling the same method on x and zero. The inequality needs to convert both to the same class.
The book mentions that we can use vctrs::vec_arith
to avoid the aboce, but does that pertain to this case? I tried: vctrs::vec_arith(">", x, 0)
But I think my understanding is still off…
The vec_arith
generic will first dispatch on class(x)
So, in this case it tries to find vec_arith.foo
but we haven’t defined that yet, so it instead finds vec_arith.default
which signals the error.
[1] TRUE
Error: <foo> > <double> is not permitted
vec_arith will dispatch to this first, because it finds the "foo" generic
vec_arith.foo <- function(op, x, y, ...) {
# now it will dispath on y, which in this case is a numeric.
UseMethod("vec_arith.foo", y)
}
vec_arith(">", x, 0)
Error in UseMethod("vec_arith.foo", y): no applicable method for 'vec_arith.foo' applied to an object of class "c('double', 'numeric')"
Uh oh another error, and this is super confusing because none of our objects are class double. (typeof(x) is “double”). The problem is that vec_arith.foo further dispatches on the second argument and it is looking for vec_arith.numeric
or vec_arith.default
, but we haven’t defined that yet.
vec_arith.foo.numeric <- function(op, x, y, ...) {
vctrs::vec_arith_base(op, x, y, ...)
}
vec_arith(">", x, 0)
[1] TRUE
Error: <double> > <foo> is not permitted
now vec_arith will first dispatch on the 0 and scope to vec_arith.numeric which will then try to dispatch to vec_arith.numeric.foo which we havent defined yet.
vec_arith.numeric.foo <- function(op, x, y, ...) {
vctrs::vec_arith_base(op, x, y, ...)
}
vec_arith(">", 0, x)
[1] FALSE
What is happening here?!
Math.bar <- function(x) {
print("Math bar")
NextMethod(.Generic)
}
Math.foo <- function(x) {
print("Math foo")
NextMethod(.Generic)
}
x <- 1
class(x) <- c("foo", "bar", class(x))
abs(x)
`abs.foo` <- function(x) {
print("Generic foo")
NextMethod()
}
abs(x)
`abs.bar` <- function(x) {
print("Generic bar")
NextMethod()
}
abs(x)
rm(abs.foo)
# infinite recursion continually calling abs.bar, only calling Math.foo once
abs(x)
rm(Math.foo)
# back to working
abs(x)