Tautology: Object-oriented programming is a programming paradigm centered around objects
What's an object?
Objects are collections of data and methods
every object has a type (class)
the class of the object determines:
its attributes
how you can interact with the object
The key idea: the nature of the object tells you how you can interact with the object i.e. what functions you can use
The main reason: Polymorphism
Polymorphism allows a developer to think about a function's interface separately from its implementation.
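A minimal sketch of polymorphism in base R: the same summary() call behaves differently depending on the class of its argument. The variable names are only illustrative.

```r
# One interface, summary(), with class-specific implementations.
nums <- c(1.5, 2.5, 4)
cats <- factor(c("a", "b", "a"))

num_summary <- summary(nums)  # handled by the default method
cat_summary <- summary(cats)  # handled by summary.factor()

names(num_summary)  # quartile-style labels such as "Min." and "Max."
names(cat_summary)  # the factor levels; the values are counts per level
```

The caller only needs to know that summary() exists; which implementation runs is decided by the class of the input.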
It's complicated, mostly because there are many different ways to do it.
Different OOP systems in R include:
S3
R6
S4
What's meant by an OOP system?
In R the term object gets used in two different ways:
Everything is an object
R has object-oriented systems: S3, R6, S4
Main thing: not every object is object-oriented
No:
New base types are:
impossible for application developers to create
rarely created by R-core
OO objects have a class attribute
Telling the difference between base and OO objects
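As a rough sketch using only base R, is.object() and the class attribute can be used to check whether something is an OO object. The variable names are illustrative.

```r
# A base object has no class attribute; an OO object does.
base_obj <- 1:10
oo_obj <- factor(c("a", "b"))

is.object(base_obj)      # FALSE
is.object(oo_obj)        # TRUE

attr(base_obj, "class")  # NULL
attr(oo_obj, "class")    # "factor"

# class() still reports something for a base object,
# even though no class attribute is set.
class(base_obj)          # "integer"
```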
23 base types listed in Chapter 12:
Vectors: NULL (NILSXP), logical (LGLSXP), integer (INTSXP), double (REALSXP), complex (CPLXSXP), character (STRSXP), list (VECSXP), raw (RAWSXP)
Functions: closure (regular R functions, CLOSXP), special (internal functions, SPECIALSXP), builtin (primitive functions, BUILTINSXP)
Environments: environment (ENVSXP)
S4: S4 (S4SXP)
Language Components: symbol (aka name, SYMSXP), language (usually called calls, LANGSXP), pairlist (used for function arguments, LISTSXP), expression (EXPRSXP)
Esoteric: externalptr (EXTPTRSXP), weakref (WEAKREFSXP), bytecode (BCODESXP), promise (PROMSXP), ... (DOTSXP), any (ANYSXP)
CRAN's R Internals Guide also lists:
2 previously used base types for internal factors and ordered factors have been withdrawn
R internally uses a type CHARSXP to represent strings
numeric and base types
Sometimes used to mean the double type
In S3 and S4, can be used to mean either integer or double
is.numeric() is used to identify objects that behave like numbers. (As opposed to whether or not their type is integer.)
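The different meanings of "numeric" can be seen in a short sketch. The variable names are illustrative.

```r
dbl <- 1
int <- 1L
fct <- factor("x")

typeof(dbl)      # "double"
typeof(int)      # "integer"

# Both doubles and integers count as numeric.
is.numeric(dbl)  # TRUE
is.numeric(int)  # TRUE

# A factor is stored as an integer but does not behave like a number.
typeof(fct)      # "integer"
is.numeric(fct)  # FALSE
```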
S3 is R's oldest OO system:
minimalist
very flexible
most-commonly used system in CRAN packages
the only OOP system used in the base and stats packages
quite different from most object-oriented systems in widely used languages
What do you need for an S3 object?
a base type with a class attribute
that's it
There are no checks for correctness in S3
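A minimal sketch of both points: attaching a class attribute is enough to make an S3 object, and nothing checks that the class makes sense. The class name my_class is hypothetical.

```r
# Any base object plus a class attribute is an S3 object.
x <- structure(list(), class = "my_class")
class(x)                     # "my_class"
inherits(x, "my_class")      # TRUE

# Nothing stops you from assigning a class that makes no sense:
# this integer vector now claims to be a fitted linear model.
not_a_model <- structure(1:3, class = "lm")
inherits(not_a_model, "lm")  # TRUE, although it is not a model at all
```

Methods written for lm objects will generally fail on not_a_model, which is why the constructor and validator conventions matter.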
f <- factor(c("a", "b", "c"))
typeof(f)
[1] "integer"
attributes(f)
$levels
[1] "a" "b" "c"

$class
[1] "factor"
unclass
You can get the underlying base object of an S3 object by stripping its class attribute with unclass()
unclass(f)
[1] 1 2 3
attr(,"levels")
[1] "a" "b" "c"
In the wild, S3 objects come in different styles:
Vector style objects: length(x) represents the number of observations in the vector
Record style objects
Data frames
Scalar objects: use a list to represent a single thing
The chapter emphasises illustrating concepts using vector style objects.
A POSIXlt object consists of 11 vectors: sec, min, hour, mday, mon, year, wday, yday, isdst, zone, gmtoff
x <- as.POSIXlt(ISOdatetime(2020, 1, 1, 0, 0, 1:3))
unclass(x)[1:6]
$sec
[1] 1 2 3

$min
[1] 0 0 0

$hour
[1] 0 0 0

$mday
[1] 1 1 1

$mon
[1] 0 0 0

$year
[1] 120 120 120

unclass(x)[7:11]
$wday
[1] 3 3 3

$yday
[1] 0 0 0

$isdst
[1] 0 0 0

$zone
[1] "AST" "AST" "AST"

$gmtoff
[1] -14400 -14400 -14400

The attributes of a POSIXlt object
attributes(x)
$names
 [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"
 [9] "isdst"  "zone"   "gmtoff"

$class
[1] "POSIXlt" "POSIXt"

$tzone
[1] ""    "AST" " "
lm
mod <- lm(mpg ~ wt, data = mtcars)
Advanced R recommends a 3-level structure for defining S3 classes:
constructor, new_myclass(): efficiently creates new objects with the correct structure.
validator, validate_myclass(): performs more computationally expensive checks to ensure that the object has correct values.
helper, myclass(): provides a convenient way for others to create objects of your class.
new_factor <- function(x = integer(), levels = character()) {
  stopifnot(is.integer(x))
  stopifnot(is.character(levels))

  structure(
    x,
    levels = levels,
    class = "factor"
  )
}

validate_factor <- function(x) {
  values <- unclass(x)
  levels <- attr(x, "levels")

  if (!all(!is.na(values) & values > 0)) {
    stop(
      "All `x` values must be non-missing and greater than zero",
      call. = FALSE
    )
  }

  if (length(levels) < max(values)) {
    stop(
      "There must be at least as many `levels` as possible values in `x`",
      call. = FALSE
    )
  }

  x
}

factor <- function(x = character(), levels = unique(x)) {
  ind <- match(x, levels)
  validate_factor(new_factor(ind, levels))
}
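The same constructor, validator, and helper pattern can be sketched for a small made-up class. The class name percent and all three function names are hypothetical and exist only for this illustration.

```r
# Constructor: cheap checks on the base type only.
new_percent <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "percent")
}

# Validator: more expensive checks on the values themselves.
validate_percent <- function(x) {
  values <- unclass(x)
  if (any(is.na(values) | values < 0 | values > 100)) {
    stop("All values must be non-missing and between 0 and 100", call. = FALSE)
  }
  x
}

# Helper: user-friendly entry point that coerces input, then validates.
percent <- function(x = double()) {
  validate_percent(new_percent(as.double(x)))
}

p <- percent(c(10L, 55L))  # integers are coerced to double by the helper
class(p)                   # "percent"

# Invalid values are rejected by the validator.
bad <- tryCatch(percent(150), error = function(e) conditionMessage(e))
bad
```

Keeping the constructor strict and cheap while the helper does the coercion means package code can call new_percent() directly when it already knows the input is valid.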
To interact with an S3 object, you have to use functions. There are two types of functions involved:
generic functions:
methods:
print
We can use sloop::s3_methods_generic to see the methods associated with a generic function.
sloop::s3_methods_generic("print")
We can see that there are many methods defined that can potentially be called when the generic print function is called.
sloop::s3_dispatch
library(sloop)
x <- matrix(1:10, nrow = 2)
s3_dispatch(mean(x))
   mean.matrix
   mean.integer
   mean.numeric
=> mean.default
s3_dispatch(print(ordered("x")))
   print.ordered
=> print.factor
 * print.default
s3_dispatch(print(Sys.time()))
=> print.POSIXct
   print.POSIXt
 * print.default
If you want to write your own method, there are 2 cases:
There's a pre-existing generic function: create a method named generic.class
There is not a pre-existing generic function:
Create a new generic
Create a method
my_new_generic <- function(x) {
  UseMethod("my_new_generic")
}
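A small sketch of the second case, creating a new generic and then methods for it. The generic name describe and its methods are hypothetical.

```r
# A new generic: its only job is to call UseMethod().
describe <- function(x, ...) {
  UseMethod("describe")
}

# Methods are ordinary functions named generic.class.
describe.default <- function(x, ...) {
  "something else"
}

describe.factor <- function(x, ...) {
  paste("a factor with", nlevels(x), "levels")
}

describe(factor(c("a", "b")))  # uses describe.factor()
describe(1:3)                  # falls back to describe.default()
```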
How UseMethod works
UseMethod():
creates a vector of method names, paste0("generic", ".", c(class(x), "default"))
looks for each member of the vector in turn
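The two steps above can be imitated by hand as a rough sketch. This is only an approximation of what UseMethod() does internally, using utils::getS3method() to check which candidates exist.

```r
x <- Sys.time()

# Step 1: build the vector of candidate method names.
classes <- c(class(x), "default")
candidates <- paste0("print", ".", classes)
candidates

# Step 2: look for each candidate in turn.
has_method <- vapply(
  classes,
  function(cls) !is.null(utils::getS3method("print", cls, optional = TRUE)),
  logical(1)
)
names(has_method) <- candidates

# The first candidate that exists is the one dispatch would pick.
candidates[has_method][1]
```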
Example:
x <- Sys.time()
class(x)
[1] "POSIXct" "POSIXt"
s3_dispatch(sum(Sys.time()))
   sum.POSIXct
   sum.POSIXt
   sum.default
=> Summary.POSIXct
   Summary.POSIXt
   Summary.default
-> sum (internal)
the class attribute can be a vector
if a method is not found in the first item in the class vector, R will look for a method for the second and so on ...
a method can delegate work by calling NextMethod(). s3_dispatch() reports delegation with ->
First, create a vector of the class ordered
my_vector <- ordered(c("x", "y"))
class(my_vector)
[1] "ordered" "factor"
ordered is a subclass of factor
Now subset the created vector
s3_dispatch(my_vector[1])
   [.ordered
=> [.factor
   [.default
-> [ (internal)
[.ordered was not available, so R moved on to [.factor, which then delegated to the internal [
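Delegation with NextMethod() can be sketched with a made-up class. The class name tagged, the constructor new_tagged(), and the tag attribute are all hypothetical.

```r
# Hypothetical constructor for a vector carrying a "tag" attribute.
new_tagged <- function(x, tag = "demo") {
  structure(x, tag = tag, class = "tagged")
}

# The method delegates the actual subsetting to the internal `[` via
# NextMethod(), then rebuilds the object so the class and tag survive.
`[.tagged` <- function(x, i) {
  new_tagged(NextMethod(), tag = attr(x, "tag"))
}

x <- new_tagged(c(10, 20, 30), tag = "lengths")
y <- x[2:3]

class(y)        # "tagged"
attr(y, "tag")  # "lengths"
as.numeric(y)   # 20 30
```

The method never reimplements subsetting itself; it only adds the class-specific part on top of the inherited behaviour.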
S3 imposes no constraints on the relationship between sub and superclasses
Recommended practice
The base type of the subclass should be the same as the superclass.
The attributes of the subclass should be a superset of the attributes of the superclass.
There are a few situations where method dispatch gets weird:
internal generics
group generics
double dispatch
The class() of a base object does not uniquely determine the method called
x1 <- 1:5
class(x1)
[1] "integer"
s3_dispatch(mean(x1))
   mean.integer
   mean.numeric
=> mean.default
x2 <- structure(x1, class = "integer")
class(x2)
[1] "integer"
s3_dispatch(mean(x2))
   mean.integer
=> mean.default
Dispatch is actually done using the implicit class.
The implicit class is based on:
The string “array” or “matrix” if the object has dimensions
The result of typeof() with a few minor tweaks
The string “numeric” if the object is “integer” or “double”
We can use sloop::s3_class to get the implicit class
s3_class(x1)
[1] "integer" "numeric"
s3_class(x2)
[1] "integer"
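The effect of the implicit class on dispatch can be sketched without sloop, using a made-up generic. The names kind, kind.numeric, and kind.default are hypothetical.

```r
kind <- function(x) UseMethod("kind")
kind.numeric <- function(x) "numeric method"
kind.default <- function(x) "default method"

x1 <- 1:5                               # no class attribute
x2 <- structure(x1, class = "integer")  # explicit class attribute

# x1 dispatches on its implicit class, which includes "numeric".
kind(x1)  # "numeric method"

# x2 dispatches on its class attribute only, so "numeric" is never tried.
kind(x2)  # "default method"
```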
Some base functions, like [, sum(), and cbind(), are called internal generics
they don’t call UseMethod()
instead they call the C functions DispatchGroup() or DispatchOrEval()
s3_dispatch(Sys.time()[1])
=> [.POSIXct
   [.POSIXt
   [.default
-> [ (internal)
There are 4 group generics:
Math: abs, sign, sqrt, floor, cos, sin, log, etc.
Ops: +, -, *, /, ^, %%, %/%, &, |, !, ==, !=, <, <=, >=, and >.
Summary: all, any, sum, prod, min, max, and range.
Complex: Arg, Conj, Im, Mod, Re
My understanding: you write Math.class (or Ops.class, Summary.class, or Complex.class) and it becomes a candidate if any of the group members gets called on your class.
Defining a single group generic for your class overrides the default behaviour for all of the members of the group.
Most group generics involve a call to NextMethod()
Math.difftime <- function(x, ...) {
  new_difftime(NextMethod(), units = attr(x, "units"))
}
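A sketch of a single group generic method covering several functions at once. The class name metres, the constructor new_metres(), and the last_op attribute are hypothetical and exist only to make the dispatch visible.

```r
# Hypothetical constructor for a simple numeric class.
new_metres <- function(x = double()) {
  structure(x, class = "metres")
}

# One method handles every member of the Math group.
Math.metres <- function(x, ...) {
  out <- new_metres(unclass(NextMethod()))
  attr(out, "last_op") <- .Generic  # record which group member was called
  out
}

a <- sqrt(new_metres(c(4, 9)))
b <- floor(new_metres(2.7))

attr(a, "last_op")  # "sqrt"
attr(b, "last_op")  # "floor"
as.numeric(a)       # 2 3
```

Inside a group generic method, .Generic holds the name of the function that was actually called, and NextMethod() passes the work on to the internal implementation.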
double dispatch: a special dispatch procedure required by members of the Ops group because:
Many of the functions in the Ops groups are binary operators
the method called should make sense considering both operands
many operators should be commutative
Example:
date <- as.Date("2017-01-01")
integer <- 1L
date + integer
[1] "2017-01-02"
integer + date
[1] "2017-01-02"
Look up the method for each operand. There are 3 cases:
The methods are the same: use that method
The methods are different: use the internal method with a warning.
One method is internal: use the other method.
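The third case can be sketched with a made-up class: when only one operand has a method, that method is used whichever side the object is on. The class name metres and the constructor new_metres() are hypothetical.

```r
# Hypothetical constructor for a simple numeric class.
new_metres <- function(x = double()) {
  structure(x, class = "metres")
}

# A method for + on the hypothetical class.
`+.metres` <- function(e1, e2) {
  new_metres(unclass(e1) + unclass(e2))
}

# The plain number only has the internal method, so `+.metres`
# is chosen in both orders.
left <- new_metres(2) + 1
right <- 1 + new_metres(2)

class(left)   # "metres"
class(right)  # "metres"
```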