Managing data flow
inputs -> Black Box -> outputs
Try to follow base R behaviour (even though R functions are sometimes inconsistent with one another).
9.0.1 Exercise 9.1
sample.int("Spanish Inquisition")
# Error: a character string cannot be interpreted as a count
seq_along(list(1, 2, 3))
# [1] 1 2 3
rep("bob", -1)
# Error: invalid 'times' argument
rep("bob", TRUE)
# [1] "bob"
rep("bob", FALSE)
# character(0)
strrep("bob", NA)
# [1] NA
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))
class::knn(train, test, cl, k = numeric(0), prob=TRUE)
# Error (a zero-length `k` is not a valid number of neighbours)
Even if some functions check data integrity for us, what happens when we combine multiple functions?
round_rand <- function(n, d)
{
    x <- runif(n)  # runif will check if `n` makes sense
    round(x, d)    # round will determine the appropriateness of `d`
}
Solution:
round_rand2 <- function(n, d)
{
    # validate both arguments up front, before any work is done
    stopifnot(
        is.numeric(n), length(n) == 1,
        is.finite(n), n > 0, n == floor(n),
        is.numeric(d), length(d) == 1,
        is.finite(d), d > 0, d == floor(d)
    )
    x <- runif(n)
    round(x, d)
}
round_rand2(3, -1)
# Error: d > 0 is not TRUE
In some cases you may want to be more permissive, for example by accepting `d <= 0`, which `round()` itself allows.
9.0.2 Exercise 9.6
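A more permissive variant could look like the sketch below. The name `round_rand3` and the default `d = 0` are illustrative assumptions, not part of the original exercise; the checks simply relax the conditions that base R's `runif()` and `round()` do not impose themselves.

```r
# Hypothetical permissive variant: `d` may be any whole number
# (round() accepts zero and negative digits) and `n` may be 0.
round_rand3 <- function(n, d = 0)
{
    stopifnot(
        is.numeric(n), length(n) == 1,
        is.finite(n), n >= 0, n == floor(n),
        is.numeric(d), length(d) == 1,
        is.finite(d), d == floor(d)
    )
    x <- runif(n)
    round(x, d)
}

round_rand3(3, -1)  # three zeros: values in (0, 1) rounded to the nearest 10
round_rand3(0, 2)   # numeric(0)
```

Type and length checks stay in place, so clearly invalid input such as `round_rand3("a", 1)` is still rejected before any random numbers are drawn.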
A vectorised mathematical function:
- empty vector? non-numeric input? -> No
- what if it is equipped with the names attribute? -> should not mind them
An aggregation function:
- what about missing values? -> warning, plus an `na.rm = TRUE` option
- empty vector? -> No, error
A function vectorised with regard to two arguments:
- vectorised, but unsure whether elementwise
- vector vs vector of the same length allowed? -> yes, but with warnings
A function vectorised with respect to all arguments: No, see `na.rm = TRUE`
A function vectorised with respect to the first argument but not the second: plenty of cases
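As a rough illustration of the first two cases, here is a minimal sketch. Both function names (`logistic`, `range_width`) are hypothetical, and the sketch reads "should not mind them" as "names are carried over without special treatment" and "No" as "reject non-numeric input"; other readings of the notes are possible.

```r
# Hypothetical vectorised mathematical function: rejects non-numeric input;
# arithmetic carries any names attribute through to the result.
logistic <- function(x)
{
    stopifnot(is.numeric(x))
    1 / (1 + exp(-x))
}

# Hypothetical aggregation function: errors on empty input, warns about
# missing values unless the caller opts in with na.rm = TRUE.
range_width <- function(x, na.rm = FALSE)
{
    stopifnot(is.numeric(x),
              is.logical(na.rm), length(na.rm) == 1, !is.na(na.rm))
    if (na.rm) x <- x[!is.na(x)]
    if (length(x) == 0) stop("`x` has no values to aggregate")
    if (anyNA(x)) {
        warning("missing values in `x`; returning NA (consider na.rm = TRUE)")
        return(NA_real_)
    }
    as.numeric(max(x) - min(x))  # always a single unnamed double
}

logistic(c(a = 0))                      # named result: a = 0.5
range_width(c(3, NA, 1), na.rm = TRUE)  # 2
```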
9.0.3 Putting outputs into context
We ought to generate outputs of a predictable kind, i.e. a function should always return the same kind of object and do one task.
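One way to see why this matters is to compare base R's `sapply()` and `vapply()`. The wrappers below (`lens_sapply`, `lens_vapply`) are hypothetical names used only for this sketch; the point is that the kind of object `sapply()` returns depends on its input, whereas `vapply()` fixes it in advance.

```r
# Hypothetical wrappers computing string lengths in two ways
lens_sapply <- function(x) sapply(x, nchar)
lens_vapply <- function(x) vapply(x, nchar, integer(1), USE.NAMES = FALSE)

is.integer(lens_sapply(c("a", "bb")))   # TRUE
is.list(lens_sapply(character(0)))      # TRUE: empty input yields a list instead
is.integer(lens_vapply(c("a", "bb")))   # TRUE
is.integer(lens_vapply(character(0)))   # TRUE: same kind of object either way
```

Code calling `lens_vapply()` can rely on getting an integer vector for any character input, including the empty one.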