24.4 Do as little as possible
- Use a function tailored to a more specific type of input or output, or to a more specific problem:
  - `rowSums()`, `colSums()`, `rowMeans()`, and `colMeans()` are faster than equivalent invocations that use `apply()` because they are vectorised.
  - `vapply()` is faster than `sapply()` because it pre-specifies the output type.
  - `any(x == 10)` is much faster than `10 %in% x` because testing equality is simpler than testing set inclusion.
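
A minimal sketch of the three comparisons above, checking only that each specialised call returns the same result as its general counterpart. The data (`m`, `x`, `lst`) and their sizes are made up for illustration and do not come from the original text.

```r
# Hypothetical example data; the sizes are arbitrary.
set.seed(1)
m   <- matrix(runif(1e4), nrow = 100)
x   <- sample(100, 1e4, replace = TRUE)
lst <- list(1:3, 1:5)

# Specialised row sums vs. the general-purpose apply()
rs_fast <- rowSums(m)
rs_slow <- apply(m, 1, sum)

# vapply() declares the type and length of each result up front
len_v <- vapply(lst, length, integer(1))
len_s <- sapply(lst, length)

# Equality test vs. set-membership test
has_any <- any(x == 10)
has_in  <- 10 %in% x
```

To see the speed difference on your own data, wrap each pair of calls in `system.time()` or a benchmarking function such as `bench::mark()`; the gap depends on input size, so measure rather than assume.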
- Some functions coerce their inputs into a specific type. If your input is not the right type, the function has to do extra work.
  - e.g. `apply()` will always turn a data frame into a matrix.
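
The coercion performed by `apply()` can be made visible with a small sketch. The data frames `df` and `df2` are hypothetical; the point is only that the conversion happens on every call, and that a mixed-type data frame becomes a character matrix.

```r
# Hypothetical all-numeric data frame
df <- data.frame(a = 1:3, b = c(1.5, 2.5, 3.5))

# apply() converts df to a matrix internally on every call
r1 <- apply(df, 1, sum)

# Converting once and using a specialised function avoids the repeated work
m  <- as.matrix(df)
r2 <- rowSums(m)

# With mixed column types, the coercion yields a character matrix,
# so the function passed to apply() sees character vectors
df2   <- data.frame(id = 1:2, name = c("a", "b"))
is_chr <- is.character(as.matrix(df2))
types  <- apply(df2, 1, class)
```

If the data is naturally a matrix, storing it as one from the start sidesteps the conversion entirely.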
- Other examples:
  - `read.csv()`: specify known column types with `colClasses`. (Also consider switching to `readr::read_csv()` or `data.table::fread()`, which are considerably faster than `read.csv()`.)
  - `factor()`: specify known levels with `levels`.
  - `cut()`: don’t generate labels with `labels = FALSE` if you don’t need them, or, even better, use `findInterval()` as mentioned in the “see also” section of the documentation.
  - `unlist(x, use.names = FALSE)` is much faster than `unlist(x)`.
  - `interaction()`: if you only need combinations that exist in the data, use `drop = TRUE`.
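
The options listed under "Other examples" can be sketched as follows, using base R only. All inputs (`csv`, `v`, `breaks`, `lst`, `g1`, `g2`) are invented for illustration; the sketch shows what each argument does, not how much time it saves.

```r
# read.csv(): declare column types so they do not have to be guessed
csv <- "id,label\n1,a\n2,b"
d <- read.csv(text = csv, colClasses = c("integer", "character"))

# factor(): supply the levels so they are not computed from the data
f <- factor(c("b", "a", "b"), levels = c("a", "b"))

# cut() with labels = FALSE returns integer bin codes;
# findInterval() gives the same codes here. Note that the two differ at
# boundary values: cut() is right-closed by default, findInterval() left-closed.
v      <- c(0.5, 1.5, 2.5)
breaks <- c(0, 1, 2, 3)
codes  <- cut(v, breaks, labels = FALSE)
bins   <- findInterval(v, breaks)

# unlist(): skip building names when they are not needed
lst     <- list(a = 1:2, b = 3:4)
u_named <- unlist(lst)
u_plain <- unlist(lst, use.names = FALSE)

# interaction(): drop = TRUE keeps only combinations present in the data
g1 <- factor(c("x", "x", "y"))
g2 <- factor(c("p", "q", "p"))
n_all  <- nlevels(interaction(g1, g2))
n_used <- nlevels(interaction(g1, g2, drop = TRUE))
```

Before swapping `cut()` for `findInterval()`, check that the difference in boundary handling does not matter for your data.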