24.4 Do as little as possible

Use a function tailored to a more specific type of input or output, or to a more specific problem:
- rowSums(), colSums(), rowMeans(), and colMeans() are faster than equivalent invocations that use apply() because they are vectorised
- vapply() is faster than sapply() because it pre-specifies the output type
- any(x == 10) is much faster than 10 %in% x because testing equality is simpler than testing set inclusion
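The three pairs above can be checked side by side. The following is a minimal sketch, assuming made-up data (the names `m`, `x`, and `lst` are illustrative, not from the text); it confirms that each specialised call returns the same result as its general counterpart, then times one pair roughly.

```r
# Hypothetical example data: a 100 x 100 matrix, an integer vector, a list.
set.seed(1)
m <- matrix(runif(1e4), nrow = 100)
x <- sample(100, 1e4, replace = TRUE)
lst <- list(1:3, 1:5, 1:10)

# 1. Specialised row sums vs. the general apply().
rs_special <- rowSums(m)
rs_general <- apply(m, 1, sum)

# 2. vapply() declares the output type up front; sapply() has to work it out.
len_v <- vapply(lst, length, integer(1))
len_s <- sapply(lst, length)

# 3. Equality test vs. set inclusion.
has_any <- any(x == 10)
has_in  <- 10 %in% x

# Each pair agrees; only the amount of work differs.
stopifnot(
  isTRUE(all.equal(rs_special, rs_general)),
  identical(len_v, len_s),
  identical(has_any, has_in)
)

# Rough timing of one pair (numbers vary by machine and R version).
system.time(for (i in 1:100) rowSums(m))
system.time(for (i in 1:100) apply(m, 1, sum))
```

system.time() is only a coarse check; for small differences a dedicated benchmarking package gives more reliable numbers.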
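Some functions coerce their inputs into a specific type. If your input is not the right type, the function has to do extra work to convert it first. Where possible, use a function that works with your data as it is, or store the data in the type the function expects.
- e.g. apply() always converts a data frame to a matrix before doing any work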
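The coercion can be made visible with a small sketch. The data frames `num_df` and `mixed_df` are hypothetical; the point is that apply() works on the result of as.matrix(), so a mixed-type data frame ends up as a character matrix, while a column-wise vapply() uses the data frame as the list it already is.

```r
# Hypothetical data frames for illustration.
num_df   <- data.frame(a = c(1, 2, 3), b = c(1.5, 2.5, 3.5))
mixed_df <- data.frame(n = c(1, 2, 3), s = c("a", "b", "c"))

# The conversion apply() performs before calling the function:
num_mat   <- as.matrix(num_df)    # numeric matrix
mixed_mat <- as.matrix(mixed_df)  # everything becomes character

stopifnot(is.matrix(num_mat), is.numeric(num_mat))
stopifnot(is.matrix(mixed_mat), is.character(mixed_mat))

# Column sums three ways; unname() is used so only the values are compared.
via_apply   <- unname(apply(num_df, 2, sum))         # converts to matrix first
via_colsums <- unname(colSums(num_df))               # specialised function
via_vapply  <- unname(vapply(num_df, sum, numeric(1)))  # treats df as a list

stopifnot(
  isTRUE(all.equal(via_apply, via_colsums)),
  isTRUE(all.equal(via_apply, via_vapply))
)
```

If the data is always used row-wise or as a whole block of numbers, storing it as a matrix from the start avoids repeating the conversion on every call.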
Other functions do less work if you give them more information about the problem:
- read.csv(): specify known column types with colClasses. (Also consider switching to readr::read_csv() or data.table::fread() which are considerably faster than read.csv().)
- factor(): specify known levels with levels.
- cut(): if you don’t need labels, skip generating them with labels = FALSE, or, even better, use findInterval(), as mentioned in the “see also” section of the documentation.
- unlist(x, use.names = FALSE) is much faster than unlist(x).
- interaction(): if you only need combinations that exist in the data, use drop = TRUE.
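Each item in the list above amounts to passing one extra argument. The sketch below uses small, made-up inputs (`csv_text`, `v`, `breaks`, `l`, `f1`, `f2` are all illustrative names) to show the calls; it is a demonstration of the arguments, not a benchmark.

```r
# read.csv(): state the column types instead of letting R guess them.
csv_text <- "id,score\n1,2.5\n2,3.5"
d <- read.csv(text = csv_text, colClasses = c("integer", "numeric"))
stopifnot(is.integer(d$id), is.double(d$score))

# factor(): supplying levels saves R from sorting the unique values.
f <- factor(c("b", "a", "b"), levels = c("a", "b"))
stopifnot(identical(levels(f), c("a", "b")))

# cut() without labels returns integer bin codes; findInterval() does similar
# binning. Note the defaults differ at the boundaries: cut() uses (a, b]
# intervals, findInterval() uses [a, b). The values here avoid the boundaries.
v <- c(0.5, 1.5, 2.5)
breaks <- c(0, 1, 2, 3)
codes_cut <- cut(v, breaks, labels = FALSE)
codes_fi  <- findInterval(v, breaks)
stopifnot(all(codes_cut == codes_fi))

# unlist(): skip building names when they are not needed.
l <- list(a = 1:2, b = 3:4)
flat <- unlist(l, use.names = FALSE)
stopifnot(identical(flat, 1:4), is.null(names(flat)))

# interaction(): drop = TRUE keeps only combinations present in the data.
f1 <- factor(c("a", "b"))
f2 <- factor(c("x", "y"))
all_combos  <- interaction(f1, f2)
seen_combos <- interaction(f1, f2, drop = TRUE)
stopifnot(nlevels(all_combos) == 4, nlevels(seen_combos) == 2)
```

When swapping cut() for findInterval(), check how values that fall exactly on a break should be binned, since the two defaults disagree there.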