10.9 Useful Examples - Histograms and binwidth
Useful when…
- You need to pass a function
- You don’t want to have to re-write the function every time (the default behaviour of the function should be flexible)
For example, a single fixed binwidth of 2 does not suit all three facets, because their standard deviations differ
library(ggplot2)
sd <- c(1, 5, 15)
n <- 100
df <- data.frame(x = rnorm(3 * n, sd = sd), sd = rep(sd, n))
ggplot(df, aes(x)) +
  geom_histogram(binwidth = 2) +
  facet_wrap(~ sd, scales = "free_x") +
  labs(x = NULL)
We could just make a function…
binwidth_bins <- function(x) (max(x) - min(x)) / 20
ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = binwidth_bins) +
  facet_wrap(~ sd, scales = "free_x") +
  labs(x = NULL)
But if we want to change the number of bins (20) we’d have to re-write the function each time.
If we use a factory, we don’t have to do that.
binwidth_bins <- function(n) {
  force(n)
  function(x) (max(x) - min(x)) / n
}
ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = binwidth_bins(20)) +
  facet_wrap(~ sd, scales = "free_x") +
  labs(x = NULL, title = "20 bins")
ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = binwidth_bins(5)) +
  facet_wrap(~ sd, scales = "free_x") +
  labs(x = NULL, title = "5 bins")
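To see what the factory hands to geom_histogram(), it can help to call the returned functions directly, without plotting. The sketch below repeats the factory so it runs on its own; the names bw20 and bw5 and the two-value vector x are illustrative only.

```r
# Same factory as in the notes, repeated so this block is self-contained
binwidth_bins <- function(n) {
  force(n)
  function(x) (max(x) - min(x)) / n
}

# Each call to the factory returns a new function that remembers its own n
bw20 <- binwidth_bins(20)
bw5 <- binwidth_bins(5)

x <- c(0, 10)  # illustrative data with a range of 10

bw20(x)  # 10 / 20 = 0.5
bw5(x)   # 10 / 5  = 2

# n is stored in the enclosing environment of the returned function
environment(bw20)$n  # 20
```

Each returned function computes the binwidth from whatever data it is given, so every facet gets a width scaled to its own range.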
The Box-Cox example shows a similar benefit
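The notes do not show the Box-Cox code, so the following is only a minimal sketch of how such a factory might look, using the standard Box-Cox transformation. The name boxcox_factory is hypothetical. As with binwidth_bins(), the parameter (here lambda) is fixed once, and the returned function takes only the data.

```r
# Hypothetical factory: returns a Box-Cox transformation for a fixed lambda
boxcox_factory <- function(lambda) {
  force(lambda)
  if (lambda == 0) {
    # By convention the lambda = 0 case is the log transform
    function(x) log(x)
  } else {
    function(x) (x^lambda - 1) / lambda
  }
}

log_tf <- boxcox_factory(0)
sqrt_tf <- boxcox_factory(0.5)

log_tf(1)   # log(1) = 0
sqrt_tf(4)  # (4^0.5 - 1) / 0.5 = 2
```

Because the check on lambda happens inside the factory, it runs once when the transformation is created rather than on every call.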