The problem of indirection explained
- To make the problem clearer, we can use a made-up data frame:
df <- tibble(
  mean_var = 1,
  group_var = "g",
  group = 1,
  x = 10,
  y = 100
)
df |>
  grouped_mean(group, x)
## # A tibble: 1 × 2
## group_var `mean(mean_var)`
## <chr> <dbl>
## 1 g 1
df |>
  grouped_mean(group, y)
## # A tibble: 1 × 2
## group_var `mean(mean_var)`
## <chr> <dbl>
## 1 g 1
- Regardless of how we call `grouped_mean()`, it always does `df |> group_by(group_var) |> summarize(mean(mean_var))` instead of `df |> group_by(group) |> summarize(mean(x))` or `df |> group_by(group) |> summarize(mean(y))`.
- This is a problem of indirection, and it arises because dplyr uses tidy evaluation to allow you to refer to the names of variables inside your data frame without any special treatment.
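
To reproduce the output above end to end, here is a minimal sketch. It assumes `grouped_mean()` is defined the naive way, passing its arguments straight to `group_by()` and `summarize()`. That definition is reconstructed from the behaviour shown here and may differ from the one used elsewhere in these notes. The names `res_x` and `res_y` are introduced only for illustration.

```r
library(dplyr)  # also provides tibble(); the native pipe |> needs R >= 4.1

# Assumed (naive) definition: the arguments are used without any special treatment.
grouped_mean <- function(df, group_var, mean_var) {
  df |>
    group_by(group_var) |>      # looked up as a column literally named "group_var"
    summarize(mean(mean_var))   # looked up as a column literally named "mean_var"
}

df <- tibble(
  mean_var = 1,
  group_var = "g",
  group = 1,
  x = 10,
  y = 100
)

# Hypothetical result names, used only to compare the two calls.
res_x <- grouped_mean(df, group, x)
res_y <- grouped_mean(df, group, y)

# The second and third arguments never influence the result.
identical(res_x, res_y)
```

Because dplyr looks for `group_var` and `mean_var` in the data frame first, the columns with those literal names are found before the function arguments are ever consulted, so the values passed as `group`, `x`, or `y` are ignored.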