Grouping by multiple variables

You can create groups using more than one variable. For example, to make a group for each date, group by year, month, and day:

daily <- flights |>
  group_by(year, month, day)
daily
## # A tibble: 336,776 × 19
## # Groups:   year, month, day [365]
##     year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
##    <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
##  1  2013     1     1      517            515         2      830            819
##  2  2013     1     1      533            529         4      850            830
##  3  2013     1     1      542            540         2      923            850
##  4  2013     1     1      544            545        -1     1004           1022
##  5  2013     1     1      554            600        -6      812            837
##  6  2013     1     1      554            558        -4      740            728
##  7  2013     1     1      555            600        -5      913            854
##  8  2013     1     1      557            600        -3      709            723
##  9  2013     1     1      557            600        -3      838            846
## 10  2013     1     1      558            600        -2      753            745
## # ℹ 336,766 more rows
## # ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
## #   tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
## #   hour <dbl>, minute <dbl>, time_hour <dttm>
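
The [365] in the Groups header is the number of groups, one for each date in 2013. As a small sketch, dplyr's n_groups() and group_vars() report the same information programmatically. The toy tibble below is hypothetical data standing in for flights, so the example runs without the full dataset.

```r
library(dplyr)

# Hypothetical toy data (an assumption, not the real flights table):
# three flights spread over two dates.
toy <- tibble(
  year  = c(2013, 2013, 2013),
  month = c(1, 1, 1),
  day   = c(1, 1, 2)
)

by_day <- toy |>
  group_by(year, month, day)

# One group per distinct (year, month, day) combination.
n_day_groups <- n_groups(by_day)   # returns 2

# The grouping variables, in the order they were supplied.
day_group_vars <- group_vars(by_day)   # returns c("year", "month", "day")
```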

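When you summarize a tibble grouped by more than one variable, each summary peels off the last grouping variable: here the result is still grouped by year and month, but no longer by day. dplyr displays a message that says so and tells you how to change this behavior: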

daily_flights <- daily |> 
  summarize(n = n())
## `summarise()` has grouped output by 'year', 'month'. You can override using the
## `.groups` argument.
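
To see the peeling for yourself, you can compare the grouping variables before and after summarizing. This is a minimal sketch on hypothetical toy data (the toy tibble is an assumption standing in for flights); group_vars() is the dplyr helper that lists the current grouping variables.

```r
library(dplyr)

# Hypothetical toy data (an assumption, not the real flights table).
toy <- tibble(
  year  = c(2013, 2013, 2013),
  month = c(1, 1, 1),
  day   = c(1, 1, 2)
)

by_day <- toy |>
  group_by(year, month, day)

before <- group_vars(by_day)   # returns c("year", "month", "day")

# One output row per date; the last grouping variable (day) is dropped
# from the grouping of the result.
counts <- by_day |>
  summarize(n = n())

after <- group_vars(counts)    # returns c("year", "month")
```

Because the result is still grouped by year and month, a second summarize() on counts would peel off month in the same way.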

If you are happy with this behavior, you can explicitly request it to suppress the message (it is an informational message, not a warning):

daily_flights <- daily |> 
  summarize(
    n = n(), 
    .groups = "drop_last"
  )

Alternatively, you can change the default behavior by setting .groups to a different value: "drop" removes all grouping from the result, and "keep" preserves the same groups as the input.
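
The two alternatives can be compared side by side. The sketch below again uses a hypothetical toy tibble (an assumption standing in for flights) and checks the grouping of each result with group_vars() and is_grouped_df().

```r
library(dplyr)

# Hypothetical toy data (an assumption, not the real flights table).
toy <- tibble(
  year  = c(2013, 2013, 2013),
  month = c(1, 1, 1),
  day   = c(1, 1, 2)
)

by_day <- toy |>
  group_by(year, month, day)

# "drop": the result is an ordinary, ungrouped tibble.
dropped <- by_day |>
  summarize(n = n(), .groups = "drop")

# "keep": the result has the same grouping variables as the input.
kept <- by_day |>
  summarize(n = n(), .groups = "keep")

dropped_is_grouped <- is_grouped_df(dropped)   # returns FALSE
kept_vars <- group_vars(kept)                  # returns c("year", "month", "day")
```

Both results contain the same rows and counts; only the grouping metadata differs, which matters for any summarize() or mutate() you apply afterwards.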