Slice functions

To extract specific rows within each group you can use the following functions:

  • df |> slice_head(n = 1) takes the first row from each group.
  • df |> slice_tail(n = 1) takes the last row in each group.
  • df |> slice_min(x, n = 1) takes the row with the smallest value of column x.
  • df |> slice_max(x, n = 1) takes the row with the largest value of column x.
  • df |> slice_sample(n = 1) takes one random row.
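
As a minimal sketch of how the five functions behave, the following uses a small made-up tibble; the names df, g, and x are hypothetical and not part of the flights data.

```r
library(dplyr)

# Hypothetical toy data: two groups ("a" and "b") with three rows each.
df <- tibble(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(3, 1, 2, 6, 4, 5)
)

by_g <- df |> group_by(g)

first_rows    <- by_g |> slice_head(n = 1)    # first row of each group
last_rows     <- by_g |> slice_tail(n = 1)    # last row of each group
smallest_rows <- by_g |> slice_min(x, n = 1)  # row with the smallest x per group
largest_rows  <- by_g |> slice_max(x, n = 1)  # row with the largest x per group
random_rows   <- by_g |> slice_sample(n = 1)  # one random row per group
```

Because df is grouped by g, each call returns one row per group here; the rows chosen by slice_sample() will differ from run to run unless you set a seed.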

You can vary n to select more than one row, or, instead of n, use prop (for example, prop = 0.1) to select a proportion of the rows in each group, here 10%. For example, the following code finds the flights that are most delayed upon arrival at each destination.

flights |> 
  group_by(dest) |> 
  slice_max(arr_delay, n = 1) |>
  relocate(dest)
## # A tibble: 108 × 19
## # Groups:   dest [105]
##    dest   year month   day dep_time sched_dep_time dep_delay arr_time
##    <chr> <int> <int> <int>    <int>          <int>     <dbl>    <int>
##  1 ABQ    2013     7    22     2145           2007        98      132
##  2 ACK    2013     7    23     1139            800       219     1250
##  3 ALB    2013     1    25      123           2000       323      229
##  4 ANC    2013     8    17     1740           1625        75     2042
##  5 ATL    2013     7    22     2257            759       898      121
##  6 AUS    2013     7    10     2056           1505       351     2347
##  7 AVL    2013     8    13     1156            832       204     1417
##  8 BDL    2013     2    21     1728           1316       252     1839
##  9 BGR    2013    12     1     1504           1056       248     1628
## 10 BHM    2013     4    10       25           1900       325      136
## # ℹ 98 more rows
## # ℹ 11 more variables: sched_arr_time <int>, arr_delay <dbl>, carrier <chr>,
## #   flight <int>, tailnum <chr>, origin <chr>, air_time <dbl>, distance <dbl>,
## #   hour <dbl>, minute <dbl>, time_hour <dttm>

Note that the output has 108 rows even though there are only 105 destinations. This is because slice_min() and slice_max() keep tied values, so n = 1 means "give us all rows with the highest value". If you want exactly one row per group, set with_ties = FALSE.
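
The effect of ties is easier to see on a small example. The sketch below again uses a made-up tibble (ties_df, g, and x are hypothetical names) in which group "a" has two rows tied for the largest x; it also shows prop in place of n.

```r
library(dplyr)

# Hypothetical toy data: group "a" has four rows, two of them tied
# for the largest x; group "b" has two rows.
ties_df <- tibble(
  g = c("a", "a", "a", "a", "b", "b"),
  x = c(5, 5, 1, 2, 2, 3)
)

# Default behaviour: tied rows are all kept, so group "a" contributes two rows.
keep_ties <- ties_df |>
  group_by(g) |>
  slice_max(x, n = 1)

# with_ties = FALSE returns exactly one row per group.
drop_ties <- ties_df |>
  group_by(g) |>
  slice_max(x, n = 1, with_ties = FALSE)

# prop = 0.5 takes the first half of each group: two rows from "a", one from "b".
first_half <- ties_df |>
  group_by(g) |>
  slice_head(prop = 0.5)
```

Here keep_ties has three rows for two groups, mirroring the 108 rows for 105 destinations above, while drop_ties has exactly two.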