22.10 Rewriting the Seattle library data (1)

  • Here, we partition by CheckoutYear
  • Use dplyr::group_by() to define the partitions, then save them to a directory with arrow::write_dataset()
pq_path <- "data/seattle-library-checkouts"
# this may take a while, but it makes later work faster
seattle_csv |>
  group_by(CheckoutYear) |>
  write_dataset(path = pq_path, format = "parquet")
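To see what the partitioning produces without waiting on the full data, here is a minimal sketch on a toy data frame. The names `toy` and `toy_path` and the values in the data frame are made up for illustration; only `group_by()`, `write_dataset()`, and `list.files()` are real functions. It assumes the arrow and dplyr packages are installed.

```r
library(arrow)
library(dplyr)

# Hypothetical toy data standing in for the Seattle checkouts
toy <- data.frame(
  CheckoutYear = c(2020L, 2020L, 2021L, 2022L),
  Checkouts    = c(3L, 1L, 7L, 2L)
)

# Write to a temporary directory so nothing in data/ is touched
toy_path <- file.path(tempdir(), "toy-checkouts")

toy |>
  group_by(CheckoutYear) |>
  write_dataset(path = toy_path, format = "parquet")

# Each distinct CheckoutYear gets its own subdirectory,
# named CheckoutYear=<value> (Hive-style partitioning)
toy_files <- list.files(toy_path, recursive = TRUE)
toy_files
```

Because the partition value is encoded in the directory name, a later filter on CheckoutYear can skip the files for years it does not need.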