Parquet > CSV

  • Slow: Manipulating large CSV datasets with {readr}
  • Faster: Manipulating large CSV datasets with {arrow}
  • Much faster: Manipulating large Parquet datasets with {arrow}
    • Data partitioned across multiple files, so a query reads only the files it needs
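
The three approaches above can be sketched in R as follows. This is a minimal illustration, not a benchmark: the tiny data frame, the file paths, and the column names (year, carrier, delay) are all hypothetical stand-ins, and any speed difference only becomes visible on genuinely large data. It also assumes {dplyr} is installed alongside {readr} and {arrow}.

```r
library(readr)
library(arrow)
library(dplyr)

# Hypothetical paths and toy data, used only for illustration.
csv_path <- file.path(tempdir(), "flights.csv")
pq_dir   <- file.path(tempdir(), "flights_parquet")

df <- data.frame(
  year    = rep(c(2022L, 2023L), each = 3),
  carrier = c("AA", "UA", "AA", "UA", "AA", "UA"),
  delay   = c(5, 12, 7, 3, 20, 9)
)
write_csv(df, csv_path)

# Slow: {readr} parses the whole CSV into memory before any filtering.
slow <- read_csv(csv_path, show_col_types = FALSE) %>%
  filter(year == 2023) %>%
  group_by(carrier) %>%
  summarise(mean_delay = mean(delay)) %>%
  arrange(carrier)

# Faster: {arrow} opens the CSV lazily and evaluates the query on collect().
faster <- open_dataset(csv_path, format = "csv") %>%
  filter(year == 2023) %>%
  group_by(carrier) %>%
  summarise(mean_delay = mean(delay)) %>%
  collect() %>%
  arrange(carrier)

# Much faster: Parquet dataset split into one directory per year.
write_dataset(df, pq_dir, format = "parquet", partitioning = "year")

fastest <- open_dataset(pq_dir) %>%
  filter(year == 2023) %>%   # only the year=2023 files need to be read
  group_by(carrier) %>%
  summarise(mean_delay = mean(delay)) %>%
  collect() %>%
  arrange(carrier)
```

All three pipelines use the same {dplyr} verbs; only the storage format and the reader change. Partitioning by year is an assumption made for this sketch; in practice the partition column would be whichever one queries filter on most often.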