Benefits of Parquet

  • Smaller files than CSV (efficient encodings + compression)
  • Stores data types alongside the data (vs CSV, which stores everything as text & leaves the reader to guess types)
  • “Column-oriented” (“thinks” like a dataframe)
  • Splits data into chunks (“row groups”) that a reader can often skip entirely (faster reads)
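
To make two of these points concrete, here is a small sketch in Python using only the standard library. It does not read or write real Parquet files. The first part shows that a CSV round trip returns every value as text. The second part is a toy model of a column split into chunks that each carry min/max statistics, which is the general idea that lets a reader skip chunks. The names ColumnChunk, make_chunks and values_greater_than are hypothetical helpers invented for this illustration, not part of any Parquet library.

```python
import csv
import io
from dataclasses import dataclass

# --- Part 1: CSV keeps no type information ---------------------------------
# Write one typed row to an in-memory CSV, then read it back.
rows = [{"id": 1, "price": 9.5, "in_stock": True}]
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "price", "in_stock"])
writer.writeheader()
writer.writerows(rows)
buf.seek(0)
csv_row = next(csv.DictReader(buf))
# Every field is now a str; the int, float and bool types are gone.


# --- Part 2: toy model of skippable column chunks ---------------------------
# Assumption: this only mimics the idea of per-chunk min/max statistics.
# It is not the actual Parquet file layout.
@dataclass
class ColumnChunk:
    values: list
    min_value: int
    max_value: int


def make_chunks(values, chunk_size):
    """Split one column into chunks and record min/max for each chunk."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        part = values[start:start + chunk_size]
        chunks.append(ColumnChunk(part, min(part), max(part)))
    return chunks


def values_greater_than(chunks, threshold):
    """Return matching values and how many chunks had to be scanned."""
    matches = []
    chunks_read = 0
    for chunk in chunks:
        if chunk.max_value <= threshold:
            continue  # statistics prove nothing here can match: skip it
        chunks_read += 1
        matches.extend(v for v in chunk.values if v > threshold)
    return matches, chunks_read


year_column = list(range(2000, 2012))            # 12 sorted values
chunks = make_chunks(year_column, chunk_size=4)  # 3 chunks of 4 values
matches, chunks_read = values_greater_than(chunks, 2008)

print(csv_row)
print(matches, "found after scanning", chunks_read, "of", len(chunks), "chunks")
```

Skipping works best when the data is sorted or clustered on the filtered column, which is why the slide says chunks can "often" (not always) be skipped: if every chunk's min/max range overlaps the filter, all of them must still be read.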

But:

  • Not human-readable (binary format, so you need a tool or library to inspect it)