The challenge (1)

  • most real data is untidy: it is usually organised to facilitate some goal other than analysis (for example, ease of data entry)
  • so you often need to tidy the original data. This takes two steps:
    1. determine the underlying variables and observations
    2. pivot your data into a tidy form
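
The two steps above can be sketched in plain Python. This is a minimal illustration, not a definitive implementation: the table, its values, and the helper name `pivot_longer` are all made up for the example. Step 1 decides that the variables are country, year, and cases, and that one observation is one country in one year. Step 2 pivots the year columns into rows.

```python
# Hypothetical untidy table: one row per country, one column per year.
# The values are invented for illustration only.
wide = [
    {"country": "A", "1999": 745, "2000": 2666},
    {"country": "B", "1999": 37737, "2000": 80488},
]


def pivot_longer(rows, id_cols, names_to, values_to):
    """Turn every non-id column into its own (name, value) row."""
    tidy = []
    for row in rows:
        # Columns that identify the observation are copied onto every output row.
        ids = {col: row[col] for col in id_cols}
        for col, value in row.items():
            if col in id_cols:
                continue
            # The old column header becomes a value of the `names_to` variable.
            tidy.append({**ids, names_to: col, values_to: value})
    return tidy


tidy = pivot_longer(wide, id_cols=["country"], names_to="year", values_to="cases")
for record in tidy:
    print(record)
```

After pivoting, each row holds exactly one observation (one country in one year) and each column holds one variable. Note that the year is still a string here, because it started life as a column header.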