1.1 Structured Data

  • Software classifies data by type.
    • Numeric (continuous or discrete)
    • Categorical (binary, ordinal, or neither, i.e. unordered with more than two levels)
  • Rectangular data (rows and columns, like a spreadsheet or database table) = typical frame of reference for data science.
    • Called a data.frame in R
    • Rows are records (aka observations, cases, instances)
    • Columns are features (aka variables, attributes, predictors in some cases)
    • Lots of synonyms in stats and data science for the same things.
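
A minimal sketch of the ideas above, in plain Python rather than R's data.frame so it runs without extra packages. The records, column names, and type labels are all made up for illustration; in practice a data frame library would carry the type information for you.

```python
# Hypothetical data: each dict is one row (record), each key is one column (feature).
records = [
    {"age": 34, "height_cm": 171.5, "smoker": False, "education": "bachelor", "city": "Lyon"},
    {"age": 51, "height_cm": 162.0, "smoker": True, "education": "high school", "city": "Oslo"},
    {"age": 27, "height_cm": 180.3, "smoker": False, "education": "master", "city": "Lima"},
]

# Data type of each column, labeled by hand (illustrative, not inferred).
column_types = {
    "age": "numeric (discrete)",
    "height_cm": "numeric (continuous)",
    "smoker": "categorical (binary)",
    "education": "categorical (ordinal)",
    "city": "categorical (neither binary nor ordinal)",
}

# Rectangular means every record has exactly the same set of features.
columns = list(column_types)
is_rectangular = all(set(row) == set(columns) for row in records)
n_rows, n_cols = len(records), len(columns)

# Ordinal categories have a meaningful order, so sorting by them makes sense.
education_order = ["high school", "bachelor", "master"]
sorted_by_education = sorted(
    records, key=lambda row: education_order.index(row["education"])
)

print(f"{n_rows} records x {n_cols} features, rectangular: {is_rectangular}")
for name in columns:
    print(f"{name:>10}: {column_types[name]}")
```

Sorting works for "education" because its levels are ordered; the same operation on "city" would only give alphabetical order, which carries no meaning for the data.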