7.2 Statistical and Machine Learning Methods

  • Several pre-analysis steps are common to many methods

7.2.1 Exploratory Data Analysis

  • Aim is to understand the data
  • Descriptive statistics of central tendency (mean, median) and variation (standard deviation, range)
  • Basic plots of distributions / skewness (histograms)
  • Correlation plots
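
The steps above can be sketched with the Python standard library alone. This is a minimal illustration, not a prescribed workflow: the `income` and `age` values are invented toy data, and the helper names (`describe`, `skewness`, `histogram_counts`, `pearson`) are hypothetical. In practice a plotting library would draw the histogram and correlation plot; here only the underlying numbers are computed.

```python
import math
import statistics

# Hypothetical toy data, invented for illustration only.
income = [21, 23, 25, 26, 28, 30, 33, 40, 58, 120]  # right-skewed
age = [22, 25, 24, 30, 31, 35, 38, 45, 50, 61]


def describe(xs):
    """Central tendency and variation for one numeric variable."""
    return {
        "mean": statistics.fmean(xs),
        "median": statistics.median(xs),
        "stdev": statistics.stdev(xs),  # sample standard deviation
        "min": min(xs),
        "max": max(xs),
    }


def skewness(xs):
    """Moment-based skewness: the mean of the cubed z-scores."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)  # population standard deviation
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def histogram_counts(xs, bins=5):
    """Counts per equal-width bin, i.e. the numbers behind a histogram."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant data
    counts = [0] * bins
    for x in xs:
        i = min(int((x - lo) / width), bins - 1)  # maximum goes in the last bin
        counts[i] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length variables."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


print(describe(income))
print("skewness:", skewness(income))
print("histogram counts:", histogram_counts(income))
print("correlation(income, age):", pearson(income, age))
```

A mean well above the median, together with positive skewness, signals a long right tail, which is the kind of finding that motivates the transformations in the next subsection.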

7.2.2 Feature Engineering / Transforming Variables

  • Reducing skew (log or other transformation)
  • Encoding categorical variables as dummy (indicator) variables
  • Creating new predictor variables, interaction terms
  • Centering / scaling variables
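
Each of the transformations above can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions: the data are invented, the helper names (`log_transform`, `dummy_encode`, `center`, `standardize`) are hypothetical, `log1p` is just one possible skew-reducing transformation, and dropping the first category level is one common convention for avoiding redundant dummy columns.

```python
import math
import statistics

# Hypothetical toy data, invented for illustration only.
income = [21, 23, 25, 26, 28, 30, 33, 40, 58, 120]  # right-skewed
age = [22, 25, 24, 30, 31, 35, 38, 45, 50, 61]
region = ["north", "south", "west", "north", "south",
          "west", "north", "south", "west", "north"]


def log_transform(xs):
    """Reduce right skew with log(1 + x); requires every x > -1."""
    return [math.log1p(x) for x in xs]


def dummy_encode(values, prefix, drop_first=True):
    """One 0/1 column per category level, optionally dropping the first level."""
    levels = sorted(set(values))
    if drop_first:
        levels = levels[1:]  # the dropped level becomes the reference category
    return {
        f"{prefix}_{level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


def center(xs):
    """Subtract the mean so the variable has mean zero."""
    m = statistics.fmean(xs)
    return [x - m for x in xs]


def standardize(xs):
    """Center, then scale to unit sample standard deviation (z-scores)."""
    s = statistics.stdev(xs)
    return [x / s for x in center(xs)]


log_income = log_transform(income)
region_dummies = dummy_encode(region, "region")
age_z = standardize(age)
log_income_z = standardize(log_income)

# Interaction term: the product of two (here standardized) predictors.
age_x_income = [a * b for a, b in zip(age_z, log_income_z)]

print(sorted(region_dummies))
print(age_x_income[:3])
```

Centering the predictors before forming the product is a common choice because it tends to reduce the correlation between the interaction term and its component variables.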