Forward Stepwise Subset Selection (FsSS)

  1. Let \(\mathcal{M}_0\) denote the null model (no predictors)
  2. For \(k = 1, ..., p\):
  • Fit all \(p - (k - 1)\) models that augment \(\mathcal{M}_{k - 1}\) with one additional predictor not already in it
  • Select the predictor that raises the training \(R^2\) the most (equivalently, lowers the RSS the most) and add it to model \(\mathcal{M}_{k - 1}\) to create model \(\mathcal{M}_k\)
  3. Select the model among \(\mathcal{M}_0, ..., \mathcal{M}_p\) that minimizes validation error (or some estimate of it)
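  • When \(p = 20\), best subset selection requires fitting \(2^{20}\) = 1,048,576 models, whereas forward stepwise selection requires fitting only \(1 + \sum_{k=1}^{p} (p - (k - 1)) = 1 + p(p + 1)/2\) = 211 models (the null model plus \(p - (k - 1)\) candidates at each step \(k\)).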
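
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: it assumes ordinary least squares with an intercept, uses NumPy only, and takes a single held-out validation set as the error estimate in the final step. The function names (`design`, `forward_stepwise`, `select_model`) are hypothetical and chosen for this example. The sketch also counts how many models it fits, so the count of 211 for \(p = 20\) can be checked directly.

```python
import numpy as np


def design(X, cols):
    """Design matrix: an intercept column plus the chosen predictor columns."""
    return np.column_stack([np.ones(X.shape[0])] + [X[:, j] for j in cols])


def training_r2(X, y, cols):
    """Training R^2 of the least-squares fit of y on the predictors in cols."""
    A = design(X, cols)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ coef) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss


def forward_stepwise(X, y):
    """Return the nested models M_0, ..., M_p and the number of models fitted."""
    p = X.shape[1]
    selected = []          # predictors in the current model M_{k-1}
    path = [[]]            # M_0 is the null model
    n_fits = 1             # count the null model as one fit
    for k in range(1, p + 1):
        remaining = [j for j in range(p) if j not in selected]
        # Fit the p - (k - 1) candidate models, each adding one predictor.
        scores = {j: training_r2(X, y, selected + [j]) for j in remaining}
        n_fits += len(remaining)
        best = max(scores, key=scores.get)   # largest gain in training R^2
        selected = selected + [best]
        path.append(list(selected))          # this is M_k
    return path, n_fits


def validation_mse(X_tr, y_tr, X_val, y_val, cols):
    """Fit on the training split, then score on the validation split."""
    coef, *_ = np.linalg.lstsq(design(X_tr, cols), y_tr, rcond=None)
    return np.mean((y_val - design(X_val, cols) @ coef) ** 2)


def select_model(X_tr, y_tr, X_val, y_val):
    """Final step: pick the model on the path with the lowest validation error."""
    path, n_fits = forward_stepwise(X_tr, y_tr)
    errors = [validation_mse(X_tr, y_tr, X_val, y_val, cols) for cols in path]
    return path[int(np.argmin(errors))], path, n_fits
```

Calling `select_model(X_tr, y_tr, X_val, y_val)` returns the chosen predictor indices, the full path \(\mathcal{M}_0, ..., \mathcal{M}_p\), and the fit count. Training \(R^2\) is only used to compare models of the same size within a step; comparing models of different sizes is left to the validation error, since training \(R^2\) never decreases as predictors are added.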