Forward Stepwise Subset Selection (FsSS)
- Let \(\mathcal{M}_0\) denote the null model (no predictors)
- For \(k = 1, ..., p\):
  - Fit all \(p - (k - 1)\) models that add exactly one predictor to model \(\mathcal{M}_{k - 1}\)
  - Select the predictor that raises training \(R^2\) the most (equivalently, lowers RSS the most) and add it to model
    \(\mathcal{M}_{k - 1}\) to create model \(\mathcal{M}_k\)
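- Select the model among \(\mathcal{M}_0, ..., \mathcal{M}_p\) that
minimizes validation error (or another estimate of test error, such as cross-validated error). Training \(R^2\) cannot be used for this step because it never decreases as predictors are added.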
- When \(p = 20\), best subset selection requires fitting \(2^{20} = 1{,}048{,}576\)
models, whereas forward stepwise selection requires fitting only \(1 + p(p + 1)/2 = 211\) models.
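
The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes ordinary least squares fitted with NumPy, a single held-out validation set for the final choice, and synthetic data. The helper names (`design`, `fit_ols`, `mse`, `forward_stepwise`) are hypothetical and chosen only for this example. It also counts the models fitted, so the figure of 211 for \(p = 20\) can be checked directly.

```python
import numpy as np


def design(X, cols):
    """Design matrix: an intercept column plus the chosen predictor columns."""
    return np.column_stack([np.ones(X.shape[0])] + [X[:, j] for j in cols])


def fit_ols(X, y, cols):
    """Least-squares coefficients for the model using predictors `cols`."""
    beta, *_ = np.linalg.lstsq(design(X, cols), y, rcond=None)
    return beta


def mse(X, y, cols, beta):
    """Mean squared error of a fitted model on the data (X, y)."""
    resid = y - design(X, cols) @ beta
    return float(np.mean(resid ** 2))


def forward_stepwise(X_tr, y_tr, X_val, y_val):
    """Return the path M_0..M_p, the validation-selected model, and the fit count."""
    p = X_tr.shape[1]
    selected = []                          # predictors in M_{k-1}; starts as M_0
    beta = fit_ols(X_tr, y_tr, selected)   # null model: intercept only
    n_fits = 1
    path = [[]]
    val_err = [mse(X_val, y_val, selected, beta)]
    for k in range(1, p + 1):
        # The p - (k - 1) predictors not yet in the model.
        candidates = [j for j in range(p) if j not in selected]
        fits = {j: fit_ols(X_tr, y_tr, selected + [j]) for j in candidates}
        n_fits += len(candidates)
        # Lowest training MSE is the same choice as highest training R^2.
        best_j = min(candidates,
                     key=lambda j: mse(X_tr, y_tr, selected + [j], fits[j]))
        selected = selected + [best_j]     # this is M_k
        path.append(selected)
        val_err.append(mse(X_val, y_val, selected, fits[best_j]))
    # Final step: pick among M_0..M_p by validation error, not training R^2.
    best = path[int(np.argmin(val_err))]
    return path, best, n_fits


# Synthetic data (an assumption for the demo): only predictors 0 and 3 matter.
rng = np.random.default_rng(0)
p = 20
X = rng.normal(size=(400, p))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.5, size=400)
path, best, n_fits = forward_stepwise(X[:200], y[:200], X[200:], y[200:])

print("models fitted by forward stepwise:", n_fits)   # 211 when p = 20
print("models needed by best subset:", 2 ** p)        # 1048576 when p = 20
print("selected predictors:", sorted(best))
```

The greedy search keeps each model nested inside the next, which is why the fit count grows quadratically in \(p\) rather than exponentially. The trade-off is that the path is not guaranteed to contain the best model of each size.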