10.5 Overfitting

Overfitting can arise because of:

  • hyperparameter overfitting
  • feature selection overfitting

Solutions:

  • The solution to hyperparameter overfitting is to evaluate the tuning parameters on data that were not used to estimate the model parameters (via validation or assessment sets).

  • The solution to feature selection overfitting is to apply a resampling process. Two cases need to be distinguished:

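    • feature selection is external to the resampling: the features are chosen once using all of the training data, so the held-out samples have already influenced the selection and the performance estimate is optimistically biased
    • feature selection is inside the resampling:
      • the process provides a more realistic estimate of predictive performance
      • the computational burden increases, because the selection is repeated within every resample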
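
The difference between the two cases can be illustrated with a small experiment. The sketch below is not from the original material: it assumes Python with scikit-learn, a synthetic data set of pure noise predictors, and arbitrary choices for the filter (SelectKBest with an F-test), the model (logistic regression), and the sizes n_samples, n_features, and k. Because the labels are unrelated to the predictors, an honest accuracy estimate should sit near chance level.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic data (an assumption for illustration): noise predictors only,
# so no feature carries real information about the class label.
rng = np.random.default_rng(0)
n_samples, n_features, k = 50, 5000, 20
X = rng.normal(size=(n_samples, n_features))
y = np.array([0, 1] * (n_samples // 2))  # balanced labels, unrelated to X

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Case 1: feature selection EXTERNAL to the resampling.
# The filter sees every sample, including those later held out in each fold.
selector = SelectKBest(f_classif, k=k).fit(X, y)
X_selected = selector.transform(X)
external_acc = cross_val_score(
    LogisticRegression(max_iter=1000), X_selected, y, cv=cv
).mean()

# Case 2: feature selection INSIDE the resampling.
# The pipeline refits the filter on the analysis portion of every fold,
# which costs one extra selection step per resample.
pipeline = make_pipeline(
    SelectKBest(f_classif, k=k),
    LogisticRegression(max_iter=1000),
)
internal_acc = cross_val_score(pipeline, X, y, cv=cv).mean()

print(f"external selection accuracy: {external_acc:.2f}")
print(f"internal selection accuracy: {internal_acc:.2f}")
```

Under these assumptions the external estimate should come out well above the internal one, even though there is nothing to learn. The gap is the selection bias that resampling the whole process is meant to expose. Wrapping the filter and the model in a single pipeline is one convenient way to keep the selection inside each resample; the same idea applies to any other selection method.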