10.4 Irrelevant features
Predictably, how much irrelevant predictors hurt a model depends on:
- type of model
- nature of the predictors
- ratio of the size of the training set to the number of predictors
The simulation system from Sapp et al. (2014) is an example of a nonlinear function of 20 predictors:
\[\begin{aligned}y ={}& x_1+\sin(x_2)+\log(|x_3|)+x_4^2+x_5x_6+I(x_7x_8x_9<0)+I(x_{10}>0)\\ &+x_{11}I(x_{11}>0)+\sqrt{|x_{12}|}+\cos(x_{13})+2x_{14}+|x_{15}|\\ &+I(x_{16}< -1)+I(x_{17}< -1)-2x_{18}-x_{19}x_{20}+\epsilon\end{aligned}\]
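Each \(x_i\) is generated as an independent standard normal random variable, and the error \(\epsilon\) is drawn from a normal distribution, \(\epsilon\sim N(0,3)\). Here \(I(\cdot)\) is the indicator function, equal to 1 when its condition holds and 0 otherwise.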
To study the effect of irrelevant features, between 10 and 200 extra predictors are then added to the data.
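The data-generating process described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the source. The function name `simulate_sapp` and its parameters are hypothetical, the 3 in \(N(0,3)\) is assumed to be a standard deviation, and the extra predictors are assumed to be independent standard normal draws with no connection to \(y\).

```python
import math
import random


def simulate_sapp(n, n_noise=0, error_sd=3.0, seed=None):
    """Return (X, y) with n rows: 20 relevant predictors plus n_noise extra ones.

    Hypothetical helper; follows the equation as written in these notes.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        # x[1]..x[20] mirror the subscripts in the equation; x[0] is unused.
        x = [0.0] + [rng.gauss(0.0, 1.0) for _ in range(20)]
        # Boolean comparisons act as the indicator I(.) (True -> 1, False -> 0).
        signal = (
            x[1] + math.sin(x[2]) + math.log(abs(x[3])) + x[4] ** 2 + x[5] * x[6]
            + (x[7] * x[8] * x[9] < 0) + (x[10] > 0)
            + x[11] * (x[11] > 0) + math.sqrt(abs(x[12])) + math.cos(x[13])
            + 2 * x[14] + abs(x[15])
            + (x[16] < -1) + (x[17] < -1) - 2 * x[18] - x[19] * x[20]
        )
        # Assumption: extra predictors are standard normal and unrelated to y.
        extra = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(x[1:] + extra)
        # Assumption: the 3 in N(0, 3) is the standard deviation of the error.
        y.append(signal + rng.gauss(0.0, error_sd))
    return X, y
```

For example, `X, y = simulate_sapp(500, n_noise=100, seed=1)` yields 500 rows with 120 columns, of which only the first 20 influence `y`. Varying `n_noise` from 10 to 200 reproduces the range of extra predictors mentioned above.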
source: FES Selection Simulation