22.7 Nonparametric regression and machine learning
Nonparametric regression - the regression curve is not constrained to follow any particular parametric form.
Machine learning typically describes nonparametric regression where the focus is on prediction rather than parameter estimation. Performance is often assessed on held-out 'test' data.
To avoid overfitting, nonparametric models use a variety of techniques to constrain the model. Tuning parameters (hyperparameters) govern the amount of constraint and are typically chosen by cross-validation.
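A minimal sketch of this tuning loop, using only numpy: a Nadaraya-Watson kernel smoother (a simple nonparametric regression) whose bandwidth `h` is chosen by k-fold cross-validation. All names and the simulated data here are illustrative, not from ROS.

```python
import numpy as np

def kernel_smooth(x_train, y_train, x_eval, h):
    # Gaussian-kernel weighted average of the training responses.
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / h) ** 2)
    return (w * y_train).sum(axis=1) / w.sum(axis=1)

def cv_mse(x, y, h, k=5, seed=0):
    # Mean squared prediction error on held-out folds.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    errs = []
    for fold in folds:
        mask = np.ones(len(x), bool)
        mask[fold] = False
        pred = kernel_smooth(x[mask], y[mask], x[fold], h)
        errs.append(np.mean((y[fold] - pred) ** 2))
    return np.mean(errs)

# Simulated data: a smooth curve plus noise.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(0, 0.3, 200)

# Small h = wiggly fit, large h = oversmoothed; CV picks a compromise.
best_h = min([0.1, 0.3, 1.0, 3.0], key=lambda h: cv_mse(x, y, h))
```

The same pattern (fit on k-1 folds, score on the held-out fold, pick the hyperparameter with the lowest average error) applies to any of the methods below.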
Some examples of nonparametric models:
Loess - locally weighted regression, tuned by the span (strength) of the local weight function.
Splines - piecewise-polynomial basis functions; tuning (the number and placement of knots, or a smoothness penalty) controls the 'local smoothness'.
Gaussian processes - a model based on the multivariate Gaussian distribution; tuning controls the correlation distance (length scale).
Tree models - decision trees and their ensembles are very powerful nonparametric models, especially gradient-boosted trees (e.g. XGBoost).
BART - Bayesian additive regression trees: a sum-of-trees model with priors over the tree structures and leaf values. For more, ROS recommends: Bayesian Additive Regression Trees: A Review and Look Forward
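To make the spline idea concrete, here is a hedged numpy-only sketch: a cubic regression spline fit by ordinary least squares on a truncated-power basis. The knot grid is the tuning choice (more knots = less smoothing); the basis, knot locations, and data are illustrative assumptions, not from ROS.

```python
import numpy as np

def spline_basis(x, knots):
    # Columns: 1, x, x^2, x^3, then (x - knot)^3 truncated at zero,
    # one column per knot - the truncated-power cubic spline basis.
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

# Simulated data: a smooth curve plus noise.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(0, 0.3, 200)

knots = np.linspace(1, 9, 8)              # interior knots (tuning parameter)
B = spline_basis(x, knots)
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ beta                          # the estimated regression curve
```

Because the model is linear in the basis coefficients, the fit is just least squares; all the flexibility comes from the basis itself.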
Many of these methods are covered or at least introduced in Introduction to Statistical Learning
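A Gaussian process can also be sketched in a few lines of numpy. This is an illustrative sketch, assuming a squared-exponential kernel and a known noise level; the length scale `ell` is the 'correlation distance' tuning parameter mentioned above.

```python
import numpy as np

def sq_exp_kernel(a, b, ell, sigma_f=1.0):
    # Squared-exponential covariance between the points in a and b.
    d2 = (a[:, None] - b[None, :]) ** 2
    return sigma_f**2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(x_train, y_train, x_new, ell, noise):
    # Posterior mean of a zero-mean GP with Gaussian observation noise.
    K = sq_exp_kernel(x_train, x_train, ell) + noise**2 * np.eye(len(x_train))
    K_star = sq_exp_kernel(x_new, x_train, ell)
    return K_star @ np.linalg.solve(K, y_train)

# Simulated data: a smooth curve plus noise.
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(0, 0.3, 200)

x_new = np.linspace(0.5, 9.5, 50)
pred = gp_predict(x, y, x_new, ell=1.0, noise=0.3)
```

A shorter `ell` lets the fitted curve wiggle more; a longer one borrows strength from distant points and smooths more aggressively, so `ell` would itself be chosen by cross-validation or marginal likelihood in practice.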