3.7 Multiple Linear Regression

Multiple linear regression extends simple linear regression to \(p\) predictors:

\[Y = \beta_{0} + \beta_{1}X_1 + \beta_{2}X_2 + \dots + \beta_{p}X_p + \epsilon\]

  • \(\beta_{j}\) is the average effect on \(Y\) of a one-unit increase in \(X_{j}\), holding all other predictors fixed.

  • Fitting once again amounts to choosing the \(\beta_{j}\) that minimize the RSS; see the least-squares sketch after this list.

  • An example in the book shows that although fitting sales against newspaper alone indicates a significant slope (0.055 ± 0.017), once radio is included in a multiple regression, newspaper no longer has a significant effect (-0.001 ± 0.006).
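As a concrete illustration, here is a minimal least-squares sketch in Python on synthetic data loosely patterned after the Advertising example (the variable names and all coefficients below are made up for illustration, not taken from the book's data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the Advertising data: newspaper is correlated
# with radio but has no direct effect on sales (coefficients arbitrary).
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = 0.7 * radio + rng.normal(0, 5, n)
sales = 3.0 + 0.05 * tv + 0.2 * radio + rng.normal(0, 1.5, n)

# Design matrix with an intercept column; least squares picks the
# coefficients that minimize the RSS.
X = np.column_stack([np.ones(n), tv, radio, newspaper])
beta_hat, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(beta_hat)   # the newspaper coefficient comes out near zero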

3.7.1 Important Questions

  1. Is at least one of the predictors \(X_1\), \(X_2\), … , \(X_p\) useful in predicting the response?

    The F-statistic is close to 1 when there is no relationship between the response and the predictors, and greater than 1 otherwise.

\[F = \frac{(TSS-RSS)/p}{RSS/(n-p-1)}\]
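Continuing the synthetic sketch above, the F-statistic falls straight out of the RSS and TSS (SciPy is used only for the tail probability):

```python
from scipy import stats

p = 3                                        # number of predictors
rss = np.sum((sales - X @ beta_hat) ** 2)
tss = np.sum((sales - sales.mean()) ** 2)

f_stat = ((tss - rss) / p) / (rss / (n - p - 1))
p_value = stats.f.sf(f_stat, p, n - p - 1)   # P(F_{p, n-p-1} > f_stat)
print(f"F = {f_stat:.1f}, p-value = {p_value:.3g}")
```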

  2. Do all the predictors help to explain \(Y\), or is only a subset of the predictors useful?

    p-values can help identify important predictors, but they can be misleading, especially with a large number of predictors. Variable selection methods include forward selection, backward selection, and mixed selection; a bare-bones forward-selection sketch follows below. The topic is continued in Chapter 6.
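Continuing the sketch, a rough illustration of forward selection: greedily add whichever remaining predictor most reduces the RSS. This shows only the ordering step; since the RSS always falls when a variable is added, a real stopping rule would use p-values or the validation-based criteria of Chapter 6.

```python
def rss_of(cols):
    """RSS of the least-squares fit on the intercept plus the given columns."""
    Xs = X[:, [0] + cols]
    b, *_ = np.linalg.lstsq(Xs, sales, rcond=None)
    return np.sum((sales - Xs @ b) ** 2)

remaining, selected = [1, 2, 3], []          # columns for tv, radio, newspaper
while remaining:
    # Greedily add the predictor that most reduces the RSS.
    best = min(remaining, key=lambda j: rss_of(selected + [j]))
    selected.append(best)
    remaining.remove(best)
    print(f"step {len(selected)}: add column {best}, RSS = {rss_of(selected):.1f}")
```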

  3. How well does the model fit the data?

    \(R^2\) still gives the proportion of variance explained, so look for values “close” to 1. One can also look at the RSE, which generalizes to multiple regression as:

\[RSE = \sqrt{\frac{1}{n-p-1}RSS}\]
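Continuing the sketch, both statistics follow directly from the RSS and TSS already computed:

```python
r_squared = 1 - rss / tss
rse = np.sqrt(rss / (n - p - 1))
print(f"R^2 = {r_squared:.3f}, RSE = {rse:.3f}")
```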

  4. Given a set of predictor values, what response value should we predict, and how accurate is our prediction?

    Three sources of uncertainty in predictions (the first and last are compared in the sketch after this list):

    • Uncertainty in the coefficient estimates \(\beta_j\) (the reducible error)
    • Model bias
    • Irreducible error \(\epsilon\)
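To see the first and last sources side by side, continuing the sketch: a confidence interval for the mean response at a hypothetical point \(x_0\) versus the wider prediction interval for an individual response, under the usual normal-error assumptions. Note that neither interval accounts for model bias.

```python
from scipy import stats

x0 = np.array([1.0, 150.0, 25.0, 20.0])     # intercept + hypothetical budgets
var_fit = rse**2 * (x0 @ np.linalg.inv(X.T @ X) @ x0)  # from estimating the betas
t = stats.t.ppf(0.975, n - p - 1)
fit = x0 @ beta_hat

# 95% confidence interval for the *average* response at x0 ...
ci = (fit - t * np.sqrt(var_fit), fit + t * np.sqrt(var_fit))
# ... versus the wider prediction interval, which also carries epsilon.
pi = (fit - t * np.sqrt(var_fit + rse**2), fit + t * np.sqrt(var_fit + rse**2))
print(ci, pi)
```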