8.5 Linear Regression
Linear regression lets us explore the relationship between a quantitative response variable and an explanatory variable while holding the other explanatory variables in the model constant.
Below is a model that predicts home price, the response variable, from the following explanatory variables: lot size (acres), age (years), land value (\(\$\)), living area (\(ft^2\)), number of bedrooms, number of bathrooms, and whether or not the home is on the waterfront.
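Written out, the model has roughly the following form. The \(\beta\) and \(\varepsilon\) symbols are our own notation, not part of the R output, and the indicator term assumes R codes the waterfront variable with "Yes" as the reference level, as the coefficient name in the output suggests.

\[
\text{price} = \beta_0 + \beta_1\,\text{lotSize} + \beta_2\,\text{age} + \beta_3\,\text{landValue} + \beta_4\,\text{livingArea} + \beta_5\,\text{bedrooms} + \beta_6\,\text{bathrooms} + \beta_7\,\mathbf{1}(\text{waterfront} = \text{No}) + \varepsilon
\]

Each slope \(\beta_j\) is the expected change in price for a one-unit increase in that variable when the other variables stay fixed.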
data(SaratogaHouses, package="mosaicData")
houses_lm <- lm(price ~ lotSize + age + landValue +
                  livingArea + bedrooms + bathrooms +
                  waterfront,
                data = SaratogaHouses)
summary(houses_lm)$coefficients
##                   Estimate   Std. Error   t value     Pr(>|t|)
## (Intercept)   1.398788e+05 1.647293e+04  8.491436 4.345065e-17
## lotSize       7.500792e+03 2.075136e+03  3.614604 3.094673e-04
## age          -1.360401e+02 5.415794e+01 -2.511914 1.209876e-02
## landValue     9.093072e-01 4.583046e-02 19.840672 4.716289e-79
## livingArea    7.517866e+01 4.158113e+00 18.079993 4.954903e-67
## bedrooms     -5.766760e+03 2.388433e+03 -2.414454 1.586262e-02
## bathrooms     2.454711e+04 3.332268e+03  7.366487 2.705486e-13
## waterfrontNo -1.207266e+05 1.560083e+04 -7.738475 1.703303e-14
We estimate that each additional square foot of living area is associated with an increase in home price of about \(\$75\), holding the other variables in the model constant.
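One way to see what "holding the other variables constant" means is to compare the model's predictions for two homes that are identical except for one square foot of living area. The sketch below is illustrative only: `home_a`, `home_b`, and `price_diff` are names we made up, and using the first row of the data as the baseline home is an arbitrary choice.

```r
data(SaratogaHouses, package = "mosaicData")
houses_lm <- lm(price ~ lotSize + age + landValue +
                  livingArea + bedrooms + bathrooms +
                  waterfront,
                data = SaratogaHouses)

# Take any home as a baseline (here, arbitrarily, the first row)
home_a <- SaratogaHouses[1, ]

# Copy it and add one square foot of living area; nothing else changes
home_b <- home_a
home_b$livingArea <- home_a$livingArea + 1

# The difference in predicted price equals the livingArea coefficient
price_diff <- predict(houses_lm, newdata = home_b) -
  predict(houses_lm, newdata = home_a)
unname(price_diff)
coef(houses_lm)["livingArea"]
```

Because the model is linear with no interaction terms, the difference should be the same whichever baseline home is chosen.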
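We estimate that a waterfront home costs approximately \(\$120,726\) more than a non-waterfront home, again controlling for the other variables in the model. The coefficient is labeled waterfrontNo and is negative because R uses "Yes" as the reference level, so it measures how much less a non-waterfront home costs than a comparable waterfront home.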
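If you would rather read the waterfront effect directly as a premium, one option is to make "No" the reference level before fitting. The sketch below assumes `waterfront` is stored as a factor (the coefficient name waterfrontNo suggests it is). The names `houses` and `houses_relevel` are ours.

```r
data(SaratogaHouses, package = "mosaicData")

# Work on a copy so the original data frame is left untouched
houses <- SaratogaHouses

# Make "No" the reference level of the waterfront factor
houses$waterfront <- relevel(houses$waterfront, ref = "No")

houses_relevel <- lm(price ~ lotSize + age + landValue +
                       livingArea + bedrooms + bathrooms +
                       waterfront,
                     data = houses)

# The coefficient is now named waterfrontYes: same magnitude, opposite sign
coef(houses_relevel)["waterfrontYes"]
```

Releveling changes only the intercept and the sign and label of the waterfront coefficient. The other slopes and the fitted values stay the same.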