4.1 Weighted Regression

Weighted regression gives certain records (observations, i.e. rows of the data) more or less weight when fitting the regression model. To show this, I will use the Ames housing data from the {modeldata} package from tidymodels and give more weight to the sale prices of houses sold more recently than to those sold earlier in these data.
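For reference, the standard weighted least squares criterion can be written as below. The symbols are notation I am introducing here: $w_i$ is the weight of record $i$, $y_i$ its response, $x_i$ its vector of predictors, and $\beta$ the coefficient vector.

```latex
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} w_i \left( y_i - x_i^{\top} \beta \right)^2
```

With every $w_i = 1$ this reduces to ordinary least squares, and a record with $w_i = 0$ contributes nothing to the fit.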

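library(dplyr)      # provides %>%, select(), mutate()
library(modeldata)  # provides the ames data set

dat <- ames %>%
  dplyr::select(Lot_Area, Neighborhood, Year_Sold, First_Flr_SF, Second_Flr_SF,
         Bsmt_Full_Bath, Full_Bath, Half_Bath, Bsmt_Half_Bath, Sale_Price,
         Bedroom_AbvGr, Central_Air, Bldg_Type) %>%
  # weight: years since 2006, so later sales count more (2006 sales get weight 0)
  # total_sf: combined first- and second-floor area
  # bath: full baths plus half baths counted as 0.5 each
  dplyr::mutate(weight = Year_Sold - 2006,
         total_sf = First_Flr_SF + Second_Flr_SF,
         bath = Bsmt_Full_Bath + Full_Bath + 0.5*Half_Bath + 0.5*Bsmt_Half_Bath)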


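# Unweighted (ordinary least squares) fit
house_lm <- lm(Sale_Price ~ total_sf + Lot_Area + bath + Bedroom_AbvGr + Central_Air,
               data = dat)
# Weighted fit: same formula, each sale weighted by the weight column
house_wt <- lm(Sale_Price ~ total_sf + Lot_Area + bath + Bedroom_AbvGr + Central_Air,
               data = dat, weights = weight)
# Put the coefficients of the two fits side by side
round(cbind(house_lm = house_lm$coefficients,
            house_wt = house_wt$coefficients), digits = 3)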
##                 house_lm   house_wt
## (Intercept)    -4804.696  -3943.203
## total_sf         104.742    104.705
## Lot_Area           0.605      0.673
## bath           25411.278  24754.042
## Bedroom_AbvGr -25438.440 -27218.685
## Central_AirY   41924.976  45938.683
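One way to build intuition for what the weights do is the sketch below. It is not part of the analysis above: it uses the built-in mtcars data rather than Ames, and the weights (4-cylinder cars count three times, all others once) are made up purely for illustration. With whole-number weights, the weighted fit should give the same coefficients as an unweighted fit on data where each row is repeated as many times as its weight.

```r
# Minimal sketch on the built-in mtcars data (not the Ames data).
# Hypothetical weights: 4-cylinder cars count 3x, all other cars 1x.
w <- ifelse(mtcars$cyl == 4, 3, 1)

# Weighted fit via the weights argument of lm()
fit_wt <- lm(mpg ~ wt + hp, data = mtcars, weights = w)

# Same idea by hand: repeat each row w times, then fit an unweighted model
dup <- mtcars[rep(seq_len(nrow(mtcars)), times = w), ]
fit_dup <- lm(mpg ~ wt + hp, data = dup)

# Compare the coefficients of the two fits (up to numerical tolerance)
all.equal(coef(fit_wt), coef(fit_dup))
```

Only the coefficients match. The standard errors differ, because the duplicated data pretends to contain more observations than it really does, so I would use the `weights` argument rather than row duplication in practice.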