1.5 Predicting ridership on Chicago trains

This data set is used throughout the book to predict the number of people entering a train station each day.

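library(modeldata)
library(dplyr)  # provides the %>% pipe, which modeldata does not export

# Show the first six rows of the Chicago ridership data
modeldata::Chicago %>% head()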
## # A tibble: 6 × 50
##   rider…¹ Austin Quinc…² Belmont Arche…³ Oak_P…⁴ Western Clark…⁵ Clinton Merch…⁶
##     <dbl>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
## 1   15.7   1.46    8.37     4.60   2.01    1.42     3.32   15.6    2.40    6.48 
## 2   15.8   1.50    8.35     4.72   2.09    1.43     3.34   15.7    2.40    6.48 
## 3   15.9   1.52    8.36     4.68   2.11    1.49     3.36   15.6    2.37    6.40 
## 4   15.9   1.49    7.85     4.77   2.17    1.44     3.36   15.7    2.42    6.49 
## 5   15.4   1.50    7.62     4.72   2.06    1.42     3.27   15.6    2.42    5.80 
## 6    2.42  0.693   0.911    2.27   0.624   0.426    1.11    2.41   0.814   0.858
## # … with 40 more variables: Irving_Park <dbl>, Washington_Wells <dbl>,
## #   Harlem <dbl>, Monroe <dbl>, Polk <dbl>, Ashland <dbl>, Kedzie <dbl>,
## #   Addison <dbl>, Jefferson_Park <dbl>, Montrose <dbl>, California <dbl>,
## #   temp_min <dbl>, temp <dbl>, temp_max <dbl>, temp_change <dbl>, dew <dbl>,
## #   humidity <dbl>, pressure <dbl>, pressure_change <dbl>, wind <dbl>,
## #   wind_max <dbl>, gust <dbl>, gust_max <dbl>, percip <dbl>, percip_max <dbl>,
## #   weather_rain <dbl>, weather_snow <dbl>, weather_cloud <dbl>, …
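Since the printout truncates most of the 50 columns, a quick look at the outcome on its own can help. The following is a minimal sketch, assuming the first column (abbreviated as `rider…` above) is named `ridership`, as it is in the `modeldata` package; the summary calls are illustrative and not part of the original notes.

```r
# Load the data set explicitly from modeldata.
data(Chicago, package = "modeldata")

# Number of rows (days) and columns (outcome plus predictors).
dim(Chicago)

# Distribution of the outcome: daily ridership at the target station.
summary(Chicago$ridership)

# List every column name, since the tibble printout truncates them.
names(Chicago)
```

Calling `names()` is a simple way to see the station, weather, and other predictor columns that the tibble printout abbreviates.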