5.4 Continuous outcome data
For continuous outcome data (e.g. costs), a stratified random sampling approach would involve conducting a 80/20 split within each quartile and then pool the results together. For the Ames housing dataset, the call would look like this:
set.seed(123)
<- initial_split(ames, prop = 0.80, strata = Sale_Price)
ames_split <- training(ames_split)
ames_train <- testing(ames_split) ames_test