Putting the pieces together

The first step is to define your blueprint (aka recipe). You supply the formula of interest (the target variable, the features, and the data they are based on) with recipe() and then sequentially add feature engineering steps with step_xxx(). For example, the following defines Sale_Price as the target variable, uses all remaining columns of ames_train as features, and then applies four steps:

  1. Remove near-zero variance features that are categorical (aka nominal).

  2. Ordinal encode our quality-based features (which are inherently ordinal).

  3. Center and scale (i.e., standardize) all numeric features.

  4. One-hot encode our remaining categorical features.

blueprint <- recipe(Sale_Price ~ ., data = ames_train) %>% 
     step_nzv(all_nominal()) %>% 
     step_integer(matches("Qual|Cond|QC|Qu")) %>% 
     step_normalize(all_numeric(), -all_outcomes()) %>% 
     step_dummy(all_nominal(), -all_outcomes(), one_hot = TRUE)

blueprint
## 
## ── Recipe ──────────────────────────────────────────────────────────────────────
## 
## ── Inputs
## Number of variables by role
## outcome:    1
## predictor: 80
## 
## ── Operations
## • Sparse, unbalanced variable filter on: all_nominal()
## • Integer encoding for: matches("Qual|Cond|QC|Qu")
## • Centering and scaling for: all_numeric(), -all_outcomes()
## • Dummy variables from: all_nominal(), -all_outcomes()

Next, we need to train this blueprint on some training data with prep(). Remember, many feature engineering steps (e.g., standardization and PCA) estimate parameters from the data, and estimating them on the test data would create data leakage. So in this step we estimate these parameters using only the training data of interest.
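
To make the separation between estimating and applying concrete, the following is a minimal sketch that uses the built-in mtcars data rather than the Ames data. The split and all object names (toy_train, toy_test, toy_blueprint, toy_prepped) are hypothetical and used only for illustration. prep() estimates the centering and scaling parameters from the training rows, and bake() then applies those same parameters to any data set.

```r
library(recipes)

# Hypothetical train/test split of the built-in mtcars data (illustration only)
set.seed(123)
train_idx <- sample(nrow(mtcars), size = 24)
toy_train <- mtcars[train_idx, ]
toy_test  <- mtcars[-train_idx, ]

# A small blueprint: standardize every numeric predictor
toy_blueprint <- recipe(mpg ~ ., data = toy_train) %>%
  step_normalize(all_numeric(), -all_outcomes())

# Estimate the means and standard deviations from the training data only
toy_prepped <- prep(toy_blueprint, training = toy_train)

# Apply the training-based parameters to both data sets
baked_train <- bake(toy_prepped, new_data = toy_train)
baked_test  <- bake(toy_prepped, new_data = toy_test)

# The training column is centered at zero; the test column generally is not
mean(baked_train$hp)
mean(baked_test$hp)
```

Because the parameters come from toy_train alone, the baked training column has a mean of zero (up to floating point error), while the baked test column generally does not. Nothing about the test rows has leaked into the preprocessing.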

Lastly, we can apply our blueprint to new data (e.g., the training data or future test data) with bake().

blueprint %>% 
     prep() %>% 
     bake(new_data = ames_train) %>% 
     glimpse()
## Rows: 2,049
## Columns: 220
## $ Lot_Frontage                                          <dbl> -1.13208589, -1.…
## $ Lot_Area                                              <dbl> -1.04082032, -1.…
## $ Condition_1                                           <dbl> -0.04804813, -0.…
## $ Overall_Qual                                          <dbl> -0.77351309, -0.…
## $ Overall_Cond                                          <dbl> -0.4967950, -0.4…
## $ Year_Built                                            <dbl> -0.01478954, -0.…
## $ Year_Remod_Add                                        <dbl> -0.63615965, -0.…
## $ Mas_Vnr_Area                                          <dbl> 2.08710198, 1.48…
## $ Exter_Qual                                            <dbl> 0.6724948, 0.672…
## $ Exter_Cond                                            <dbl> 0.377337, 0.3773…
## $ Bsmt_Qual                                             <dbl> 1.07938607, 1.07…
## $ BsmtFin_SF_1                                          <dbl> 0.8154852, 1.262…
## $ BsmtFin_SF_2                                          <dbl> -0.2907006, -0.2…
## $ Bsmt_Unf_SF                                           <dbl> -0.75734818, -0.…
## $ Total_Bsmt_SF                                         <dbl> -1.18297005, -1.…
## $ Heating_QC                                            <dbl> 1.4517086, 1.451…
## $ First_Flr_SF                                          <dbl> -1.60448826, -1.…
## $ Second_Flr_SF                                         <dbl> 0.072573874, 0.0…
## $ Low_Qual_Fin_SF                                       <dbl> -0.1010541, -0.1…
## $ Gr_Liv_Area                                           <dbl> -0.79558772, -0.…
## $ Bsmt_Full_Bath                                        <dbl> -0.8225024, -0.8…
## $ Bsmt_Half_Bath                                        <dbl> -0.2463384, -0.2…
## $ Full_Bath                                             <dbl> -1.029885, -1.02…
## $ Half_Bath                                             <dbl> 1.2456257, 1.245…
## $ Bedroom_AbvGr                                         <dbl> 0.1637802, 0.163…
## $ Kitchen_AbvGr                                         <dbl> -0.2134855, -0.2…
## $ Kitchen_Qual                                          <dbl> 0.9051278, 0.905…
## $ TotRms_AbvGrd                                         <dbl> -0.2964932, -0.2…
## $ Fireplaces                                            <dbl> -0.9212202, -0.9…
## $ Fireplace_Qu                                          <dbl> -0.07498851, -0.…
## $ Garage_Cars                                           <dbl> -1.0205957, -1.0…
## $ Garage_Area                                           <dbl> -0.71662117, -0.…
## $ Garage_Qual                                           <dbl> 0.3203747, 0.320…
## $ Garage_Cond                                           <dbl> 0.2904161, 0.290…
## $ Wood_Deck_SF                                          <dbl> -0.74509691, -0.…
## $ Open_Porch_SF                                         <dbl> -0.69687402, -0.…
## $ Enclosed_Porch                                        <dbl> -0.3513871, -0.3…
## $ Three_season_porch                                    <dbl> -0.1067458, -0.1…
## $ Screen_Porch                                          <dbl> -0.2886199, -0.2…
## $ Pool_Area                                             <dbl> -0.06496428, -0.…
## $ Misc_Val                                              <dbl> -0.08126785, -0.…
## $ Mo_Sold                                               <dbl> -1.16165655, -1.…
## $ Year_Sold                                             <dbl> 1.67927, 1.67927…
## $ Sale_Condition                                        <dbl> 0.217264, -0.689…
## $ Longitude                                             <dbl> 0.6035963, 0.613…
## $ Latitude                                              <dbl> 0.95162362, 0.95…
## $ Sale_Price                                            <int> 105500, 88000, 1…
## $ MS_SubClass_One_Story_1946_and_Newer_All_Styles       <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_One_Story_1945_and_Older                  <dbl> 0, 0, 0, 0, 1, 0…
## $ MS_SubClass_One_Story_with_Finished_Attic_All_Ages    <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_One_and_Half_Story_Unfinished_All_Ages    <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_One_and_Half_Story_Finished_All_Ages      <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Two_Story_1946_and_Newer                  <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Two_Story_1945_and_Older                  <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Two_and_Half_Story_All_Ages               <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Split_or_Multilevel                       <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Split_Foyer                               <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Duplex_All_Styles_and_Ages                <dbl> 0, 0, 0, 0, 0, 1…
## $ MS_SubClass_One_Story_PUD_1946_and_Newer              <dbl> 0, 0, 1, 1, 0, 0…
## $ MS_SubClass_One_and_Half_Story_PUD_All_Ages           <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Two_Story_PUD_1946_and_Newer              <dbl> 1, 1, 0, 0, 0, 0…
## $ MS_SubClass_PUD_Multilevel_Split_Level_Foyer          <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_SubClass_Two_Family_conversion_All_Styles_and_Ages <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_Zoning_Floating_Village_Residential                <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_Zoning_Residential_High_Density                    <dbl> 0, 0, 0, 0, 1, 0…
## $ MS_Zoning_Residential_Low_Density                     <dbl> 0, 0, 1, 1, 0, 0…
## $ MS_Zoning_Residential_Medium_Density                  <dbl> 1, 1, 0, 0, 0, 1…
## $ MS_Zoning_A_agr                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_Zoning_C_all                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ MS_Zoning_I_all                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Lot_Shape_Regular                                     <dbl> 1, 1, 1, 1, 1, 1…
## $ Lot_Shape_Slightly_Irregular                          <dbl> 0, 0, 0, 0, 0, 0…
## $ Lot_Shape_Moderately_Irregular                        <dbl> 0, 0, 0, 0, 0, 0…
## $ Lot_Shape_Irregular                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Lot_Config_Corner                                     <dbl> 0, 0, 0, 0, 1, 0…
## $ Lot_Config_CulDSac                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Lot_Config_FR2                                        <dbl> 0, 0, 1, 0, 0, 0…
## $ Lot_Config_FR3                                        <dbl> 0, 0, 0, 0, 0, 0…
## $ Lot_Config_Inside                                     <dbl> 1, 1, 0, 1, 0, 1…
## $ Neighborhood_North_Ames                               <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_College_Creek                            <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Old_Town                                 <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Edwards                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Somerset                                 <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Northridge_Heights                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Gilbert                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Sawyer                                   <dbl> 0, 0, 0, 0, 0, 1…
## $ Neighborhood_Northwest_Ames                           <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Sawyer_West                              <dbl> 0, 0, 0, 1, 1, 0…
## $ Neighborhood_Mitchell                                 <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Brookside                                <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Crawford                                 <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Iowa_DOT_and_Rail_Road                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Timberland                               <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Northridge                               <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Stone_Brook                              <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_South_and_West_of_Iowa_State_University  <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Clear_Creek                              <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Meadow_Village                           <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Briardale                                <dbl> 1, 1, 0, 0, 0, 0…
## $ Neighborhood_Bloomington_Heights                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Veenker                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Northpark_Villa                          <dbl> 0, 0, 1, 0, 0, 0…
## $ Neighborhood_Blueste                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Greens                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Green_Hills                              <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Landmark                                 <dbl> 0, 0, 0, 0, 0, 0…
## $ Neighborhood_Hayden_Lake                              <dbl> 0, 0, 0, 0, 0, 0…
## $ Bldg_Type_OneFam                                      <dbl> 0, 0, 0, 0, 1, 0…
## $ Bldg_Type_TwoFmCon                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Bldg_Type_Duplex                                      <dbl> 0, 0, 0, 0, 0, 1…
## $ Bldg_Type_Twnhs                                       <dbl> 1, 1, 1, 0, 0, 0…
## $ Bldg_Type_TwnhsE                                      <dbl> 0, 0, 0, 1, 0, 0…
## $ House_Style_One_and_Half_Fin                          <dbl> 0, 0, 0, 0, 0, 1…
## $ House_Style_One_and_Half_Unf                          <dbl> 0, 0, 0, 0, 0, 0…
## $ House_Style_One_Story                                 <dbl> 0, 0, 1, 1, 1, 0…
## $ House_Style_SFoyer                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ House_Style_SLvl                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ House_Style_Two_and_Half_Fin                          <dbl> 0, 0, 0, 0, 0, 0…
## $ House_Style_Two_and_Half_Unf                          <dbl> 0, 0, 0, 0, 0, 0…
## $ House_Style_Two_Story                                 <dbl> 1, 1, 0, 0, 0, 0…
## $ Roof_Style_Flat                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Roof_Style_Gable                                      <dbl> 1, 1, 1, 1, 1, 1…
## $ Roof_Style_Gambrel                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Roof_Style_Hip                                        <dbl> 0, 0, 0, 0, 0, 0…
## $ Roof_Style_Mansard                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Roof_Style_Shed                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_AsbShng                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_AsphShn                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_BrkComm                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_BrkFace                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_CBlock                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_CemntBd                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_HdBoard                                  <dbl> 1, 1, 0, 0, 0, 0…
## $ Exterior_1st_ImStucc                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_MetalSd                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_Plywood                                  <dbl> 0, 0, 1, 1, 0, 0…
## $ Exterior_1st_PreCast                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_Stone                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_Stucco                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_1st_VinylSd                                  <dbl> 0, 0, 0, 0, 0, 1…
## $ Exterior_1st_Wd.Sdng                                  <dbl> 0, 0, 0, 0, 1, 0…
## $ Exterior_1st_WdShing                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_AsbShng                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_AsphShn                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_Brk.Cmn                                  <dbl> 0, 0, 1, 0, 0, 0…
## $ Exterior_2nd_BrkFace                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_CBlock                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_CmentBd                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_HdBoard                                  <dbl> 1, 0, 0, 0, 0, 0…
## $ Exterior_2nd_ImStucc                                  <dbl> 0, 1, 0, 0, 0, 0…
## $ Exterior_2nd_MetalSd                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_Other                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_Plywood                                  <dbl> 0, 0, 0, 1, 0, 0…
## $ Exterior_2nd_PreCast                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_Stone                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_Stucco                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Exterior_2nd_VinylSd                                  <dbl> 0, 0, 0, 0, 0, 1…
## $ Exterior_2nd_Wd.Sdng                                  <dbl> 0, 0, 0, 0, 1, 0…
## $ Exterior_2nd_Wd.Shng                                  <dbl> 0, 0, 0, 0, 0, 0…
## $ Mas_Vnr_Type_BrkCmn                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Mas_Vnr_Type_BrkFace                                  <dbl> 1, 1, 0, 0, 0, 0…
## $ Mas_Vnr_Type_CBlock                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Mas_Vnr_Type_None                                     <dbl> 0, 0, 1, 1, 1, 1…
## $ Mas_Vnr_Type_Stone                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Foundation_BrkTil                                     <dbl> 0, 0, 0, 0, 1, 0…
## $ Foundation_CBlock                                     <dbl> 1, 1, 1, 1, 0, 0…
## $ Foundation_PConc                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Foundation_Slab                                       <dbl> 0, 0, 0, 0, 0, 1…
## $ Foundation_Stone                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Foundation_Wood                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Bsmt_Exposure_Av                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Bsmt_Exposure_Gd                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Bsmt_Exposure_Mn                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Bsmt_Exposure_No                                      <dbl> 1, 1, 1, 1, 1, 0…
## $ Bsmt_Exposure_No_Basement                             <dbl> 0, 0, 0, 0, 0, 1…
## $ BsmtFin_Type_1_ALQ                                    <dbl> 0, 0, 0, 1, 0, 0…
## $ BsmtFin_Type_1_BLQ                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ BsmtFin_Type_1_GLQ                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ BsmtFin_Type_1_LwQ                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ BsmtFin_Type_1_No_Basement                            <dbl> 0, 0, 0, 0, 0, 1…
## $ BsmtFin_Type_1_Rec                                    <dbl> 1, 0, 0, 0, 0, 0…
## $ BsmtFin_Type_1_Unf                                    <dbl> 0, 1, 1, 0, 1, 0…
## $ Central_Air_N                                         <dbl> 0, 0, 0, 0, 1, 0…
## $ Central_Air_Y                                         <dbl> 1, 1, 1, 1, 0, 1…
## $ Electrical_FuseA                                      <dbl> 0, 0, 0, 0, 1, 0…
## $ Electrical_FuseF                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Electrical_FuseP                                      <dbl> 0, 0, 0, 0, 0, 0…
## $ Electrical_Mix                                        <dbl> 0, 0, 0, 0, 0, 0…
## $ Electrical_SBrkr                                      <dbl> 1, 1, 1, 1, 0, 1…
## $ Electrical_Unknown                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Type_Attchd                                    <dbl> 0, 0, 1, 1, 0, 1…
## $ Garage_Type_Basment                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Type_BuiltIn                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Type_CarPort                                   <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Type_Detchd                                    <dbl> 1, 1, 0, 0, 1, 0…
## $ Garage_Type_More_Than_Two_Types                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Type_No_Garage                                 <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Finish_Fin                                     <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Finish_No_Garage                               <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Finish_RFn                                     <dbl> 0, 0, 0, 0, 0, 0…
## $ Garage_Finish_Unf                                     <dbl> 1, 1, 1, 1, 1, 1…
## $ Paved_Drive_Dirt_Gravel                               <dbl> 0, 0, 0, 0, 0, 0…
## $ Paved_Drive_Partial_Pavement                          <dbl> 0, 0, 0, 0, 0, 0…
## $ Paved_Drive_Paved                                     <dbl> 1, 1, 1, 1, 1, 1…
## $ Fence_Good_Privacy                                    <dbl> 0, 0, 0, 0, 0, 0…
## $ Fence_Good_Wood                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Fence_Minimum_Privacy                                 <dbl> 0, 0, 0, 1, 0, 0…
## $ Fence_Minimum_Wood_Wire                               <dbl> 0, 0, 0, 0, 0, 0…
## $ Fence_No_Fence                                        <dbl> 1, 1, 1, 0, 1, 1…
## $ Sale_Type_COD                                         <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_Con                                         <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_ConLD                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_ConLI                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_ConLw                                       <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_CWD                                         <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_New                                         <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_Oth                                         <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_VWD                                         <dbl> 0, 0, 0, 0, 0, 0…
## $ Sale_Type_WD.                                         <dbl> 1, 1, 1, 1, 1, 1…
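
When we resample, the same logic applies within every fold: the blueprint should be prepared on that resample's training rows and then baked on both the training and held-out rows. As a rough sketch of what this looks like by hand, assume the built-in mtcars data, a simple 5-fold split, and lm() as a stand-in model. All object names here are hypothetical, and this is not caret's internal implementation.

```r
library(recipes)

set.seed(123)
toy_data <- mtcars
k_folds  <- 5

# Randomly assign each row to one of k_folds folds
fold_id <- sample(rep(seq_len(k_folds), length.out = nrow(toy_data)))

toy_blueprint <- recipe(mpg ~ ., data = toy_data) %>%
  step_normalize(all_numeric(), -all_outcomes())

fold_rmse <- sapply(seq_len(k_folds), function(i) {
  analysis   <- toy_data[fold_id != i, ]
  assessment <- toy_data[fold_id == i, ]

  # Re-estimate preprocessing parameters on this resample's training rows only
  prepped          <- prep(toy_blueprint, training = analysis)
  baked_analysis   <- bake(prepped, new_data = analysis)
  baked_assessment <- bake(prepped, new_data = assessment)

  # Any model could go here; lm() keeps the sketch free of extra packages
  fit  <- lm(mpg ~ wt + hp, data = baked_analysis)
  pred <- predict(fit, newdata = baked_assessment)
  sqrt(mean((baked_assessment$mpg - pred)^2))
})

# Average held-out RMSE across the folds
mean(fold_rmse)
```

Writing this loop for every model and every tuning value quickly becomes tedious, which is why it helps to let a modeling framework handle it.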

With the blueprint defined, we can apply the same resampling method and hyperparameter search grid as we did in Section 2.7. The only difference is that when we train our resampled models with train(), we supply our blueprint as the first argument, and caret then prepares and bakes the blueprint within each resample.

# Specify resampling plan
cv <- trainControl(
  method = "repeatedcv", 
  number = 10, 
  repeats = 5
)

# Construct grid of hyperparameter values
hyper_grid <- expand.grid(k = seq(2, 25, by = 1))

# Tune a knn model using grid search
knn_fit2 <- train(
  blueprint, 
  data = ames_train, 
  method = "knn", 
  trControl = cv, 
  tuneGrid = hyper_grid,
  metric = "RMSE"
)

# Print model results
knn_fit2
## k-Nearest Neighbors 
## 
## 2049 samples
##   80 predictor
## 
## Recipe steps: nzv, integer, normalize, dummy 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1844, 1845, 1843, 1844, 1844, 1845, ... 
## Resampling results across tuning parameters:
## 
##   k   RMSE      Rsquared   MAE     
##    2  35718.34  0.8083251  22732.57
##    3  34931.41  0.8187706  21941.28
##    4  34794.68  0.8230095  21713.76
##    5  34607.51  0.8275359  21484.67
##    6  34395.72  0.8319064  21252.76
##    7  34139.63  0.8356526  21067.31
##    8  33929.29  0.8394225  20957.91
##    9  33733.30  0.8426560  20865.28
##   10  33719.86  0.8432838  20868.48
##   11  33833.91  0.8431627  20917.45
##   12  33894.13  0.8437289  21003.84
##   13  33998.35  0.8432144  21105.27
##   14  34142.54  0.8427766  21195.34
##   15  34208.80  0.8428515  21256.55
##   16  34306.03  0.8422919  21308.34
##   17  34428.02  0.8417635  21386.85
##   18  34572.65  0.8406348  21454.44
##   19  34611.91  0.8407819  21510.62
##   20  34700.15  0.8404275  21572.50
##   21  34833.19  0.8395586  21635.17
##   22  34822.01  0.8402366  21656.58
##   23  34904.71  0.8399106  21702.15
##   24  35003.25  0.8394481  21769.57
##   25  35061.56  0.8392900  21788.96
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 10.

# Plot cross validation results
ggplot(knn_fit2)

Looking at our results, we see that the best model was associated with k = 10, which resulted in a cross-validated RMSE of about 33,720.
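
As a quick check on this reading of the output, the small sketch below copies the RMSE values for k = 8 through 12 from the printed results into a hypothetical data frame (cv_results) and picks the row with the smallest RMSE.

```r
# RMSE values copied from the printed results for k = 8 through 12
cv_results <- data.frame(
  k    = 8:12,
  RMSE = c(33929.29, 33733.30, 33719.86, 33833.91, 33894.13)
)

# Row with the smallest cross-validated RMSE
best <- cv_results[which.min(cv_results$RMSE), ]
best
```

This returns the row for k = 10. With the fitted object itself, the same information is stored in knn_fit2$bestTune and knn_fit2$results.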