3.8 Proper implementation

We stated at the beginning of this chapter that we should think of feature engineering as creating a blueprint rather than manually performing each task individually. This helps us in two ways: (1) it forces us to think sequentially about the order of the steps, and (2) it lets us apply those steps appropriately within the resampling process.
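To illustrate the second point, here is a minimal sketch in plain Python (standard library only, not the recipes package) of applying a blueprint within resampling. It assumes a single standardization step and a simple unshuffled k-fold split. The function names (`fit_standardizer`, `apply_standardizer`, `k_fold_indices`) and the data are hypothetical and chosen only for illustration. The key idea is that the parameters are estimated on the training portion of each resample and then applied, unchanged, to the held-out portion.

```python
from statistics import mean, stdev

def fit_standardizer(train_values):
    """Estimate centering and scaling parameters from training values only."""
    return {"center": mean(train_values), "scale": stdev(train_values)}

def apply_standardizer(params, values):
    """Apply previously estimated parameters to any set of values."""
    return [(v - params["center"]) / params["scale"] for v in values]

def k_fold_indices(n, k):
    """Split row indices 0..n-1 into k contiguous folds (no shuffling, for clarity)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# Hypothetical values of a single numeric feature
x = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

results = []
for holdout in k_fold_indices(len(x), k=3):
    train = [x[i] for i in range(len(x)) if i not in holdout]
    test = [x[i] for i in holdout]
    # Parameters come from the training portion of this resample only
    params = fit_standardizer(train)
    # The same parameters are then applied to both portions
    results.append((apply_standardizer(params, train),
                    apply_standardizer(params, test)))
```

Had the mean and standard deviation been computed once on all of `x` before splitting, information from each held-out fold would have leaked into its own preprocessing.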

While your project’s needs may vary, here is a suggested order of potential steps that should work for most problems:

  1. Filter out zero or near-zero variance features.

  2. Perform imputation if required.

  3. Normalize to resolve numeric feature skewness.

  4. Standardize (center and scale) numeric features.

  5. Perform dimension reduction (e.g., PCA) on numeric features.

  6. One-hot or dummy encode categorical features.
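The ordering above can be sketched as a small blueprint that is estimated once on training data and then applied to new data. This is a simplified illustration in plain Python (standard library only), not the recipes package. It covers steps 1, 2, 4, and 6, uses a strict zero-variance filter and mean imputation as stand-ins, and omits steps 3 and 5 for brevity. The function names (`fit_blueprint`, `apply_blueprint`), column names, and data are hypothetical.

```python
from statistics import mean, stdev

def fit_blueprint(rows, numeric, categorical):
    """Estimate every step's parameters, in order, from training rows only."""
    # Step 1: keep numeric features with more than one distinct observed value
    kept = [c for c in numeric
            if len({r[c] for r in rows if r[c] is not None}) > 1]
    # Step 2: mean imputation values for the remaining numeric features
    impute = {c: mean(r[c] for r in rows if r[c] is not None) for c in kept}
    # Step 4: centering and scaling parameters, computed on the imputed values
    filled = {c: [impute[c] if r[c] is None else r[c] for r in rows]
              for c in kept}
    center = {c: mean(filled[c]) for c in kept}
    scale = {c: stdev(filled[c]) for c in kept}
    # Step 6: categorical levels observed in the training data
    levels = {c: sorted({r[c] for r in rows}) for c in categorical}
    return {"kept": kept, "impute": impute, "center": center,
            "scale": scale, "levels": levels}

def apply_blueprint(bp, rows):
    """Apply the estimated steps, in the same order, to any rows."""
    out = []
    for r in rows:
        new = {}
        for c in bp["kept"]:
            v = bp["impute"][c] if r[c] is None else r[c]
            new[c] = (v - bp["center"][c]) / bp["scale"][c]
        for c, lvls in bp["levels"].items():
            for lvl in lvls:
                # A level unseen in training yields all zeros for that feature
                new[f"{c}_{lvl}"] = 1 if r[c] == lvl else 0
        out.append(new)
    return out

# Hypothetical training data: "floors" is constant, "area" has a missing value
train = [
    {"area": 100.0, "floors": 1.0, "zone": "A"},
    {"area": None,  "floors": 1.0, "zone": "B"},
    {"area": 300.0, "floors": 1.0, "zone": "A"},
]
new_rows = [{"area": 250.0, "floors": 2.0, "zone": "B"}]

blueprint = fit_blueprint(train, numeric=["area", "floors"], categorical=["zone"])
processed = apply_blueprint(blueprint, new_rows)
```

Because the constant feature is removed first, no later step has to estimate a scale of zero, and because encoding happens last, the indicator columns are not centered or scaled along with the numeric features.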

For further guidance, see the tidymodels recipes article on the ordering of steps (https://recipes.tidymodels.org/articles/Ordering.html).