Data leakage

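Data leakage occurs when information from outside the training data set is used to create the model. It often arises during data preprocessing: for example, if a step such as standardization is estimated from the full data set before resampling, the held-out observations influence the model that is later evaluated on them.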

Figure 3.10: Performing feature engineering preprocessing within each resample helps to minimize data leakage.
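
To make the idea of preprocessing within each resample concrete, the following is a minimal sketch in Python using only the standard library. It assumes a single numeric feature, a simple contiguous k-fold split, and standardization as the preprocessing step. The function names (k_fold_indices, standardize_within_resamples) and the sample values are hypothetical and chosen only for illustration; they are not taken from any particular package.

```python
from statistics import mean, pstdev


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds (hypothetical helper)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def standardize_within_resamples(values, k=3):
    """Standardize one numeric feature separately inside each resample.

    The mean and standard deviation are estimated from the training
    portion of each fold only, then applied to both the training and
    the held-out portions.
    """
    results = []
    for holdout_idx in k_fold_indices(len(values), k):
        held_out = set(holdout_idx)
        train = [v for i, v in enumerate(values) if i not in held_out]
        holdout = [values[i] for i in holdout_idx]

        mu = mean(train)
        sd = pstdev(train) or 1.0  # avoid dividing by zero for a constant feature

        results.append({
            "train_mean": mu,
            "train_sd": sd,
            "train_scaled": [(v - mu) / sd for v in train],
            "holdout_scaled": [(v - mu) / sd for v in holdout],
        })
    return results


# Illustrative data: the last value is an outlier that falls in the final fold.
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]

leaky_mean = mean(feature)  # estimated from all rows, including held-out ones
per_fold = standardize_within_resamples(feature, k=3)

# Compare each fold's training-only mean with the mean computed on all rows.
for fold_number, fold in enumerate(per_fold, start=1):
    print(fold_number, round(fold["train_mean"], 3), round(leaky_mean, 3))
```

In the final fold the outlier is held out, so the training-only mean is unaffected by it, whereas a mean computed once on the full data would already carry information about the held-out rows into the training step.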