9.5 Exploiting Correlation
The previous two chapters were about reducing the noise, which allows us to make better predictions. However, this does not help us deal with the correlation between features, in this case the correlation between wavelengths within a spectrum.
One approach to dealing with correlation is principal component analysis (PCA), which removes the correlation between features but is not guaranteed to improve predictive power, because the response is not taken into account when the components are constructed.
Figure 9.8: PCA dimension reduction applied across all small-scale data. (a) A scree plot of the cumulative variability explained across components. (b) Scatterplots of Glucose and the first three principal components.
Using PCA, 80% of the variation can be explained with just 11 components, as seen in the scree plot above. That said, each component contains little predictive power for the response. For this reason, partial least squares, which uses the response when constructing the components, would likely have been a better choice.
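The scree-plot calculation can be sketched as follows. This is a minimal illustration, not the book's actual analysis: the real spectra are not available here, so a synthetic matrix of correlated features stands in for them, and the 80% threshold is the only number carried over from the text.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_samples, n_wavelengths = 100, 50

# Synthetic stand-in for the spectra: a few latent factors plus noise,
# which gives highly correlated columns, as with neighboring wavelengths.
latent = rng.normal(size=(n_samples, 5))
loadings = rng.normal(size=(5, n_wavelengths))
X = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_wavelengths))

pca = PCA().fit(X)
cum_var = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches 80%.
n_80 = int(np.searchsorted(cum_var, 0.80) + 1)
print(n_80, round(cum_var[n_80 - 1], 3))
```

Because the synthetic data has only five latent factors, very few components are needed here; the point is the mechanics of reading the component count off the cumulative-variance curve.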
A second approach to dealing with the correlation is to take the first-order derivative within each profile. To compute first-order differentiation, the response at the \((p-1)^{st}\) value in the profile is subtracted from the response at the \(p^{th}\) value. This difference represents the rate of change of the response between consecutive measurements in the profile. Larger changes correspond to larger movements and could potentially be related to the signal in the response. This means that the autocorrelation between profiles should be greatly reduced.
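The differencing step above amounts to a one-line array operation; a minimal sketch, using a small made-up profile:

```python
import numpy as np

# Hypothetical profile: responses ordered by measurement index.
profile = np.array([2.0, 2.5, 3.5, 3.0, 4.0])

# First-order derivative: each entry is the p-th response minus the
# (p-1)-st response, so the result has one fewer value than the profile.
first_deriv = np.diff(profile)
print(first_deriv)
```

The derivative profile is one element shorter than the original, so any downstream model sees one fewer feature per profile.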
We can see in Figure 9.9 that using the first-order derivatives dramatically reduces the autocorrelation beyond a lag of three days.
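The claim that differencing removes autocorrelation can be checked numerically. The sketch below uses a synthetic random walk (a stand-in for a drifting profile, not the book's data) and a simple sample-autocorrelation helper; both are assumptions for illustration only.

```python
import numpy as np

def acf(x, lag):
    """Sample autocorrelation of a 1-D series at the given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

rng = np.random.default_rng(1)

# A random walk is strongly autocorrelated at every lag; its first
# differences are white noise, so differencing should collapse the
# autocorrelation toward zero.
walk = np.cumsum(rng.normal(size=500))
diffed = np.diff(walk)

print(round(acf(walk, 3), 3), round(acf(diffed, 3), 3))
```

The first value is close to 1 and the second is close to 0, mirroring the before/after comparison shown in the figure.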