9.4 Reducing Other Noise
There can also be systematic variation between bioreactors that comes from the measurement instruments rather than from differences in the amount of molecules being sampled. The graph comparing small and large bioreactors showed that although the peaks were in the same places, the intensities appeared higher for the large ones. Because the peaks matter most, we can rescale the data across reactors so that the results are comparable.
The author uses the standard normal variate (SNV) method from the spectroscopy literature, which rescales each spectrum using its mean and standard deviation. Since both statistics can be influenced by outliers, trimming is used to exclude extreme values before they are computed.
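To make the rescaling concrete, here is a minimal sketch of SNV with trimming in Python with NumPy. It is an illustration under stated assumptions, not the author's implementation: the function name `trimmed_snv`, the 5% trimming fraction, and the synthetic spectra are all hypothetical, and the trimming rule (dropping values outside the chosen quantiles before computing the mean and standard deviation) is one reasonable reading of "trimming".

```python
import numpy as np

def trimmed_snv(spectrum, trim=0.05):
    """Centre and scale one spectrum using a trimmed mean and standard deviation.

    `trim` is the fraction cut from each tail before the statistics are
    computed (0.05 is an illustrative choice, not a value from the text).
    """
    x = np.asarray(spectrum, dtype=float)
    lo, hi = np.quantile(x, [trim, 1.0 - trim])
    core = x[(x >= lo) & (x <= hi)]  # values kept after trimming
    return (x - core.mean()) / core.std(ddof=1)

# Synthetic example: two spectra with the same shape but different intensity,
# loosely mimicking a small and a large bioreactor.
wavelengths = np.linspace(0.0, 1.0, 200)
small = 1.0 + 0.5 * wavelengths + 2.0 * np.exp(-((wavelengths - 0.3) ** 2) / 0.002)
large = 4.0 * small  # same peaks, four times the intensity

small_snv = trimmed_snv(small)
large_snv = trimmed_snv(large)
```

After rescaling, `small_snv` and `large_snv` coincide, because SNV removes a pure difference in scale. Every value in the spectrum is still rescaled; trimming only changes which values feed into the mean and standard deviation.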
Figure 9.6 compares the profiles of the spectra with the lowest and highest variation for (a) the baseline-corrected data across all days and small-scale bioreactors. The figure shows that the amplitudes of the profiles can vary greatly.
Another source of noise is the intensity measurement at each wavelength within a spectrum. The author uses splines and moving averages to smooth the intensities. The key is to select an appropriate number of points to include in each moving average.
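The effect of the window size can be sketched with a simple centred moving average. This is an assumed, minimal version in Python with NumPy rather than the author's code: the function name `moving_average`, the edge-padding choice, the synthetic noisy spectrum, and the window sizes 5 and 51 are all hypothetical.

```python
import numpy as np

def moving_average(intensity, window=5):
    """Centred moving average over `window` points (must be a positive odd integer).

    The ends are padded by repeating the edge values so the output has the
    same length as the input; this padding rule is an illustrative choice.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    x = np.asarray(intensity, dtype=float)
    half = window // 2
    padded = np.pad(x, half, mode="edge")
    kernel = np.full(window, 1.0 / window)
    return np.convolve(padded, kernel, mode="valid")

# Synthetic noisy spectrum: one peak plus random measurement noise.
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.0, 1.0, 200)
clean = np.exp(-((wavelengths - 0.3) ** 2) / 0.002)
noisy = clean + rng.normal(0.0, 0.05, size=wavelengths.size)

light = moving_average(noisy, window=5)   # removes jitter, keeps the peak shape
heavy = moving_average(noisy, window=51)  # much smoother, but flattens the peak
```

A small window leaves more noise in the profile, while a large one starts to flatten the peaks that carry the signal, which is why the number of points has to be chosen with care.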