PCA
The key idea of PCA is eigendecomposition of the covariance matrix:
- Find its eigenvectors and eigenvalues.
- Sort eigenvalues from largest to smallest.
- Truncate at some number \(p < N\) of eigenvalues (and keep their corresponding eigenvectors)
- Project the predictor vectors (\(\mathbf{X}\)) onto this smaller basis
\[ \boldsymbol{\alpha}^{(n)} = \mathbf{U}_p^{\mathsf{T}} \mathbf{x}^{(n)} \]
where \(\mathbf{U}_p\) is the \(N \times p\) matrix whose columns are the retained eigenvectors (the smaller basis).
This compresses the data into a smaller set of predictors.
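
As a concrete illustration, here is a minimal NumPy sketch of the steps above. The function name `pca_project` and the toy data are illustrative, not from the text; the sketch also centers the predictors before computing the covariance, as is conventional.

```python
import numpy as np

def pca_project(X, p):
    """Project rows of X (n_samples x N) onto the top-p principal components."""
    Xc = X - X.mean(axis=0)               # center the predictors (conventional)
    C = np.cov(Xc, rowvar=False)          # N x N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]     # sort eigenvalues largest -> smallest
    U_p = eigvecs[:, order[:p]]           # keep the p leading eigenvectors (N x p)
    alpha = Xc @ U_p                      # alpha^(n) = U_p^T x^(n), all n at once
    return alpha, U_p

# Example: compress 5 correlated predictors down to 2 components
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)  # make two columns correlated
alpha, U_p = pca_project(X, p=2)
print(alpha.shape)  # (200, 2)
```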
Note there are some limitations of PCA:
- PCA fails when the underlying structure of the raw data is not orthogonal, since the basis it returns is constrained to be orthogonal.
- The basis vectors returned by PCA are not interpretable.
- PCA does not necessarily return the most influential component for prediction, because it does not depend at all on the response variable (see the sketch below).
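
To make the last point concrete, here is a hypothetical example (my own construction, not from the text) in which the response depends only on a low-variance direction. Because PCA never sees \(y\), its leading component aligns with the high-variance noise direction, so truncating to one component would discard all predictive information.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
signal = rng.normal(scale=0.1, size=n)   # low-variance but fully predictive of y
noise = rng.normal(scale=10.0, size=n)   # high-variance, unrelated to y
X = np.column_stack([signal, noise])
y = 3.0 * signal                         # y depends only on the small direction

C = np.cov(X - X.mean(axis=0), rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
pc1 = eigvecs[:, np.argmax(eigvals)]     # first principal component
print(pc1)  # points (almost) along the noise axis, so keeping only
            # one component throws away the signal that determines y
```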