Connection between PCA and linear regression

Is there a formal link between linear regression and PCA? The goal of PCA is to decompose a matrix into linear combinations of variables that contain most of the information in the matrix. Suppose for the sake of argument that we're doing PCA on an input matrix rather than its covariance matrix, and the columns $X_1, X_2, \dots, X_n$ of the matrix are the variables of interest. Then intuitively it seems that the PCA procedure is similar to a linear regression where one uses a linear combination of the variables to predict the entries in the matrix. Is this correct thinking? How can it be made mathematically precise?

Imagine enumerating the (infinite) space of all linear combinations of the variables $X_1, X_2, \dots, X_n$ of a matrix of data and doing linear regression on each such combination to measure how much of the rows of the matrix the combination can 'explain'. Is there an interpretation of what PCA is doing in terms of this operation? I.e., how would PCA select the 'best' linear combinations in this procedure? I realize this procedure is obviously not computationally feasible; I only present it to try to make the link between PCA and linear regression. This procedure works directly with linear combinations of the columns of a matrix, so it does not require them to be orthogonal.
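
To make the thought experiment concrete, here is a minimal numerical sketch of the procedure I have in mind. It assumes the columns are centered, that "explaining" the matrix means regressing every column on the single combination $z = Xw$ by ordinary least squares without an intercept, and that the score is the total squared residual. The data, the helper name `residual_after_regression`, and the random sampling of candidate weights are all made up for illustration; sampling is only a stand-in for the infeasible enumeration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 rows, 3 correlated columns (not from any real dataset).
n = 200
latent = rng.normal(size=(n, 1))
X = latent @ np.array([[2.0, 1.0, -1.0]]) + 0.3 * rng.normal(size=(n, 3))
X = X - X.mean(axis=0)  # center each column


def residual_after_regression(X, w):
    """Regress every column of X on z = X @ w (no intercept, X is centered)
    and return the total squared residual over all entries of X."""
    z = X @ w
    coef = (z @ X) / (z @ z)    # OLS slope of each column on z
    fitted = np.outer(z, coef)  # rank-1 prediction of the whole matrix
    return np.sum((X - fitted) ** 2)


# Weights of the first principal component, taken from the SVD of X.
_, s, Vt = np.linalg.svd(X, full_matrices=False)
w_pca = Vt[0]
pca_residual = residual_after_regression(X, w_pca)

# A finite sample of other linear combinations, standing in for "all" of them.
candidates = rng.normal(size=(1000, X.shape[1]))
random_residuals = np.array(
    [residual_after_regression(X, w) for w in candidates]
)

print("residual using first PC weights:", pca_residual)
print("best residual among random candidates:", random_residuals.min())
```

Under these assumptions the fitted matrix is always of rank at most one, so by the Eckart-Young theorem no combination can leave a smaller residual than the first principal component does, and that residual equals the sum of the squared singular values after the first. The sampled candidates can only approach this value from above.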