Efficient methods to compute PCA for an incrementally growing matrix


I am wondering whether there is an efficient method to compute PCA for a matrix that grows one row at a time. Here is the question as pseudocode:

matrix = []
pca = []  # pca[i] holds the PCA result once the matrix has i + 1 rows
for i in range(9999999):
    matrix = add_one_new_row(matrix)
    pca.append(do_pca(matrix))

I need to compute the PCA every time the matrix grows by a row. The method above gives me the PCA at each step, but I am wondering whether a more efficient algorithm exists, since by the time I compute the PCA for the nth time I already have all the previously computed results.

Best answer

The principal components are the eigenvectors of $X^TX$ (for a centered $X$). Adding a row $v^T$ to $X$ corresponds to a rank-one update of $X^TX$: $$\begin{pmatrix}X \\ v^T\end{pmatrix}^T \begin{pmatrix}X \\ v^T\end{pmatrix} = X^TX + vv^T.$$ You can use the Bunch–Nielsen–Sorensen formula to get the eigenvalues and eigenvectors of $X^TX + vv^T$ from those of $X^TX$. To apply the formula, you first have to change basis so that $X^TX$ is diagonal (full details in this publication), which may negate the speed gains. An alternative formula is presented in this paper.
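As a rough illustration of the rank-one update, here is a minimal Python sketch using NumPy. It is not taken from either linked paper. Write the previous decomposition as $X^TX = Q \operatorname{diag}(d) Q^T$ and set $z = Q^Tv$; the new eigenvalues are then the roots of the secular equation $$1 + \sum_i \frac{z_i^2}{d_i - \lambda} = 0,$$ and the eigenvector for a root $\lambda$ is proportional to $Q(D - \lambda I)^{-1}z$. The function name `rank_one_eig_update` and the plain bisection solver are my own choices for illustration. The sketch assumes the columns of $X$ are already centered, that the old eigenvalues are distinct and sorted in ascending order, and that no component of $z$ is zero; a production implementation would need deflation and a more careful root finder.

```python
import numpy as np


def rank_one_eig_update(eigvals, eigvecs, v, iters=200):
    """Eigendecomposition of C + v v^T, given C = eigvecs @ diag(eigvals) @ eigvecs.T.

    Illustrative only. Assumes eigvals are ascending and distinct and that
    every component of z = eigvecs.T @ v is nonzero (no deflation).
    """
    d = np.asarray(eigvals, dtype=float)
    z = eigvecs.T @ v          # the new row expressed in the old eigenbasis
    z2 = z * z

    def secular(lam):
        # Increasing in lam on each interval between consecutive old eigenvalues.
        return 1.0 + np.sum(z2 / (d - lam))

    # Interlacing: d[i] < new[i] < d[i+1], and d[-1] < new[-1] <= d[-1] + |z|^2.
    upper = np.append(d[1:], d[-1] + z2.sum())
    new_vals = np.empty_like(d)
    for i in range(d.size):
        lo, hi = d[i], upper[i]
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if mid <= lo or mid >= hi:   # interval is down to floating-point resolution
                break
            if secular(mid) > 0.0:
                hi = mid                 # root lies below mid
            else:
                lo = mid                 # root lies above mid
        new_vals[i] = 0.5 * (lo + hi)

    # Column k of W is (D - new_vals[k] I)^{-1} z, then normalized to unit length.
    W = z[:, None] / (d[:, None] - new_vals[None, :])
    W /= np.linalg.norm(W, axis=0)
    return new_vals, eigvecs @ W
```

To use it, compute `np.linalg.eigh(X.T @ X)` once for the initial matrix, then call `rank_one_eig_update(vals, vecs, v)` for each new row `v` and feed the result into the next call. Each update costs $O(d^2)$ for the change of basis plus the root finding, and the final product `eigvecs @ W` is $O(d^3)$ when formed densely, so whether this beats recomputing the decomposition of the $d \times d$ matrix depends on the problem size, as noted above.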