I'm trying to perform canonical correlation analysis (CCA) between matrices $X$ ($n \times p$) and $Y$ ($n \times k$), with covariance matrices $S_{X}=XX^T/(n-1)$ and $S_{Y}=YY^T/(n-1)$ respectively (assuming $X$ and $Y$ are centered), and $n < p$, $n < k$ (fat matrices).
The first step in CCA is computing the inverse square roots of the covariances, i.e., $S_{X}^{-1/2}$ and $S_{Y}^{-1/2}$. I was going to do this via eigendecomposition of $S_{X}$, since $S_{X}^{-1/2} = U D^{-1/2} U^T$, and similarly for $S_{Y}$. I'm using a rank-$d$ decomposition ($S_X^{-1/2} \approx U_d D^{-1/2}_d U_d^T$) with $d < n$, rather than a full eigendecomposition, because (a) it's faster and (b) I'm assuming the data is low rank, so there's no need to model the full eigenspectrum (many of the small eigenvalues reflect random noise rather than signal).
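For concreteness, here is a minimal numpy/scipy sketch of the truncated approach for one matrix, with hypothetical sizes `n`, `p`, `d` and synthetic low-rank data (none of these names come from my actual code):

```python
import numpy as np
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(0)
n, p, d = 50, 200, 10  # hypothetical: n samples, p features, rank-d truncation

# Synthetic approximately rank-d data, centered, following the
# S_X = X X^T / (n-1) convention above
X = rng.standard_normal((n, d)) @ rng.standard_normal((d, p)) \
    + 0.01 * rng.standard_normal((n, p))
X -= X.mean(axis=0)
S = X @ X.T / (n - 1)  # n x n

# Top-d eigenpairs only; eigsh avoids the full O(n^3) eigendecomposition
vals, U_d = eigsh(S, k=d, which='LA')

# Rank-d approximation of S^{-1/2}: U_d D_d^{-1/2} U_d^T
S_inv_sqrt_d = (U_d / np.sqrt(vals)) @ U_d.T
```

On the retained subspace this behaves exactly like the true inverse square root, i.e. `S_inv_sqrt_d @ S @ S_inv_sqrt_d` equals the projector `U_d @ U_d.T`; everything orthogonal to $U_d$ is simply annihilated.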
This seems to me equivalent to assuming that the last $n-d$ eigenvalues are zero, so that I can ignore the last $n-d$ eigenvectors. But would it be better (in terms of approximating the full-rank decomposition) to first add a small ridge term to the diagonal of the covariance, i.e., work with $S_{X}+\lambda I_n$, thus assuming that the remaining eigenvalues equal $\lambda$? Where would I get the extra $n-d$ orthonormal vectors to act as eigenvectors? I suppose I could use QR, but I'm trying to avoid that cost on a large matrix, which is the main reason for using a low-rank eigendecomposition in the first place.
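One observation worth sketching numerically: under exactly the assumption above (remaining eigenvalues all equal to $\lambda$), every vector orthogonal to $U_d$ is an eigenvector of $S_X + \lambda I_n$ with eigenvalue $\lambda$, so the complement contributes $\lambda^{-1/2}(I_n - U_d U_d^T)$, a scaled projector that needs no explicit eigenvectors. A toy sketch with stand-in eigenpairs (all names and sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, lam = 50, 10, 1e-3  # hypothetical sizes and ridge value

# Stand-in for the top-d eigenpairs of S_X: orthonormal U_d, positive eigenvalues D_d
U_d, _ = np.linalg.qr(rng.standard_normal((n, d)))
D_d = rng.uniform(1.0, 5.0, d)

# Assuming the trailing n-d eigenvalues of S_X are zero, the complement of U_d
# consists of eigenvectors of S_X + lam*I with eigenvalue lam, so
# (S_X + lam*I)^{-1/2} splits into a rank-d part plus a scaled projector:
inv_sqrt = (U_d / np.sqrt(D_d + lam)) @ U_d.T \
    + (np.eye(n) - U_d @ U_d.T) / np.sqrt(lam)
```

If $S_X$ really were $U_d \, \mathrm{diag}(D_d) \, U_d^T$ exactly, this `inv_sqrt` would be the exact inverse square root of $S_X + \lambda I_n$, with no QR step on a large matrix.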