Possible application of Laws of Large Numbers and Central Limit Theorem to SVD of dual covariance matrices


Let $X:=[x_1\dots x_n] \in \mathbb{R}^{d\times n}$ be a data matrix whose columns $x_i \in \mathbb{R}^{d\times 1}$ are iid random vectors with mean $\mu$ and covariance $\Sigma$. I'm wondering whether there are any applications of the strong law of large numbers and the central limit theorem ($n\to \infty$, $d$ stays fixed) to the covariance $C:= \frac{1}{n}\sum_{i=1}^{n}x_i x_i' = \frac{1}{n}XX'$ and the dual covariance $S:= \frac{1}{n}X'X$, $S_{ij}= \frac{1}{n} x_i'x_j$, that connect these two matrices to their respective eigenvalue decompositions/SVDs?
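To make the $n\to\infty$ regime concrete, here is a minimal numerical sketch, assuming the $\frac{1}{n}$ normalization of $C$ (the one under which a law of large numbers can hold with $d$ fixed). For iid columns with finite second moments, the SLLN gives $\frac{1}{n}XX' \to \Sigma + \mu\mu'$ almost surely. The Gaussian distribution, the particular $\mu$ and $\Sigma$, and the helper name `second_moment_error` are illustrative assumptions, not part of the question.

```python
import numpy as np

def second_moment_error(n, d=3, seed=0):
    """Frobenius distance between C = XX'/n and its SLLN limit Sigma + mu mu'."""
    rng = np.random.default_rng(seed)
    mu = np.arange(1, d + 1, dtype=float)      # illustrative mean vector
    A = rng.standard_normal((d, d))
    Sigma = A @ A.T + np.eye(d)                # illustrative positive-definite covariance
    # Draw n iid columns x_i ~ N(mu, Sigma); transpose to get the d x n data matrix X.
    X = rng.multivariate_normal(mu, Sigma, size=n).T
    C = X @ X.T / n                            # uncentered sample second-moment matrix
    target = Sigma + np.outer(mu, mu)          # E[x_i x_i']
    return np.linalg.norm(C - target)

sample_sizes = (100, 10_000, 400_000)
errors = [second_moment_error(n) for n in sample_sizes]
for n, err in zip(sample_sizes, errors):
    print(f"n = {n:>7d}   ||C - (Sigma + mu mu')||_F = {err:.4f}")
```

The printed error should shrink roughly like $1/\sqrt{n}$, which is the scale a CLT for $C$ would describe. Note that the limit is $\Sigma + \mu\mu'$ rather than $\Sigma$, because $C$ is not centered.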

The reason I think such an SLLN or CLT might exist is that (1) the $x_i$'s are iid, and (2) taking a square root of the covariance or the dual covariance (see the definition of $\hat{X}_{d}$ below) gives a sort of reconstruction of the original data matrix $X$.

To be more precise, let $S=\sum_{i=1}^{n}\lambda_i u_i u_i' = U\Lambda U'$, with $U:=[u_1\dots u_n]$ and $\lambda_1 \ge \dots \ge \lambda_n \ge 0$, be the SVD (equivalently, the eigenvalue decomposition) of the dual covariance matrix $S$. Since $\operatorname{rank}(S) \le d$, we can also write $S=\sum_{i=1}^{d}\lambda_i u_i u_i'$. Let us take the $d \times n$ matrix $\hat{X}_d := \Lambda_d^{1/2} U_d'$, where $\Lambda_d := \operatorname{diag}(\lambda_1, \dots, \lambda_d)$ and $U_d:=[u_1 \dots u_d] \in \mathbb{R}^{n \times d}$. Then $\hat{X}_d = \sum_{i=1}^{d}\sqrt{\lambda_i}\, e_i u_i'$, where $e_i, 1 \le i \le d,$ form the canonical basis of $\mathbb{R}^d$. Note that $\hat{X}_d' \hat{X}_d= \sum_{i=1}^{d}\lambda_i u_iu_i'= S = \frac{1}{n}X'X$. So it seems that $\hat{X}_d$ is a $d$-dimensional reconstruction of $X$, up to the scaling $\frac{1}{\sqrt n}$(?)
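The construction above can be checked numerically. The sketch below builds $\hat{X}_d$ in the square-root form $\sum_{i=1}^{d}\sqrt{\lambda_i}\, e_i u_i'$ and, with $S = \frac{1}{n}X'X$ as defined, verifies that its Gram matrix reproduces $S$. It also illustrates a caveat: a Gram matrix determines a matrix only up to a left orthogonal factor, so $\hat{X}_d$ agrees with $X/\sqrt{n}$ only after applying some orthogonal $Q$, not column by column. The random data, the seed, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 50
X = rng.standard_normal((d, n)) + 1.0        # illustrative d x n data matrix

S = X.T @ X / n                              # n x n dual covariance, rank <= d
lam, U = np.linalg.eigh(S)                   # eigh returns eigenvalues in ascending order
order = np.argsort(lam)[::-1][:d]            # indices of the d largest eigenvalues
lam_d, U_d = lam[order], U[:, order]

# X_hat = Lambda_d^{1/2} U_d', a d x n matrix whose k-th row is sqrt(lambda_k) u_k'.
X_hat = np.sqrt(lam_d)[:, None] * U_d.T

# Gram identity: X_hat' X_hat = S = X'X / n.
gram_ok = np.allclose(X_hat.T @ X_hat, S)

# Recover the d x d map Q with X_hat = Q (X / sqrt(n)) and check that it is orthogonal.
Y = X / np.sqrt(n)
Q = X_hat @ np.linalg.pinv(Y)
q_orthogonal = np.allclose(Q @ Q.T, np.eye(d))
same_up_to_Q = np.allclose(Q @ Y, X_hat)

# Direct column-by-column agreement generally fails.
columns_equal = np.allclose(X_hat, Y)

print(gram_ok, q_orthogonal, same_up_to_Q, columns_equal)
```

In this sketch $Q$ is the transpose of the eigenvector matrix of $\frac{1}{n}XX'$ (up to sign choices), so any columnwise comparison between $\hat{X}_d$ and $X$ has to account for both this rotation and the $\frac{1}{\sqrt n}$ scaling.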

QUESTION: Then can we say that the $i$-th column of $\hat{X}_d$, which is given by:

\begin{align} (\hat{X}_{d})_i &= \begin{bmatrix} \sqrt{\lambda_1}\, u_{i1} \\ \sqrt{\lambda_2}\, u_{i2} \\ \vdots \\ \sqrt{\lambda_d}\, u_{id} \end{bmatrix} \end{align}

satisfies $(\hat{X}_{d})_i - x_i \to 0$ almost surely, or that $\sqrt n ( (\hat{X}_{d})_i - x_i )\to \mathcal{N}$ in distribution, where $\mathcal{N}$ is an appropriate normal distribution?