Is there a version of the central limit theorem for the sample covariance matrix?
Let $\{X_1,\cdots,X_n,\cdots\}$ be a sequence of i.i.d. length-$p$ random vectors. Suppose their mean is the zero vector and their true covariance matrix is the $p\times p$ positive definite matrix $V$. Define the sample covariance matrix $$S:=\frac{1}{n}\sum_{i=1}^nX_iX_i^T$$
Here are things that I know:
- If the $X_i$'s are normal, then for fixed $n$, $nS$ follows a Wishart distribution: $nS\sim\mathcal{W}_p(n,V)$.
- If $p=1$, then a CLT for the sample variance $s^2$ is available.
Is there a central limit theorem that can be applied to the random matrix $S$? Assume all "necessary assumptions" (e.g. finite fourth moments) hold.
By the Law of Large Numbers, we already know that $$\frac{1}{n}\sum_j{X_jX_j^{\mathsf{T}}}\overset{p}{\to}\mathbb{E}[X_jX_j^{\mathsf{T}}]=V$$ We can consider matrices as elements of the vector space $\mathbb{R}^p\otimes(\mathbb{R}^p)^*$; then the multivariate central limit theorem applies. Said theorem tells us $$\frac{1}{\sqrt{n}}\sum_j{(X_jX_j^{\mathsf{T}}-V)}\overset{\mathcal{D}}{\to}\mathcal{N}(0,\Theta)$$ where $\Theta$ is the covariance matrix of the "vector" $X_jX_j^{\mathsf{T}}-V$; that is, the covariance between the $(k,l)$th and the $(m,q)$th entries of that "vector" (writing $q$ for the last index to avoid a clash with the sample size $n$) is \begin{align*} (e_k\otimes e_l^{\mathsf{T}})\Theta(e_m\otimes e_q^{\mathsf{T}})&=\mathbb{E}[(e_k\otimes e_l^{\mathsf{T}})(X_jX_j^{\mathsf{T}}-V)\cdot(e_m\otimes e_q^{\mathsf{T}})(X_jX_j^{\mathsf{T}}-V)] \\ &=\mathbb{E}[e_k^{\mathsf{T}}(X_jX_j^{\mathsf{T}}-V)e_l\cdot e_m^{\mathsf{T}}(X_jX_j^{\mathsf{T}}-V)e_q] \\ &=\mathbb{E}[((X_j)_k(X_j)_l-V_{kl})((X_j)_m(X_j)_q-V_{mq})] \end{align*} Further simplification is not possible without knowing more about the distribution of $X_j$.
If you are willing to abuse notation, the upshot is that $$\frac{1}{n}\sum_j{X_jX_j^{\mathsf{T}}}\overset{\mathcal{D}}{\approx}\mathcal{N}\!\left(V,\tfrac{1}{n}\Theta\right)$$ for large $n$.
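As a hedged illustration of how one might use this approximation in practice (not from the original answer; the specific $V$, sample size, and plug-in estimator are my own choices): a single entry $S_{kl}$ is approximately $\mathcal{N}(V_{kl},\Theta_{kl,kl}/n)$, so a plug-in estimate of $\Theta_{kl,kl}=\mathrm{Var}(X_kX_l)$ yields an approximate confidence interval for $V_{kl}$.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 2, 5000
V_true = np.array([[2.0, 0.6],
                   [0.6, 1.0]])
L = np.linalg.cholesky(V_true)
X = rng.standard_normal((n, p)) @ L.T   # mean 0, covariance V_true

S = X.T @ X / n

# Plug-in estimate of Theta_{(0,1),(0,1)} = Var(X_0 X_1).
w = X[:, 0] * X[:, 1] - S[0, 1]
theta_01 = np.mean(w ** 2)

# Approximate 95% interval for V_{01}, using S_01 ~ N(V_01, Theta/n).
half = 1.96 * np.sqrt(theta_01 / n)
print(S[0, 1] - half, S[0, 1] + half)
```

With $n=5000$ the half-width is a few hundredths, so the interval sits near the true value $V_{01}=0.6$.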