Frobenius norm of a product of Gaussian matrices

301 Views Asked by At

Suppose $$C_n=X_1 X_2\cdots X_n$$ where $X_i$ is $d\times d$ matrix with IID entries sampled with normal centered at 0 and variance $1/d$.

The following appears to be true for large $d$, why?

$$\|C_n C_n^T\|_F^2\approx d(n+1)$$

Here are some numbers from a 4 samples with $d=1000$

$$ \begin{array}{c|ccccc} & \text{n} & \text{sample1} & \text{sample2} & \text{sample3} & \text{sample4} \\ \hline & 1 & 2003.99 & 1998.66 & 1999.51 & 1998.14 \\ & 2 & 3029.97 & 2990.12 & 3008.21 & 2999.13 \\ & 3 & 3967.81 & 3995.46 & 4022.33 & 4005.2 \\ & 4 & 5027.41 & 5075.39 & 4941.94 & 5057.4 \\ & 5 & 6143.21 & 5964.35 & 5844.76 & 6015.08 \\ \end{array} $$

Notebook

It also appears to hold if I use the same matrix for all $X_i$, ie, $$\|X^n (X^T)^n\|^2_F \approx d(n+1)$$

Dividing entries of table by $d$ below we get near perfect agreement in table below with $d=4000$ using either $X^n$ (fixed) or $C_n$ (resampled)

enter image description here

(crossposted on stats.SE)

2

There are 2 best solutions below

0
On

I was working on something similar to the (seemingly correct) answer you got on stats.SE. My approach was basically identical, with the difference that I was using the fact that

$$ ||M||^2_{F} = Tr(MM^T) $$

But the calculations are even more tedious in that case.


  • EDIT: I think that combining this property of the Frobenius norm with the answer you got to this question you should be able to obtain the result.
0
On

First I will give a rigorous proof for the case $n=1$.

For convenience $\|C_n^\mathsf TC_n\|^2_F$ instead of $\|C_nC_n^\mathsf T\|_F^2$ will be calculated. This is obviously equivalent. Suppose $\mathrm X_1=(\mathbf x_1,\dots,\mathbf x_d)$, then $\mathbf x_i\sim N_d(\mathbf 0,\frac1dI_d)$. The element of $\mathrm{X}_1^\mathsf T\mathrm X_1$ in $(i,j)$ is $\mathbf x_i^\mathsf T\mathbf x_j$.

For $i=j$, $$ \mathbb E[(\mathbf x_i^\mathsf T\mathbf x_i)^2]=\frac1{d^2}\mathbb E[((\sqrt d\mathbf x_i)^\mathsf T(\sqrt d\mathbf x_i))^2]=\frac1{d^2}(2d+d^2)=1+\frac2d. $$ For $i\neq j$, $$ \begin{aligned}\mathbb E[(\mathbf x_i^\mathsf T\mathbf x_j)^2]&=\mathbb E[\mathrm{tr}(\mathbf x_j\mathbf x_j^\mathsf T\mathbf x_i\mathbf x_i^\mathsf T)]=\mathrm{tr}(\mathbb E[\mathbf x_j\mathbf x_j^\mathsf T]\mathbb E[\mathbf x_i\mathbf x_i^\mathsf T])\\&=\mathrm{tr}\Big(\frac1{d^2}I_d\Big)=\frac1d.\end{aligned} $$

Hence $$ \mathbb E[\|\mathrm X_1^\mathsf T\mathrm X_1\|_{\mathrm F}^2]=d\Big(1+\frac2d\Big)+(d^2-d)\frac1d=2d+1. $$


For the case $n>1$, here only some approximation is given. Note that if $n$ is large the approximation may be bad since the approximation of the case $n+1$ is based on that of $n$.

For simplicity only $\|\mathrm C_2^\mathsf T\mathrm C_2\|_\mathrm F^2$ will be calculated based on $\mathrm C_1=\mathrm X_2$. The $(i,j)$-th element of $\mathrm C_2^\mathsf T\mathrm C_2$ is $\mathbf x_i^\mathsf T\mathrm C_1^\mathsf T\mathrm C_1\mathbf x_j$. Denote the $(p,q)$-th element of $\mathrm C_1^\mathsf T\mathrm C_1$ as $\mathrm c_{pq}$, then the $\mathbf x_i^\mathsf T\mathrm C_1^\mathsf T\mathrm C_1\mathbf x_j=\sum_{p,q}\mathrm x_{ip}\mathrm c_{pq}\mathrm x_{jq}$. For $i=j$, it is the diagonal element, and its approximation (by taking expectation) is $$ \mathbb E[\mathrm {tr}(\mathbf x_i^\mathsf T\mathrm C_1^\mathsf T\mathrm C_1\mathbf x_i)]=\mathrm{tr}(\mathbb E[\mathbf x_i\mathbf x_i^\mathsf T]\mathbb E[\mathrm C_1^\mathsf T\mathrm C_1])=\frac1d\mathbb E[\mathrm {tr}(\mathrm C_1^\mathsf T\mathrm C_1)]=1. $$

Now we consider the non-diagonal element, i.e., $i\neq j$. Taking square and expectation, we obtain $$ \begin{aligned}\mathbb E\Big(\sum_{p,q}\mathrm x_{ip}\mathrm c_{pq}\mathrm x_{jq}\Big)^2&=\mathbb E\Big[\sum_{p,q,r,s}\mathrm x_{ip}\mathrm x_{jq}\mathrm x_{ir}\mathrm x_{js}\mathrm c_{pq}\mathrm c_{rs} \Big]\\&\stackrel{(?)}=\sum_{p,q}\mathbb E[(\mathrm x_{ip}\mathrm x_{jq})^2]\mathbb E[\mathrm c_{pq}^2]\\&=\sum_{p=1}^d\mathbb E[(\mathrm x_{ip}\mathrm x_{jp})^2]\mathbb E[\mathrm c_{pp}^2]+\sum_{p\neq q}\mathbb E[(\mathrm x_{ip}\mathrm x_{jq})^2]\mathbb E[\mathrm c_{pq}^2]\\&\approx d\cdot\frac1{d^2}\cdot1+(d^2-d)\cdot\frac1{d^2}\cdot\frac1d\\&\approx\frac2d.\end{aligned} $$ Note I haven't check the second equality carefully. Thus $\|\mathrm C_2^\mathsf T\mathrm C_2\|_\mathrm F^2\approx d+(d^2-d)\frac2d\approx 3d$.

The approximation is based on: The diagonal elements of $\mathrm C^\mathsf T\mathrm C$ are about $1$, non-diagonal elements have zero mean but have $n/d$ squared expectation. And the above analysis indicates that these remains true as $n$ increases. Thus, for any $n$, the squared norm has approximation $(n+1)d$.