Why is the expectation of the inverse of a product of random matrices with i.i.d. Gaussian entries the identity matrix multiplied by a constant?


I'm reading Stanford's EE270 lecture notes on Large Scale Matrix Computation and Optimization found at: https://web.stanford.edu/class/ee270/scribes/lecture8.pdf

In Section 8.7, there is the expression

$$\mathbb{E}[((U^TS^T)(SU))^{-1}] = I \cdot \mathrm{constant}$$

From context: given a matrix $A_{n \times d}$, the compact SVD of $A$ is taken to get $A = U\Sigma V^T$; $S$ is an $m \times n$ matrix with entries i.i.d. $\mathcal{N}(0,\frac{1}{\sqrt{m}})$; and $I_{k \times k}$ is the identity matrix.

The notes state that the constant equals $\frac{m}{m-d-1}$, which they say is the expected value of a $\chi^2$-type random variable from random matrix theory.

My question is: how can I show this / are there any references for this? Thank you!

1 Answer
You have a small typo: the entries are distributed as $\frac{1}{\sqrt{m}} \mathcal{N}(0, 1) = \mathcal{N}(0, \frac{1}{m})$, not $\mathcal{N}(0, \frac{1}{\sqrt{m}})$.


Because $U$ has orthonormal columns and the Gaussian distribution is rotationally invariant, $G := SU$ is an $m \times d$ matrix whose entries are again i.i.d. $\mathcal{N}(0, \frac{1}{m})$. So it suffices to show $E[(G^\top G)^{-1}] = \frac{m}{m-d-1} I_{d \times d}$.

$G^\top G$ follows a Wishart distribution with scale matrix $\frac{1}{m} I_{d \times d}$ and $m$ degrees of freedom, so $(G^\top G)^{-1}$ follows an inverse-Wishart distribution with scale matrix $m I_{d \times d}$ and $m$ degrees of freedom. The mean of an inverse-Wishart distribution with scale matrix $\Psi$ and $\nu$ degrees of freedom is $\frac{\Psi}{\nu - d - 1}$ (see the Wikipedia page), so $$E[(G^\top G)^{-1}] = \frac{m I_{d \times d}}{m - d - 1},$$ which gives the expression in your notes.
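As a quick sanity check (not from the notes), you can verify the claim numerically with a Monte Carlo estimate; the sizes $n$, $d$, $m$ below are arbitrary choices satisfying $m > d + 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 50, 5, 40  # arbitrary sizes with m > d + 1

# Random n x d matrix with orthonormal columns (plays the role of U).
U, _ = np.linalg.qr(rng.standard_normal((n, d)))

# Monte Carlo estimate of E[((SU)^T (SU))^{-1}] with S having
# i.i.d. N(0, 1/m) entries.
trials = 20000
acc = np.zeros((d, d))
for _ in range(trials):
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    G = S @ U
    acc += np.linalg.inv(G.T @ G)
est = acc / trials

print(np.diag(est))       # each diagonal entry should be near m / (m - d - 1)
print(m / (m - d - 1))
```

The diagonal entries of `est` should cluster around $\frac{m}{m-d-1} \approx 1.176$ and the off-diagonal entries around $0$, consistent with the inverse-Wishart mean formula.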

Some references for the derivation of the mean of the inverse-Wishart distribution: