Log determinant of random Hermitian matrix


I have a real-valued $N\times M$ matrix $X$ where $N \gg M$. All the elements in a given row $i$ of $X$ are drawn iid from a non-Gaussian distribution; the mean and variance of the distribution differ between rows, i.e. each row of $X$ is drawn from some distribution specific to that row. Consequently, while $N$ is fixed, $M$ is not: we could use a larger or smaller value of $M$ simply by drawing more (or fewer) values from the distribution corresponding to each row of $X$.

I'm interested in calculating the expected value of the log determinant $$ \log |X^TX + \lambda I| $$

where $\lambda$ is some small constant added for numerical stability to ensure that $X^TX + \lambda I$ is always positive definite. Since $X$ is real-valued, $X^TX$ is clearly symmetric (Hermitian), but it does not follow a Wishart distribution since the entries of $X$ are not Gaussian. Suppose I use a Monte Carlo method: generate multiple draws of $X$ for a given value of $M$, calculate the log determinant of each draw (e.g. using a Cholesky decomposition or SVD), and average to estimate the expected log determinant for that value of $M$.
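The Monte Carlo estimator described above can be sketched as follows. This is a minimal illustration, not the asker's actual setup: the per-row exponential distribution, the `row_means` parameter, and the use of `np.linalg.slogdet` in place of an explicit Cholesky factorization are all assumptions chosen for the sketch.

```python
import numpy as np

def expected_logdet(row_means, M, lam=1e-3, n_draws=200, rng=None):
    """Monte Carlo estimate of E[log det(X^T X + lam * I)].

    Assumption for this sketch: row i of X is drawn iid from an
    exponential distribution with mean row_means[i]; any other
    row-specific non-Gaussian distribution could be substituted.
    """
    rng = np.random.default_rng(rng)
    scales = np.asarray(row_means)[:, None]   # one scale per row, broadcast over columns
    total = 0.0
    for _ in range(n_draws):
        X = rng.exponential(scale=scales, size=(len(row_means), M))
        # slogdet of the M x M Gram matrix avoids overflow/underflow in det
        _, logdet = np.linalg.slogdet(X.T @ X + lam * np.eye(M))
        total += logdet
    return total / n_draws
```

Working with the $M \times M$ Gram matrix rather than $X$ itself keeps each draw cheap, since $N \gg M$.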

Is there any relationship between the expected value of the log determinant and the value of $M$? As I increase the value of $M$ up to $N$, what happens to the expected value of the log determinant? Would I need to estimate it separately for each value of $M$, or is there some way to determine how the number of columns in $X$ (given the assumptions above) relates to the value of the log determinant?

Best answer

For each integer $1 \leqslant i \leqslant M$, let $r_i$ be the $L^2$ random variable with values in $\mathbb{R}^N$ given by the $i$th column of $X$ (the columns are mutually independent, since every entry of $X$ is drawn independently). Decompose $X$ as $X = \begin{pmatrix} Y & r_M \end{pmatrix}$ with $Y$ a random $N \times (M - 1)$ matrix. Then, \begin{align*} \det(X^TX + \lambda I_M) & = \begin{vmatrix} Y^TY + \lambda I_{M - 1} & Y^Tr_M \\ r_M^TY & r_M^Tr_M + \lambda \end{vmatrix}\\ & = \det(Y^TY + \lambda I_{M - 1})\det(r_M^Tr_M + \lambda - r_M^TY(Y^TY + \lambda I_{M - 1})^{-1}Y^Tr_M)\\ &\ \ \ \ \textrm{ by the block determinant (Schur complement) formula},\\ & = (r_M^T(I_N - Y(Y^TY + \lambda I_{M - 1})^{-1}Y^T)r_M + \lambda)\det(Y^TY + \lambda I_{M - 1}). \end{align*} Therefore, $$ \ln(\det(X^TX + \lambda I_M)) = \ln(\det(Y^TY + \lambda I_{M - 1})) + \ln(r_M^TSr_M + \lambda), $$ where $S = I_N - Y(Y^TY + \lambda I_{M - 1})^{-1}Y^T$. Notice that by mutual independence of the $r_i$, $S$ and $r_M$ are independent. Computing the expectation of $\ln(\det(Y^TY + \lambda I_{M - 1}))$ is the same problem with a lower value of $M$. It remains to compute the expectation of $\ln(r_M^TSr_M + \lambda)$.
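The recursion above can be checked numerically. A sketch in NumPy, with standard normal entries standing in for the row-specific distributions; `slogdet` is used purely for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, lam = 40, 6, 1e-3
X = rng.standard_normal((N, M))
Y, r = X[:, :M - 1], X[:, M - 1]          # split off the last column r_M

# Left-hand side: log det(X^T X + lam * I_M)
_, lhs = np.linalg.slogdet(X.T @ X + lam * np.eye(M))

# Right-hand side: one step of the recursion via the Schur complement
A_inv = np.linalg.inv(Y.T @ Y + lam * np.eye(M - 1))
S = np.eye(N) - Y @ A_inv @ Y.T           # S = I_N - Y (Y^T Y + lam I)^{-1} Y^T
_, ld_Y = np.linalg.slogdet(Y.T @ Y + lam * np.eye(M - 1))
rhs = ld_Y + np.log(r @ S @ r + lam)      # + log(r_M^T S r_M + lam)
```

The two quantities agree to floating-point precision.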

$S$ is symmetric, and for any eigenvalue $\mu$ of $S$ there is a nonzero $x$ such that $Sx = \mu x$, i.e. $x - Y(Y^TY + \lambda I_{M - 1})^{-1}Y^Tx = \mu x$. If $Y^Tx = 0$, this reduces to $x = \mu x$, so $\mu = 1$; assume now that $y = Y^Tx \neq 0$. Multiplying on the left by $Y^T$, we obtain $-Y^TY(Y^TY + \lambda I_{M - 1})^{-1}y = (\mu - 1)y$. Writing $Y^TY = Y^TY + \lambda I_{M - 1} - \lambda I_{M - 1}$, we get $$ -(I_{M - 1} - \lambda (Y^TY + \lambda I_{M - 1})^{-1})y = (\mu - 1)y, $$ i.e., $$ \lambda(Y^TY + \lambda I_{M - 1})^{-1}y = \mu y. $$ We have $Y^TY + \lambda I_{M - 1} \geqslant \lambda I_{M - 1}$, so $0 < (Y^TY + \lambda I_{M - 1})^{-1} \leqslant \frac{1}{\lambda}I_{M - 1}$ and hence $0 < \lambda(Y^TY + \lambda I_{M - 1})^{-1} \leqslant I_{M - 1}$. It follows that $0 < \mu \leqslant 1$ in either case. In particular, $S$ is positive definite with eigenvalues at most $1$. This means that $(x, y) \mapsto x^TSy$ is an inner product and $\ln(\lambda) \leqslant \ln(r_M^TSr_M + \lambda) \leqslant \ln(r_M^Tr_M + \lambda)$.
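A quick numerical sanity check that the eigenvalues of $S$ indeed lie in $(0, 1]$ (a NumPy sketch; standard normal entries are an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, lam = 30, 5, 1e-2
Y = rng.standard_normal((N, M - 1))
# S = I_N - Y (Y^T Y + lam I)^{-1} Y^T
S = np.eye(N) - Y @ np.linalg.inv(Y.T @ Y + lam * np.eye(M - 1)) @ Y.T
mu = np.linalg.eigvalsh(S)                # eigenvalues of the symmetric matrix S
```

The spectrum splits as the argument predicts: eigenvalue $1$ with multiplicity $N - (M - 1)$ on the orthogonal complement of the range of $Y$, and $M - 1$ eigenvalues of $\lambda(Y^TY + \lambda I_{M-1})^{-1}$ strictly between $0$ and $1$.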

From this, if we write $\sigma_{i,\lambda}$ for the expectation of $\ln(r_i^Tr_i + \lambda) = \ln(\|r_i\|^2 + \lambda)$, the expectation $E_{M,\lambda}$ of $\ln(\det(X^TX + \lambda I_M))$ satisfies $$ \ln(\lambda) \leqslant E_{M,\lambda} - E_{M - 1,\lambda} \leqslant \sigma_{M,\lambda}, $$ and thus, using $E_{0,\lambda} = 0$ (the determinant of the empty matrix is $1$), $$ M\ln(\lambda) \leqslant E_{M,\lambda} \leqslant \sum_{i = 1}^M \sigma_{i,\lambda}. $$ In particular, if every $r_i$ has norm at most $\sqrt{1 - \lambda}$ with probability $1$, then $\sigma_{i,\lambda} \leqslant 0$ for every $i$, so $E_{M,\lambda}$ is non-increasing in $M$.
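Applying the recursion at every step shows that these two bounds in fact hold sample-by-sample, not just in expectation: each increment is at least $\ln\lambda$ and at most $\ln(\|r_i\|^2 + \lambda)$. This can be checked numerically (a NumPy sketch; standard normal entries are an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, lam = 50, 8, 1e-2
X = rng.standard_normal((N, M))
_, logdet = np.linalg.slogdet(X.T @ X + lam * np.eye(M))

lower = M * np.log(lam)                    # M * ln(lambda)
col_norms_sq = np.sum(X**2, axis=0)        # ||r_i||^2 for each column r_i
upper = np.sum(np.log(col_norms_sq + lam)) # sum_i ln(||r_i||^2 + lambda)
```

For the question's setting this suggests estimating the per-column quantities $\sigma_{i,\lambda}$ once and reusing them to bound $E_{M,\lambda}$ for every $M$, rather than running a separate Monte Carlo estimate per value of $M$.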