Does the MLE estimator of the covariance of a multivariate Gaussian have a non-zero determinant?


The maximum-likelihood estimator of the covariance of a multivariate Gaussian is

$$ \mathbf{\Sigma} = \frac{1}{N} \sum_{n=1}^N \left( (\mathbf{x}_n-\boldsymbol{\mu}) (\mathbf{x}_n-\boldsymbol{\mu})^\intercal \right). $$

Can we prove that its determinant is non-zero?

Accepted answer:

The determinant of $\Sigma$ may be zero.

Let $k$ be the dimension of the space containing the sample vectors $x_n$. Since $\Sigma$ is a sum of $N$ rank-one matrices $(\mathbf{x}_n-\boldsymbol{\mu})(\mathbf{x}_n-\boldsymbol{\mu})^\intercal$, its rank is at most $N$. Hence if $N < k$ then $\Sigma$ is not of full rank and therefore $\det \Sigma = 0$.
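The rank-deficient case $N < k$ is easy to check numerically. A minimal NumPy sketch (variable names are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

k, N = 5, 3                        # dimension k exceeds the sample size N
X = rng.standard_normal((N, k))    # N samples drawn from N(0, I_k)
mu = X.mean(axis=0)

# MLE covariance: average of N rank-one outer products, so rank(Sigma) <= N
Sigma = sum(np.outer(x - mu, x - mu) for x in X) / N

print(np.linalg.matrix_rank(Sigma))  # at most N (generically N-1, since mu is estimated)
print(np.linalg.det(Sigma))          # zero up to floating-point error
```

Because the sample mean is subtracted, the centered vectors satisfy one linear relation, so the rank is generically $N-1$ rather than $N$.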

Moreover, even when $N \ge k$ it is still possible that $\det \Sigma = 0$. For a simple example, take $k=2$ and suppose the sample vectors $x_n \in \mathbb{R}^2$ are drawn from the distribution of $(X, X)$, where $X \sim \mathcal{N}(0, 1)$; that is, the two components are perfectly correlated (and the mean is $\mu = 0$). Then $x_n = (a_n, a_n) = a_n (1, 1)^T$ for some real numbers $a_n$, $n=1,\dots,N$. Therefore,

$$ \det \Sigma = \det \left(\frac{1}{N} \sum_{n=1}^N x_n x_n^T\right) = \det \left(\frac{1}{N} \sum_{n=1}^N a_n^2 \begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix} \right) = \det \begin{pmatrix}a & a \\ a & a\end{pmatrix} = 0 $$

where $a = \frac{1}{N} \sum_{n=1}^N a_n^2$.
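The correlated example above can be verified numerically. A sketch (seed and sample size are arbitrary choices, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100
a = rng.standard_normal(N)
X = np.column_stack([a, a])    # x_n = (a_n, a_n): perfectly correlated components

# zero-mean MLE formula used in the answer (the true mean is 0)
Sigma = X.T @ X / N

# Sigma = mean(a_n^2) * [[1, 1], [1, 1]], so its determinant vanishes
print(np.linalg.det(Sigma))
```

Here the determinant is zero up to floating-point roundoff, no matter how large $N$ is.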


On the other hand, note that the set

$$ \mathcal{X}_0 = \{(x_1, x_2, \dots, x_N)\,|\, \det \Sigma(x_1, x_2, \dots, x_N) = 0\} $$

of $N$-tuples of samples $x_n$ for which $\det \Sigma = 0$ has dimension one less than that of the set of all $N$-tuples of samples. Therefore, if $N \ge k$ and $\det \Sigma' > 0$, where $\Sigma'$ denotes the covariance matrix of the underlying distribution, then $\mathcal{X}_0$ has measure zero, and consequently $\det \Sigma > 0$ with probability $1$.


Remark about practical applications: in a practical application one would probably not care about the strict equality $\det \Sigma = 0$, but about whether $\Sigma$ is nearly singular (in fact, one may care about the stronger condition that $\Sigma$ be well-conditioned).

By the multivariate law of large numbers, the sample covariance $\Sigma$ converges to $\Sigma'$ as $N\to \infty$. Therefore, if the eigenvalues of $\Sigma'$ are bounded away from zero, then for large $N$ it becomes increasingly unlikely that $\det \Sigma$ is near zero.
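This convergence is visible numerically: the error $\|\Sigma - \Sigma'\|$ shrinks as the sample size grows. A sketch (the specific $\Sigma'$ and sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

k = 3
Sigma_true = np.diag([3.0, 2.0, 1.0])

errors = []
for N in (10, 100, 10_000):
    X = rng.multivariate_normal(np.zeros(k), Sigma_true, size=N)
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / N
    errors.append(np.linalg.norm(S - Sigma_true))
    print(N, errors[-1])   # the Frobenius error shrinks roughly like 1/sqrt(N)
```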

On the other hand, if $\Sigma'$ does have small eigenvalues then one can use Principal Component Analysis to identify a subspace $V$ on which the eigenvalues of the restriction $\Sigma'|_V$ are large and therefore $\det \Sigma|_V > 0$ with high probability.
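The PCA idea can be sketched as follows: diagonalize the sample covariance, discard the directions with small eigenvalues, and work with the restriction to the remaining subspace $V$. The eigenvalues chosen here are hypothetical, just to make one direction nearly degenerate:

```python
import numpy as np

rng = np.random.default_rng(0)

# nearly degenerate truth: one tiny eigenvalue
Sigma_true = np.diag([5.0, 2.0, 1e-8])
k, N = 3, 1000
X = rng.multivariate_normal(np.zeros(k), Sigma_true, size=N)
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / N

# PCA: eigendecompose S and keep the subspace V spanned by the large directions
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
V = eigvecs[:, 1:]                     # drop the smallest direction
S_V = V.T @ S @ V                      # restriction of S to V

print(np.linalg.det(S))    # tiny: S is nearly singular
print(np.linalg.det(S_V))  # bounded away from zero
```

The restricted matrix $S_V$ has determinant close to the product of the two large eigenvalues, while $\det S$ is dragged toward zero by the small one.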


Lower bounds: there are lower bounds on the eigenvalues of $\Sigma$. For example, for isotropic distributions, i.e. those with $\Sigma' = \alpha I$, lower bounds on the smallest eigenvalue of $\Sigma$ are derived in this paper.