Why is it important for a correlation matrix to be positive semidefinite?


Now I understand the definition of positive semidefiniteness, but I am struggling to understand why a correlation matrix must be positive semidefinite. What are the effects of negative eigenvalues in relation to correlation matrices?


There are 2 answers below.

BEST ANSWER

Note that I've written this answer in terms of covariance matrices, but a correlation matrix is simply a scaled covariance matrix where

$\rho_{i,j}=\frac{C_{i,j}}{\sqrt{C_{i,i}} \sqrt{C_{j,j}}}$.

It's relatively easy to show that $C$ is positive semidefinite if and only if $\rho$ is positive semidefinite.
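As a quick numerical illustration (a sketch using NumPy; the example matrix and variable names are mine), scaling a covariance matrix to a correlation matrix this way doesn't change whether the eigenvalues are nonnegative:

```python
import numpy as np

# Build an example covariance matrix: A @ A.T is positive semidefinite for any A.
A = np.array([[2.0, 1.0, 0.0],
              [0.5, 1.5, 1.0],
              [0.0, 1.0, 1.0]])
C = A @ A.T

# Scale to a correlation matrix: rho_ij = C_ij / sqrt(C_ii * C_jj).
d = np.sqrt(np.diag(C))
rho = C / np.outer(d, d)

print(np.diag(rho))               # all ones, as a correlation matrix requires
print(np.linalg.eigvalsh(C))      # all nonnegative (up to roundoff)
print(np.linalg.eigvalsh(rho))    # likewise all nonnegative
```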

This answer is written in two parts because the term "covariance matrix" is sometimes used for the sample covariance matrix of a data set and other times for the theoretical covariance matrix of jointly distributed random variables.

First, the sample covariance.

Given a sequence of data vectors $x^{(1)}$, $x^{(2)}$, $\ldots$, $x^{(n)}$, let $X$ be the matrix whose columns are these vectors:

$X=\left[ x^{(1)}, x^{(2)}, \ldots, x^{(n)} \right]$.

The sample mean is

$\bar{x}=\frac{\sum_{i=1}^{n} x^{(i)}}{n}$.

It's convenient to define

$\bar{X}=\bar{x}\left[1, 1, \ldots, 1 \right]$

to get a matrix whose columns are each $\bar{x}$.

The sample covariance matrix is

$C=\frac{1}{n-1} \left( XX^{T} - \bar{X}\bar{X}^{T} \right)$.

To see that this matrix is positive semidefinite, first note that $XX^{T}-\bar{X}\bar{X}^{T}=(X-\bar{X})(X-\bar{X})^{T}$, since $X\bar{X}^{T}=\bar{X}X^{T}=n\bar{x}\bar{x}^{T}=\bar{X}\bar{X}^{T}$. Now take any vector $z$ in $\mathbb{R}^{d}$, where $d$ is the dimension of the data vectors. Then

$z^{T}Cz=\frac{1}{n-1} z^{T}(X-\bar{X})(X-\bar{X})^{T}z$.

$z^{T}Cz=\frac{1}{n-1} ((X-\bar{X})^{T}z)^{T}((X-\bar{X})^{T}z)=\frac{1}{n-1}\| (X-\bar{X})^{T}z \|_{2}^{2} \geq 0$.

Thus $C$ is positive semidefinite.

Note that, in practice, particularly if $n$ is smaller than the dimension of the vectors $x$, $C$ can be numerically singular and might have what appear to be small negative eigenvalues.
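This is easy to see numerically (a sketch with NumPy; the sizes and seed are arbitrary choices of mine). With fewer samples than dimensions, the sample covariance matrix is rank-deficient, so it has eigenvalues that are exactly zero in theory, and floating-point roundoff can push them slightly negative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 4                        # dimension larger than the number of samples
X = rng.standard_normal((d, n))     # columns are the data vectors x^(i)

# Sample covariance via the centered form C = (X - Xbar)(X - Xbar)^T / (n - 1).
xbar = X.mean(axis=1, keepdims=True)
C = (X - xbar) @ (X - xbar).T / (n - 1)

eigs = np.linalg.eigvalsh(C)
print(eigs)                         # at most n-1 strictly positive; the rest near zero
print(np.linalg.matrix_rank(C))     # at most n-1
```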

Next, the covariance matrix of a jointly distributed vector of random variables.

Suppose that the random variables $X_{1}$, $X_{2}$, $\ldots$, $X_{n}$ are jointly distributed, and write $X$ for the column vector $\left[X_{1}, X_{2}, \ldots, X_{n}\right]^{T}$. We'll assume that $E[X]=0$, but it's an easy exercise to extend this to a nonzero mean $\mu$. Then

$\mbox{Cov}(X)=C=E[XX^{T}]$.

To show that $C$ is positive semidefinite, take any vector $z$ in $R^{n}$, and consider the random variable

$W=z^{T}X=z_{1}X_{1}+z_{2}X_{2}+\ldots+z_{n}X_{n}$.

Since $E[X_{i}]=0$ for each $i$, $E[W]=0$, and

$\mbox{Var}(W)=E[W^{2}]$.

$\mbox{Var}(W)=E[(z^{T}X)^{2}]=E[z^{T}XX^{T}z]=z^{T}E[XX^{T}]z$.

$\mbox{Var}(W)=z^{T}Cz$.

But the variance of every random variable is greater than or equal to 0, so

$\mbox{Var}(W)=z^{T}Cz \geq 0$.
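A quick simulation (a sketch; the distribution, the covariance matrix, and the vector $z$ are arbitrary choices of mine) confirms that the variance of $W=z^{T}X$ matches the quadratic form $z^{T}Cz$, and in particular is nonnegative:

```python
import numpy as np

rng = np.random.default_rng(1)

# A positive definite covariance matrix for the mean-zero random vector X.
C = np.array([[2.0, 0.6, 0.3],
              [0.6, 1.0, 0.2],
              [0.3, 0.2, 0.5]])
z = np.array([1.0, -2.0, 0.5])

# Draw many samples of X ~ N(0, C) and form W = z^T X for each sample.
X = rng.multivariate_normal(np.zeros(3), C, size=200_000)
W = X @ z

# The empirical Var(W) agrees with z^T C z up to sampling error; both are >= 0.
print(W.var(), z @ C @ z)
```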

In practice, we usually try to avoid working with covariance matrices that are positive semidefinite but singular and prefer to work with covariance matrices that are actually positive definite.


The fact that a correlation matrix is positive semidefinite (p.s.d.) is a property, not a desired attribute. Note that this is a theoretical fact: some algorithms may generate matrices with negative eigenvalues due to computational and floating-point error.

Now let's take this discussion one step back so we can understand it better afterwards. The correlation matrix is part of a decomposition of the covariance matrix, as shown below: $$ \Sigma = \operatorname{diag}(\sigma) C \operatorname{diag}(\sigma) $$ where $\operatorname{diag}(\sigma)$ is a diagonal matrix with the standard deviations as its entries. Also notice that a p.s.d. matrix satisfies the following:

$$\forall x \in \mathbb{R}^{n}, \; x^\top\Sigma x \geq 0$$

A useful fact about p.s.d. matrices that follows from the above statement is that their diagonal elements $\Sigma_{ii}$ are greater than or equal to $0$: just choose $x_{j}=0$ if $j \neq i$ and $x_{j}=1$ if $j = i$, which gives $x^{\top}\Sigma x=\Sigma_{ii}\geq 0$.

So, we know the correlation matrix is p.s.d.. What does that mean?

Notice that if $C$ isn't p.s.d. you may get negative variances on the diagonal, as pointed out by user @leonbloy in the comments. Notice also that $C$ p.s.d. implies $\Sigma$ p.s.d., because for every matrix $P$, congruence transformations of the form $P^{\top} C P$ preserve positive semidefiniteness; here $P=\operatorname{diag}(\sigma)$, which is symmetric.
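A small numerical check of this congruence argument (a sketch; the values of $C$ and $\sigma$ are mine, with $P=\operatorname{diag}(\sigma)$):

```python
import numpy as np

# A positive semidefinite correlation matrix C and arbitrary standard deviations.
C = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
sigma = np.array([2.0, 0.5, 3.0])

# P is diagonal, hence symmetric, so P C P is the congruence P^T C P.
P = np.diag(sigma)
Sigma = P @ C @ P

print(np.linalg.eigvalsh(C))      # all >= 0
print(np.linalg.eigvalsh(Sigma))  # still all >= 0
```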

Another interesting property is that if $\Sigma$ is positive definite then $\Sigma^{-1}$ is positive definite as well. This leads us to another problematic fact: if $\Sigma$ is not positive definite, we can't guarantee the following expression is nonnegative, nor convex in $y$: $$ (y-\mu)^\top\Sigma^{-1}(y-\mu) $$ This is a commonly encountered expression in statistics, known as the Mahalanobis distance, and it also appears in the multivariate Gaussian distribution. Let $X$ be $MVN(\boldsymbol\mu, \boldsymbol\Sigma)$; then its density is given by

$$f_{\mathbf X}(x_1,\ldots,x_k) = \frac{\exp\left(-\frac 1 2 ({\mathbf x}-{\boldsymbol\mu})^\mathrm{T}{\boldsymbol\Sigma}^{-1}({\mathbf x}-{\boldsymbol\mu})\right)}{\sqrt{(2\pi)^k|\boldsymbol\Sigma|}}$$

Notice that the best case is a positive definite $C$, because a merely p.s.d. matrix lets one or more variables be constant (a $0$ on the diagonal means a degenerate dimension, so it's better to work in $N-1$ dimensions) or an exact linear combination of the others; in that case $|\boldsymbol\Sigma|=0$, $\boldsymbol\Sigma^{-1}$ does not exist, and the $MVN$ pdf above becomes undefined.
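To see why this matters in practice, here is a sketch (NumPy; the matrix entries are mine) of a symmetric matrix with unit diagonal and entries in $[-1,1]$ that is nevertheless not a valid correlation matrix: it has a negative eigenvalue, so the Cholesky factorization typically used to sample from a $MVN$ fails on it:

```python
import numpy as np

# Symmetric, unit diagonal, entries in [-1, 1], yet NOT positive semidefinite:
# X1 and X2 are strongly positively correlated, so their correlations with X3
# cannot point in opposite directions this strongly.
R = np.array([[ 1.0,  0.9, -0.9],
              [ 0.9,  1.0,  0.9],
              [-0.9,  0.9,  1.0]])

print(np.linalg.eigvalsh(R))      # one eigenvalue is negative

try:
    np.linalg.cholesky(R)         # step used when sampling from a MVN
except np.linalg.LinAlgError as e:
    print("Cholesky failed:", e)
```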