Can a single random variable have a covariance matrix?


This link about the closed-form analytical solution for the Wasserstein distance says:

Let $\mu_1\in \mathbb{R}^N$ be a normally distributed random variable with expected value $m_1$ and covariance matrix $C_1$.

I never knew there could be a covariance matrix for univariate data, only multivariate data. How is it possible for a single random variable to have a covariance matrix?

3 Answers


The random variable $\mu_1$ takes values in $\mathbb{R}^N$, so it is an $N$-dimensional random vector. Its covariance matrix records the covariance between each pair of components of $\mu_1$.


A covariance matrix exists for any random vector, where a random vector is nothing more than $N$ random variables which are possibly correlated. Since $\mu_1\in \mathbb R^N$, $\mu_1$ is a random vector, and it has a covariance matrix. Specifically, each realization of $\mu_1$ is some vector $$ (X_1,X_2,\dots,X_N), $$ and the $(i,j)$ entry of the covariance matrix is $\text{Cov}(X_i,X_j)=E[X_iX_j]-E[X_i]E[X_j]$, for $i,j\in \{1,\dots,N\}$.
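Here is a small sketch of that formula in NumPy. The mixing matrix `A` and the sample size are made up for illustration; the point is only that computing $E[X_iX_j]-E[X_i]E[X_j]$ entry by entry recovers the covariance matrix of the random vector.

```python
import numpy as np

# Hypothetical random vector in R^3 with correlated components:
# X = A Z with Z standard normal, so the true covariance is A A^T.
rng = np.random.default_rng(0)
n_samples = 200_000
Z = rng.standard_normal((n_samples, 3))
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
X = Z @ A.T  # each row is one realization (X_1, X_2, X_3)

# Entry (i, j) of the covariance matrix: Cov(X_i, X_j) = E[X_i X_j] - E[X_i] E[X_j]
N = X.shape[1]
C = np.empty((N, N))
for i in range(N):
    for j in range(N):
        C[i, j] = np.mean(X[:, i] * X[:, j]) - X[:, i].mean() * X[:, j].mean()

# Agrees with the theoretical covariance A A^T up to sampling noise
print(np.allclose(C, A @ A.T, atol=0.05))
```

In practice one would call `np.cov(X, rowvar=False)` instead of the explicit double loop; the loop is only there to mirror the entry-wise definition above.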

In the other answer, you seem to be saying that there cannot be a covariance matrix, since "there is only one random variable." However, I would say that there are $N$ random variables. Just as a vector in $\mathbb R^N$ has $N$ coordinates, i.e. $N$ real numbers, a random vector in $\mathbb R^N$ has $N$ random variables as coordinates. You seem to be referring to these as "sample observations," but that glosses over the fact that the coordinates may be correlated.

Let us specialize to $N=2$ for a second. Just as a continuous random variable in one dimension should be thought of in terms of its pdf, which is a function defined on $\mathbb R^1$, a random vector in $\mathbb R^2$ should be thought of in terms of its pdf, which is a function defined on the plane $\mathbb R^2$. Here is an example of such a pdf for a normal distribution. Each realization of this random vector is an ordered pair, like $(52, 45)$, which is a joint realization of the two random variables: $X_1=52$ and $X_2=45$. The numbers $52$ and $45$ are not different sample observations; they are coordinates of the same single sample observation. The two coordinates may have different means and variances; in the picture, the $X$ coordinate of that distribution has a mean of around $50$, while the $Y$ coordinate has a mean near $40$.

[Figure: pdf of a bivariate normal distribution over the plane]
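The situation in the figure can be sketched numerically. The means ($\approx 50$ and $\approx 40$) come from the answer above; the variances and correlation are made-up values for illustration.

```python
import numpy as np

# Assumed parameters: means roughly 50 and 40 as in the figure,
# variances and correlation chosen arbitrarily for the sketch.
rng = np.random.default_rng(1)
mean = np.array([50.0, 40.0])
cov = np.array([[4.0, 1.5],
                [1.5, 9.0]])  # X and Y have different variances and are correlated
samples = rng.multivariate_normal(mean, cov, size=100_000)

# One realization is a single ordered pair (x1, x2) -- one sample observation,
# not two separate observations.
x1, x2 = samples[0]

# The coordinate-wise sample means recover the mean vector (50, 40)
print(samples.mean(axis=0))
```

Note that each row of `samples` is one draw of the whole random vector; averaging down the rows estimates the mean of each coordinate separately.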


When you have a single random variable $X$, it is described by its mean $E(X)$ and its variance $\mathrm{Var}(X)$. However, if you have two random variables $X$ and $Y$, the mean becomes the vector $(E(X), E(Y))$ and the second moments are collected in a covariance matrix. Since we have two random variables, this matrix is $2\times 2$: its diagonal elements are $\mathrm{Var}(X)$ and $\mathrm{Var}(Y)$, and its off-diagonal elements are $\mathrm{Cov}(X,Y)$. Now $\mu_1 \in \mathbb{R}^N$ has $N$ components, so it is a vector, $\mu_1 = [\mu_{1,1}, \mu_{1,2}, \dots, \mu_{1,N}]^T$, and its covariance matrix $C_1$ is $N \times N$.
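The claim about the diagonal can be checked directly: the diagonal of the $N \times N$ covariance matrix holds the variances of the components. The dimensions and scales below are arbitrary choices for the sketch.

```python
import numpy as np

# Sketch: for an N-dimensional random vector, the diagonal of its
# covariance matrix C1 holds the component variances.
rng = np.random.default_rng(2)
N = 4
# Independent components with different scales (made-up values)
samples = rng.standard_normal((50_000, N)) * np.array([1.0, 2.0, 3.0, 4.0])

C1 = np.cov(samples, rowvar=False)        # N x N sample covariance matrix
variances = samples.var(axis=0, ddof=1)   # Var of each component (same ddof as np.cov)

print(np.allclose(np.diag(C1), variances))  # prints True
```

`np.cov` uses the unbiased estimator (`ddof=1`) by default, which is why the component variances are computed with `ddof=1` as well.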