Normal distribution with standard deviation = I


Suppose a vector $\epsilon \in \mathbb R^d$ is a random vector drawn from the isotropic normal distribution:

$\epsilon \sim \mathcal N(0, I)$

[As in Eq. 1.34 here.]

I suppose $I$ is the identity matrix; I don't understand what it means to draw a vector from a normal distribution with $\mu = 0$ and $\sigma = I$, i.e., when the standard deviation is a matrix. Can anyone explain this?


BEST ANSWER

In this particular case, it means that you draw $d$ times a $N(0,1)$ (real) random variable, and these random variables are independent.

In other words, $\varepsilon=(\varepsilon_1,\dots,\varepsilon_d)$, where the $\varepsilon_i$ are independent and normally distributed with mean $0$ and variance $1$.
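A minimal numpy sketch of this (the dimension `d = 3` and the sample count are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# Drawing from N(0, I) in d dimensions is the same as drawing
# d independent N(0, 1) scalars and stacking them into a vector.
eps = rng.standard_normal(d)  # one draw, shape (d,)

# Empirically: components are uncorrelated with unit variance,
# so the sample covariance is close to the identity matrix.
samples = rng.standard_normal((100_000, d))
cov = np.cov(samples, rowvar=False)
print(np.round(cov, 2))  # approximately np.eye(d)
```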

If instead of the identity you have another symmetric positive semidefinite $d\times d$ matrix $\Sigma$, then $\Sigma_{ij}=\operatorname{Cov}(\varepsilon_i,\varepsilon_j)$. You can also have a non-zero mean vector $\mu=(\mu_1,\dots,\mu_d)$, with $\mu_i=E(\varepsilon_i)$. In that case you'd write $\epsilon\sim N(\mu,\Sigma)$.
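To make the general case concrete, here is a small numpy sketch drawing from $N(\mu,\Sigma)$ and checking that the empirical mean and covariance recover $\mu$ and $\Sigma$ (the particular $\mu$ and $\Sigma$ below are made-up example values):

```python
import numpy as np

rng = np.random.default_rng(1)

# Example values: a 2-d Gaussian with non-zero mean and
# correlated components.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])  # symmetric positive definite

samples = rng.multivariate_normal(mu, Sigma, size=200_000)

# The sample mean and sample covariance approximate mu and Sigma.
print(np.round(samples.mean(axis=0), 2))        # close to mu
print(np.round(np.cov(samples, rowvar=False), 2))  # close to Sigma
```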


Technically, the density of a Gaussian random vector with mean $0$ and (invertible) covariance matrix $K$ is proportional to $f(x) = e^{-\frac{1}{2}x^T K^{-1} x}$, with the constant of proportionality determined by $K$ so that $f$ integrates to $1$. This is just how it's defined, and if you know numerical sampling methods like MCMC, this is all you need to draw samples. If $K = I$, this says the random vector is simply a sample in which each dimension is an independent $N(0,1)$ one-dimensional random variable; you can factor $f(x)$ in this case to see that this is true.

However, it is a fact that any covariance matrix $K$ can be written as $K = AA^T$ for some square matrix $A$ (for instance, via the Cholesky factorization). Plugging this into the density formula, you find that a sample from this Gaussian can be obtained by sampling independent $N(0,1)$ variables, one for each dimension, calling this random vector $y$, and then applying the linear transformation $x = Ay$; indeed $\operatorname{Cov}(Ay) = A I A^T = K$. In practice this is how random samples are efficiently drawn from an arbitrary Gaussian with covariance matrix $K$.
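The transformation trick above can be sketched in a few lines of numpy, using the Cholesky factor as the matrix $A$ (the specific $K$ is an arbitrary example; `np.linalg.cholesky` returns the lower-triangular $L$ with $K = LL^T$):

```python
import numpy as np

rng = np.random.default_rng(2)

# Example covariance matrix (symmetric positive definite).
K = np.array([[2.0, 0.6],
              [0.6, 1.0]])

# Factor K = L L^T with L lower triangular (Cholesky).
L = np.linalg.cholesky(K)

# Sample y ~ N(0, I), then x = L y has covariance L I L^T = K.
y = rng.standard_normal((2, 500_000))  # columns are i.i.d. N(0, I) draws
x = L @ y

print(np.round(np.cov(x), 2))  # close to K
```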