iid Gaussians: $P(Y_1,\ldots,Y_N)=P(\bar{Y})$?


I came across the following statement in the book Probabilistic Machine Learning by Kevin Murphy (Eq. 3.65, page 90):

Let $\mathbf{Y}_1,\ldots,\mathbf{Y}_N$ be $D$-dimensional iid Gaussian random variables with mean $\mathbf{z}$ and covariance $\Sigma$. Then $P(\mathbf{Y}_1=\mathbf{y}_1,\ldots,\mathbf{Y}_N=\mathbf{y}_N|\mathbf{z})=\mathcal{N}(\bar{\mathbf{y}}|\mathbf{z},\frac{1}{N}\Sigma)$.

Below I show my proof of this statement. I suspect there is a much simpler one, but I cannot find it. I would appreciate it if someone could provide a simpler proof.

Claim: Let $\mathbf{Y}_1,\ldots,\mathbf{Y}_N$ be $D$-dimensional iid Gaussian random variables with mean $\mathbf{z}$ and covariance $\Sigma$. Then

\begin{align*} P(\mathbf{Y}_1=\mathbf{y}_1,\ldots,\mathbf{Y}_N=\mathbf{y}_N|\mathbf{z})&=\mathcal{N}(\bar{\mathbf{y}}|\mathbf{z},\frac{1}{N}\Sigma) \end{align*}

Proof: $P(\mathbf{Y}_1=\mathbf{y}_1,\ldots,\mathbf{Y}_N=\mathbf{y}_N|\mathbf{z})=\prod_{i=1}^N\mathcal{N}(\mathbf{y}_i|\mathbf{z},\Sigma)=\prod_{i=1}^N\mathcal{N}(\mathbf{z}|\mathbf{y}_i,\Sigma)=\mathcal{N}(\mathbf{z}|\mathbf{\mu}_L,\Sigma_L)$

The second-to-last equality in the previous equation holds because the argument of a Gaussian density and its mean are interchangeable (the density depends on them only through their difference). The last equality holds because the product of Gaussian densities is a Gaussian density.

To find expressions for $\mathbf{\mu}_L$ and $\Sigma_L$ we complete squares on $P(\mathbf{Y}_1=\mathbf{y}_1,\ldots,\mathbf{Y}_N=\mathbf{y}_N|\mathbf{z})$.

$ \begin{align*} \log P(\mathbf{Y}_1=\mathbf{y}_1,\ldots,\mathbf{Y}_N=\mathbf{y}_N|\mathbf{z})&=\sum_{i=1}^N\log\mathcal{N}(\mathbf{y}_i|\mathbf{z},\Sigma)=\sum_{i=1}^N\log\mathcal{N}(\mathbf{z}|\mathbf{y}_i,\Sigma)\\ &=K_1-\frac{1}{2}\sum_{i=1}^N\left(\mathbf{z}^\intercal \Sigma^{-1}\mathbf{z}-2\mathbf{z}^\intercal \Sigma^{-1}\mathbf{y}_i\right)\\ &=K_1-\frac{1}{2}\left(\mathbf{z}^\intercal N\Sigma^{-1}\mathbf{z}-2\mathbf{z}^\intercal \Sigma^{-1}\sum_{i=1}^N\mathbf{y}_i\right)\\ &=K_1-\frac{1}{2}\mathbf{z}^\intercal N\Sigma^{-1}\mathbf{z}+\mathbf{z}^\intercal N\Sigma^{-1}\bar{\mathbf{y}} \end{align*} $

where $K_1$ absorbs all terms that do not depend on $\mathbf{z}$ (in particular the $\mathbf{y}_i^\intercal\Sigma^{-1}\mathbf{y}_i$ terms).
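To make the identification of $\mathbf{\mu}_L$ and $\Sigma_L$ explicit, the last line can be written as a completed square plus a $\mathbf{z}$-independent constant:

\begin{align*} -\frac{1}{2}\mathbf{z}^\intercal N\Sigma^{-1}\mathbf{z}+\mathbf{z}^\intercal N\Sigma^{-1}\bar{\mathbf{y}}&=-\frac{1}{2}(\mathbf{z}-\bar{\mathbf{y}})^\intercal N\Sigma^{-1}(\mathbf{z}-\bar{\mathbf{y}})+\frac{1}{2}\bar{\mathbf{y}}^\intercal N\Sigma^{-1}\bar{\mathbf{y}}, \end{align*}

which matches the exponent of a Gaussian in $\mathbf{z}$ with precision matrix $N\Sigma^{-1}$ and mean $\bar{\mathbf{y}}$.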

Then $\Sigma_L=\frac{1}{N}\Sigma$ and $\mathbf{\mu}_L=\bar{\mathbf{y}}$.

Therefore $P(\mathbf{Y}_1=\mathbf{y}_1,\ldots,\mathbf{Y}_N=\mathbf{y}_N|\mathbf{z})=\mathcal{N}(\mathbf{z}|\bar{\mathbf{y}},\frac{1}{N}\Sigma)=\mathcal{N}(\bar{\mathbf{y}}|\mathbf{z},\frac{1}{N}\Sigma)$.

Thanks, Joaquin.


A comment in response to my previous question was very illuminating. So, I summarize it here.

@Snoop provided a simple counterexample to my previous claim:

Let $y_n$ and $z$ be scalars, and take $N=2$, $y_1=y_2=z=0$, $\Sigma_y=1$. Then my previous claim states:

$\mathcal{N}(y_1 | z, \Sigma_y) \mathcal{N}(y_2 | z, \Sigma_y) = \mathcal{N}(\bar{y} | z, 1/2\ \Sigma_y)$

iff

$\mathcal{N}(0 | 0, 1) \mathcal{N}(0 | 0, 1) = \mathcal{N}(0 | 0, 1/2)$

iff

$0.3989 \cdot 0.3989 = 0.5642$

iff

$0.1591 = 0.5642$

The last equality is false. Thus my claim is false.
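The numbers above are easy to reproduce. A minimal check in plain Python (no external libraries; `norm_pdf` is my own helper, parameterized by variance):

```python
import math

def norm_pdf(x, mean, var):
    """Univariate Gaussian density N(x | mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Counterexample: N = 2, y1 = y2 = z = 0, Sigma_y = 1, so ybar = 0.
lhs = norm_pdf(0, 0, 1) * norm_pdf(0, 0, 1)  # product of the two likelihood factors
rhs = norm_pdf(0, 0, 0.5)                    # N(ybar | z, Sigma_y / 2)

print(f"lhs = {lhs:.4f}")  # ~0.1592 = (1/sqrt(2*pi))^2
print(f"rhs = {rhs:.4f}")  # ~0.5642 = 1/sqrt(pi)
```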

One problem with my previous proof appears in its first line. The likelihood $P(\mathbf{Y}_1=\mathbf{y}_1,\ldots,\mathbf{Y}_N=\mathbf{y}_N|\mathbf{z})$, viewed as a function of $\mathbf{z}$, is not a pdf (it need not integrate to one over $\mathbf{z}$), whereas $\mathcal{N}(\mathbf{z}|\mathbf{\mu}_L,\Sigma_L)$ is a pdf, so the two cannot be equal in general.
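A scalar sketch of this point in plain Python (the data in `ys` is an arbitrary example of mine): the likelihood, as a function of $z$, has exactly the shape of $\mathcal{N}(z|\bar{y},\Sigma/N)$, but the ratio between the two is a constant different from one.

```python
import math

def norm_pdf(x, mean, var):
    """Univariate Gaussian density N(x | mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

ys = [0.3, -1.2, 0.8]        # arbitrary observed y_1, ..., y_N
N, var = len(ys), 1.0
ybar = sum(ys) / N

for z in [-1.0, 0.0, 0.5, 2.0]:
    likelihood = math.prod(norm_pdf(y, z, var) for y in ys)
    pdf = norm_pdf(z, ybar, var / N)
    print(f"z = {z:4.1f}  likelihood/pdf = {likelihood / pdf:.6f}")
# The ratio is the same for every z but is not 1: the likelihood is only
# *proportional* to N(z | ybar, var/N), not equal to it.
```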

Another problem is that the product of two Gaussian densities is not itself a Gaussian density: it is proportional to one, but in general it does not integrate to one. For example, $\mathcal{N}(x|0,1)\cdot\mathcal{N}(x|0,1)$ is not a density, because it integrates to $\frac{1}{2\sqrt\pi}$ rather than to one.
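The value of that integral is also easy to confirm numerically; a quick sketch in plain Python using a Riemann sum over a wide grid:

```python
import math

def norm_pdf(x, mean, var):
    """Univariate Gaussian density N(x | mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Riemann sum of N(x|0,1) * N(x|0,1) over [-10, 10]; the tails beyond are negligible.
dx = 1e-4
integral = sum(norm_pdf(k * dx, 0, 1) ** 2 for k in range(-100000, 100001)) * dx

print(f"integral       = {integral:.6f}")                       # ~0.282095
print(f"1/(2*sqrt(pi)) = {1 / (2 * math.sqrt(math.pi)):.6f}")   # ~0.282095
```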