Replace likelihood of multiple Gaussian observations with likelihood of mean?


Suppose I have $N$ independent observations from a Gaussian data-generating process with known variance $\sigma^2$. I would assume that the Bayesian likelihood function can be written as: $$ p(y | \mu, \sigma) = \prod_{i = 1}^N p(y_i| \mu, \sigma) = \prod_{i = 1}^N N(y_i| \mu, \sigma^2) $$

I'm reading a book where in this situation, the likelihood is actually stated as $$ p(y | \mu, \sigma) \propto N(\overline{y}| \mu, \sigma^2/N) $$ where $\overline{y}$ is the sample mean. It is stated that this is possible because $\overline{y}$ is a sufficient statistic.

Can somebody explain to me how this works? To me they do not look the same, since the first one still has some ordering on the $y_i$. Admittedly I'm not sure that this ordering matters since the $y_i$ are assumed to be exchangeable. Perhaps this could be illustrated for the case where $N = 2$? Thanks in advance.

Best answer

The variance of the distribution, $\sigma^2$, is known, so it is a constant and the likelihood depends only on $\mu$. Furthermore, $N$ and the $y_i$ are fixed by the data, so any factor that does not involve $\mu$ can be absorbed into the proportionality constant.

$$\begin{split}p(\mathbf y|\mu)&=\frac 1 {(2\pi\sigma^2)^{N/2}}e^{-\frac 1{2\sigma^2}\sum_{i=1}^N (y_i-\mu)^2}\\ &\propto e^{-\frac1{2\sigma^2}\sum_{i=1}^N(y_i^2-2y_i\mu+\mu^2)}\\ &=e^{-\frac 1{2\sigma^2}\left(\sum_{i=1}^N y_i^2-2\mu\sum_{i=1}^N y_i+N\mu^2\right)}\\ &\propto e^{-\frac 1{2\sigma^2}\left(-2\mu\sum_{i=1}^N y_i+N\mu^2\right)}\end{split}$$

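Now for the proposed distribution, using $N\bar y=\sum_{i=1}^N y_i$ and dropping the factor $e^{-\frac N{2\sigma^2}\bar y^2}$, which does not involve $\mu$:

$$\begin{split}p(\bar y |\mu)&=\frac 1{(2\pi\sigma^2/N)^{1/2}}e^{-\frac N{2\sigma^2}(\bar y - \mu)^2}\\ &\propto e^{-\frac N{2\sigma^2}(\bar y^2-2\bar y \mu+\mu^2)}\\ &=e^{-\frac 1{2\sigma^2}\left(N\bar y^2-2\mu N\bar y+N\mu^2\right)}\\ &\propto e^{-\frac 1{2\sigma^2}\left(-2\mu\sum_{i=1}^N y_i+N\mu^2\right)}\end{split}$$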
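
For the $N = 2$ case raised in the question, the same comparison can be written out in full. This is only a sketch: it relies on the identity $(y_1-\mu)^2+(y_2-\mu)^2=\tfrac12(y_1-y_2)^2+2(\bar y-\mu)^2$ with $\bar y=(y_1+y_2)/2$, which is not used in the general derivation but follows by expanding both sides.

$$\begin{split}p(y_1,y_2|\mu)&=\frac 1{2\pi\sigma^2}e^{-\frac 1{2\sigma^2}\left[(y_1-\mu)^2+(y_2-\mu)^2\right]}\\ &=\frac 1{2\pi\sigma^2}e^{-\frac{(y_1-y_2)^2}{4\sigma^2}}\,e^{-\frac{(\bar y-\mu)^2}{\sigma^2}}\\ &=\underbrace{\frac 1{2\sqrt{\pi\sigma^2}}e^{-\frac{(y_1-y_2)^2}{4\sigma^2}}}_{\text{free of }\mu}\;\underbrace{\frac 1{\sqrt{\pi\sigma^2}}e^{-\frac{(\bar y-\mu)^2}{\sigma^2}}}_{N(\bar y|\mu,\,\sigma^2/2)}\end{split}$$

The individual observations enter only through $(y_1-y_2)^2$, which is symmetric in $y_1, y_2$ and does not involve $\mu$, so that factor is absorbed into the proportionality constant.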

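Since both $p(\mathbf y|\mu)$ and $p(\bar y|\mu)$ are proportional to the same function of $\mu$, we have $p(\mathbf y|\mu, \sigma^2)\propto N(\bar y|\mu, \sigma^2/N)$, with the proportionality understood as a statement about functions of $\mu$. The two sides are not equal as functions of the data: the dropped factors depend on the individual $y_i$ but not on $\mu$, so they do not affect inference about $\mu$.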
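
A quick numerical check of this proportionality is sketched below. It is illustrative only: the helper names (`normal_logpdf`, `full_loglik`, `mean_loglik`), the simulated data, and the value of `sigma2` are assumptions made for the example, not part of the question. If the two likelihoods are proportional in $\mu$, the difference of their logs should be the same at every value of $\mu$.

```python
import math
import random


def normal_logpdf(x, mean, var):
    """Log density of N(x | mean, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def full_loglik(ys, mu, sigma2):
    """Log of prod_i N(y_i | mu, sigma2)."""
    return sum(normal_logpdf(y, mu, sigma2) for y in ys)


def mean_loglik(ys, mu, sigma2):
    """Log of N(ybar | mu, sigma2 / N)."""
    n = len(ys)
    ybar = sum(ys) / n
    return normal_logpdf(ybar, mu, sigma2 / n)


random.seed(0)
sigma2 = 1.5  # known variance (arbitrary value chosen for illustration)
ys = [random.gauss(2.0, math.sqrt(sigma2)) for _ in range(5)]  # simulated data

# Evaluate both log-likelihoods on a grid of candidate means.
mus = [-3.0, -1.0, 0.0, 0.5, 2.0, 4.0]
diffs = [full_loglik(ys, mu, sigma2) - mean_loglik(ys, mu, sigma2) for mu in mus]

# Proportionality in mu means the log difference is constant across the grid.
spread = max(diffs) - min(diffs)
print("log-ratio at each mu:", diffs)
print("spread across mu:", spread)
```

The spread should be at the level of floating-point rounding, while the common value of the log-ratio changes if the data `ys` change, which matches the point that the constant depends on the $y_i$ but not on $\mu$.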