Simplify maximum a posteriori


I have the following example of maximum a posteriori (MAP) estimation:

$$\prod_{n=1}^N \mathcal{N}(y_n|\beta x_n,\sigma^2) \mathcal{N}(\beta|0,\lambda^{-1})$$

where I am multiplying the prior by the likelihood function. Taking the logarithm, this can be simplified to

$$\sum_{n=1}^N -\frac{1}{\sigma^2}(y_n-\beta x_n)^2 - \lambda \beta^2 + \mbox{const}.$$

I understand that taking the logarithm turns the product into a sum,

$$\log \prod_{n=1}^N P_n = \sum_{n=1}^N \log P_n,$$

but that's as far as I can get. I'm not sure how the rest of the values were isolated. Can someone please show me the intermediary steps so that I can understand?


Best answer:
  • I presume the original setup is $Y \sim \mathcal{N}(\beta X,\sigma^2)$ with prior $\beta \sim \mathcal{N}(0,\lambda^{-1})$, for some given $\sigma^2$ and $\lambda$.

  • So the prior density for $\beta$ is $\sqrt{\frac{\lambda}{2\pi}}e^{-\lambda \beta^2/2}$, i.e. proportional to $e^{-\lambda \beta^2/2}$ after removing a multiplicative constant not involving $\beta$.

  • Similarly the likelihood of observing $(x_n,y_n)$ is $\frac{1}{\sqrt{2\pi \sigma^2}}e^{-(y_n- \beta x_n)^2/(2\sigma^2)}$, so the likelihood of observing all $N$ of the pairs is proportional to $\prod\limits_{n=1}^N e^{-(y_n- \beta x_n)^2/(2\sigma^2)} = e^{-\sum\limits_{n=1}^N (y_n- \beta x_n)^2/(2\sigma^2)}$, again removing a multiplicative constant not involving $\beta$.

  • This makes the posterior density proportional to the product of the prior and the likelihood, i.e. proportional to $e^{-\lambda \beta^2/2} e^{-\sum\limits_{n=1}^N (y_n- \beta x_n)^2/(2\sigma^2)} = e^{-\lambda \beta^2/2-\sum\limits_{n=1}^N (y_n- \beta x_n)^2/(2\sigma^2)}$.

  • So the logarithm of the posterior density is $-\lambda \beta^2/2-\sum\limits_{n=1}^N (y_n- \beta x_n)^2/(2\sigma^2)$ plus a constant, which is what you have apart from an overall factor of $\frac12$, which makes no difference to the maximisation.

  • The next step is to take the derivative with respect to $\beta$, giving $-\lambda \beta + \sum\limits_{n=1}^N \frac{x_n y_n}{\sigma^2} - \beta\sum\limits_{n=1}^N \frac{x_n^2}{\sigma^2}$, which is zero when $\beta = \dfrac{\sum\limits_{n=1}^N x_n y_n}{\lambda \sigma^2 + \sum\limits_{n=1}^N x_n^2}$, and you can show this is the maximum a posteriori estimate of $\beta$.
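As a sanity check, the closed-form estimate $\beta_{\mathrm{MAP}} = \sum x_n y_n \,/\, (\lambda\sigma^2 + \sum x_n^2)$ can be compared against a direct numerical maximisation of the log posterior. The sketch below uses synthetic data with illustrative values of $\sigma^2$ and $\lambda$ (all of these numbers are made up for the check, not part of the original question):

```python
import numpy as np

# Synthetic data for the check (hypothetical values; sigma^2 and lambda
# are treated as known, as in the derivation above)
rng = np.random.default_rng(0)
N, beta_true, sigma2, lam = 50, 2.0, 0.25, 1.0
x = rng.normal(size=N)
y = beta_true * x + rng.normal(scale=np.sqrt(sigma2), size=N)

# Closed-form MAP estimate: sum(x_n y_n) / (lambda sigma^2 + sum(x_n^2))
beta_map = np.sum(x * y) / (lam * sigma2 + np.sum(x * x))

# Grid maximisation of the log posterior
#   -lambda beta^2 / 2 - sum((y_n - beta x_n)^2) / (2 sigma^2)
betas = np.linspace(beta_map - 1.0, beta_map + 1.0, 20001)
log_post = (-lam * betas**2 / 2
            - ((y[:, None] - betas[None, :] * x[:, None])**2).sum(axis=0)
              / (2 * sigma2))
beta_grid = betas[np.argmax(log_post)]

print(beta_map, beta_grid)  # the two should agree to the grid resolution
```

Because the log posterior is a concave quadratic in $\beta$, the grid maximiser should land on (or next to) the closed-form value, which confirms both the sign of the derivative and the $\sum x_n^2$ term in the denominator.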