Product of Exponential and Gaussian


I am trying to solve exercise 3.6 of Ma, Kording, and Goldreich's 2022 book *Bayesian Models of Perception and Action*.

Essentially, I need to find the posterior mean estimate given an exponential stimulus distribution $p(s) = \lambda e^{- \lambda s}$ for $s \geq 0$, with $\lambda > 0$, and a Gaussian measurement distribution $p(x|s)$ with mean $s$ and variance $\sigma^2$.

In the context of the book, a Bayesian observer infers $s$ from a measurement $x_{obs}$.

Thus, the posterior is given by Bayes' rule:

$$ \begin{equation} p(s|x_{obs}) = \frac{p(x_{obs}|s) p(s)}{p(x_{obs})} \end{equation} $$

The numerator involves the product of the Gaussian $p(x_{obs}|s)$ and the exponential $p(s)$, which is where I am stuck. Specifically, I cannot correctly get the posterior mean. According to a numerical solution, the product should be proportional to a Gaussian with mean $\mu_{post} = x_{obs} - \sigma^2 \lambda$. However, I keep getting $\mu_{post} = x_{obs} + \sigma^2 \lambda$, and I have been trying to hunt down the sign error for a few days, to no avail.

Here is how I got the result, following the technique introduced in the book:

Plugging the Gaussian and exponential into Bayes' rule above:

$$ p(s|x_{obs}) = \frac{\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x_{obs}-s)^2}{2\sigma^2}} \, \lambda e^{-\lambda s}}{p(x_{obs})} $$

Ignoring any normalization constants for now and just focusing on the sum of exponents (soe):

$$ \begin{align} \text{soe} &= -\frac{(x_{obs}-s)^2}{2\sigma^2} - \lambda s \\ &= -\frac{x_{obs}^2 - 2x_{obs}s + s^2}{2\sigma^2} - \lambda s \\ &= -\frac{x_{obs}^2}{2\sigma^2} + \frac{2x_{obs}s}{2\sigma^2} - \frac{s^2}{2\sigma^2} - \frac{2\sigma^2\lambda s}{2\sigma^2} \\ &= s^2 \Big(-\frac{1}{2\sigma^2}\Big) + s \Big(\frac{x_{obs}-\sigma^2\lambda}{\sigma^2}\Big) -\frac{x_{obs}^2}{2\sigma^2} \\ \text{Now, completing the square with} \\ a &= -\frac{1}{2\sigma^2} \\ b &= \frac{x_{obs}-\sigma^2\lambda}{\sigma^2} \\ c &= -\frac{x_{obs}^2}{2\sigma^2} \\ \text{and writing} \quad as^2 + bs + c \quad &\text{as} \quad a\Big(s + \frac{b}{2a}\Big)^2 + c - \frac{b^2}{4a} \\ \text{soe} &= -\frac{1}{2\sigma^2}\Bigg(s + \frac{\frac{x_{obs}-\sigma^2\lambda}{\sigma^2}}{-\frac{1}{\sigma^2}}\Bigg)^2 -\frac{x_{obs}^2}{2\sigma^2} + \frac{\sigma^2}{2}\Big(\frac{x_{obs}-\sigma^2\lambda}{\sigma^2}\Big)^2 \\ &= -\frac{1}{2\sigma^2}\Big(s - x_{obs}+\sigma^2\lambda \Big)^2 -\frac{x_{obs}^2}{2\sigma^2} + \frac{(x_{obs}-\sigma^2\lambda)^2}{2\sigma^2} \end{align} $$

Crucially, only the first term is of interest, since the rest does not depend on $s$ and will eventually be absorbed into the normalization factor. From my understanding, this term is the exponent of the Gaussian posterior $p(s|x_{obs})$ in the variable $s$, with posterior mean $\mu_{post} = x_{obs} + \sigma^2\lambda$. However, as mentioned above, this result does not hold up numerically; the numerical solution gives $\mu_{post} = x_{obs} - \sigma^2\lambda$.
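For concreteness, a minimal numerical check of the posterior mean can be sketched as follows. The parameter values ($x_{obs}=10$, $\sigma=1$, $\lambda=0.5$) are my own example choices, picked so that the truncation of the exponential prior at $s = 0$ is negligible:

```python
import numpy as np

# example values (assumed); chosen so the prior truncation at s = 0 is negligible
x_obs, sigma, lam = 10.0, 1.0, 0.5

s = np.linspace(0.0, 20.0, 200001)  # uniform grid over the support s >= 0
# unnormalized posterior: Gaussian likelihood times exponential prior
post = np.exp(-(x_obs - s) ** 2 / (2 * sigma ** 2)) * lam * np.exp(-lam * s)
post /= post.sum()                   # normalize (uniform grid, so plain sums suffice)
mu_post = np.sum(s * post)           # numerical posterior mean

print(mu_post)                       # ≈ x_obs - sigma**2 * lam = 9.5
```

For these values the grid mean comes out near $9.5 = x_{obs} - \sigma^2\lambda$, not $x_{obs} + \sigma^2\lambda = 10.5$.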

Best answer

It strikes me as much easier to complete the square only on the numerator (where I will write $x$ for $x_{\text{obs}}$ for simplicity):

$$p(s \mid x) \propto \exp\left(-\frac{s^2 - 2(x - \sigma^2 \lambda)s + x^2}{2\sigma^2}\right).$$ The numerator of the exponent may be written as $$\left(s - (x-\sigma^2 \lambda)\right)^2 + x^2 - (x - \sigma^2 \lambda)^2,$$ and at this point we can immediately conclude that the posterior mean must be $x - \sigma^2 \lambda$, as indicated by the text. The remaining terms are not functions of $s$, so they contribute only a constant multiplicative factor to the posterior distribution of $s$; after normalizing $p$ into a proper density, what is left is the kernel of a normal distribution.

In fact, your own computations show that the posterior mean is $x - \sigma^2 \lambda$: you arrive at $$\text{soe} = -\frac{1}{2\sigma^2}\Big(s \color{red}{- x_{obs}+\sigma^2\lambda} \Big)^2 -\frac{x_{obs}^2}{2\sigma^2} + \frac{(x_{obs}-\sigma^2\lambda)^2}{2\sigma^2}$$ and, as you noted, only the first term is relevant. So if we require $(s - x_{obs} + \sigma^2 \lambda)^2 = (s - \mu_{post})^2$, then clearly $\mu_{post} = x_{obs} - \sigma^2 \lambda$.
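The completed square can also be checked symbolically: subtracting the proposed square term from the sum of exponents should leave a remainder that is constant in $s$. A minimal sketch with SymPy (symbol names are my own):

```python
import sympy as sp

s, x, sigma, lam = sp.symbols('s x sigma lambda', positive=True)

# sum of exponents: Gaussian likelihood exponent plus exponential prior exponent
soe = -(x - s) ** 2 / (2 * sigma ** 2) - lam * s
# subtract the completed square with proposed mean x - sigma**2 * lam
remainder = sp.expand(soe + (s - (x - sigma ** 2 * lam)) ** 2 / (2 * sigma ** 2))

print(sp.diff(remainder, s))  # 0: the remainder does not depend on s
```

Since the remainder is constant in $s$, the $s$-dependence of the posterior sits entirely in the Gaussian kernel with mean $x - \sigma^2 \lambda$.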