Interpreting the product of two Gaussians


I came across an interesting claim that:

$\mathcal N (\mu_f, \sigma_f^2) \; \mathcal N (\mu_g, \sigma_g^2) = \mathcal N \left(\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}, \frac {\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}\right)$

In trying to understand it I consulted Bromiley:

http://www.tina-vision.net/docs/memos/2003-003.pdf

Bromiley concludes that:

if

$f(x) = \frac{1}{\sqrt{2\pi\sigma_f^2}} e^{-\frac{(x-\mu_f)^2}{2 \sigma_f^2}}$ and $g(x) = \frac{1}{\sqrt{2\pi\sigma_g^2}} e^{-\frac{(x-\mu_g)^2}{2 \sigma_g^2}}$

then:

$f(x)g(x) = S_{fg} \frac{1}{\sqrt{2\pi\sigma_{fg}^2}} e^{- \frac { (x - \mu_{fg})^2 } {2 \sigma_{fg}^2 } }$

where:

$\mu_{fg} = \frac { \sigma_g^2\mu_f + \sigma_f^2 \mu_g } {\sigma_f^2 + \sigma_g^2}$ and $\sigma_{fg}^2 = \frac {\sigma_f^2 \sigma_g^2} {\sigma_f^2 + \sigma_g^2}$

$S_{fg} = \frac {1} {\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}} e^{ -\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)} }$

Note that if $\mu_f$, $\mu_g$, $\sigma_f$ and $\sigma_g$ are known constants then $S_{fg}$ is a known constant too.
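
For reference, a sketch of where this scaling factor comes from, using only the definitions of $f(x)$, $g(x)$, $\mu_{fg}$ and $\sigma_{fg}^2$ above (this is my own working of the step, not a quotation from Bromiley). Completing the square in the combined exponent gives:

$\frac{(x-\mu_f)^2}{2\sigma_f^2} + \frac{(x-\mu_g)^2}{2\sigma_g^2} = \frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2} + \frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}$

and the two prefactors regroup as:

$\frac{1}{\sqrt{2\pi\sigma_f^2}} \cdot \frac{1}{\sqrt{2\pi\sigma_g^2}} = \frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}} \cdot \frac{1}{\sqrt{2\pi\sigma_{fg}^2}}$

The second term on the right of the first line does not depend on $x$, so together with the first factor on the right of the second line it forms the constant $S_{fg}$.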

To wit, if I cast Bromiley's result in the format of the claim I'm exploring:

$\mathcal N (\mu_f, \sigma_f^2) \; \mathcal N (\mu_g, \sigma_g^2) = S_{fg} \; \mathcal N \left(\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}, \frac {\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}\right)$

In short there is a constant scaling factor $S_{fg}$. In fact Bromiley describes the product as a scaled Gaussian.

Given that $f(x)$ and $g(x)$ are both functions of $x$, and that the right-hand side is a normalised density, the original claim, which reads (as a reminder):

$\mathcal N (\mu_f, \sigma_f^2) \; \mathcal N (\mu_g, \sigma_g^2) = \mathcal N \left(\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}, \frac {\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}\right)$

implies that:

$\int_{-\infty}^{\infty} f(x) g(x) \;dx = 1$

But Bromiley's result suggests this implication is false. I presume it integrates to $S_{fg}$, or:

$\int_{-\infty}^{\infty} f(x) g(x) \;dx = S_{fg}$
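
A quick numerical check of this presumption can be sketched in Python (standard library only). The parameter values and the helper names `normal_pdf`, `scale_factor` and `integrate` are illustrative assumptions of mine, not taken from Bromiley, and a numerical agreement is of course evidence rather than proof:

```python
import math

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def scale_factor(mu_f, var_f, mu_g, var_g):
    """S_fg as defined above for the product of two normal densities."""
    var_sum = var_f + var_g
    return math.exp(-(mu_f - mu_g) ** 2 / (2 * var_sum)) / math.sqrt(2 * math.pi * var_sum)

def integrate(func, lo, hi, steps=100_000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

# Illustrative parameters (assumed for this example only)
mu_f, var_f = 1.0, 4.0
mu_g, var_g = 3.0, 1.0

# Integrate f(x) g(x) over a range wide enough that the tails are negligible
area = integrate(
    lambda x: normal_pdf(x, mu_f, var_f) * normal_pdf(x, mu_g, var_g),
    -50.0,
    50.0,
)
s_fg = scale_factor(mu_f, var_f, mu_g, var_g)
print(area, s_fg)
```

For these parameters the two printed values should agree to many decimal places, and both are well below 1.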

My tentative conclusion is that the claim I am exploring is false, and my questions would be:

  1. Is my tentative conclusion true? (is the explored claim false?)
  2. Am I right in concluding the integral would be $S_{fg}$?

Those are the areas I'm a little shaky on at present, and I would welcome some review of them.


There are 2 best solutions below

On BEST ANSWER

Two observations help resolve this confusion:

  1. The $\mathcal N(\mu, \sigma^2)$ notation is a loose convention. I find no clear formal definition of it anywhere. A good reference is the Wikipedia mention here: https://en.wikipedia.org/wiki/Normal_distribution#Notation

  2. At least one other author has run into the same apparent confusion. In https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/04-One-Dimensional-Kalman-Filters.ipynb Roger Labbe writes:

$\begin{aligned} \mathcal N(\mu, \sigma^2) &= \| prior \cdot likelihood \|\\ &=\mathcal{N}(\bar\mu, \bar\sigma^2)\cdot \mathcal{N}(\mu_z, \sigma_z^2) \\ &= \mathcal N(\frac{\bar\sigma^2 \mu_z + \sigma_z^2 \bar\mu}{\bar\sigma^2 + \sigma_z^2},\frac{\bar\sigma^2\sigma_z^2}{\bar\sigma^2 + \sigma_z^2}) \end{aligned}$

The whole confusion goes away if we adopt a formal definition of $\mathcal N(x \mid \mu, \sigma^2)$ as:

$\mathcal N(x \mid \mu, \sigma^2) = \| e^{-\frac{(x-\mu)^2}{2 \sigma^2}} \|$

given:

$\|f(x)\| = \frac {f(x)} {c}$

where $c$ is a normalisation constant, such that:

$\int_{-\infty}^{\infty} \|f(x)\|\,dx = 1$

Then, in the case of a standard univariate Gaussian distribution with $f(x) = e^{-\frac{(x-\mu)^2}{2 \sigma^2}}$:

$c = \sqrt{2 \pi \sigma^2}$

In the case of the product of two Gaussian distributions $\mathcal N(x \mid \mu_f, \sigma_f^2) \cdot \mathcal N(x \mid \mu_g, \sigma_g^2)$, the normalisation constant is the scaling factor from the question:

$c = S_{fg} = \frac {1} {\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}} e^{ -\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)} }$ and, as before: $\mu_{fg} = \frac { \sigma_g^2\mu_f + \sigma_f^2 \mu_g } {\sigma_f^2 + \sigma_g^2}$ and $\sigma_{fg}^2 = \frac {\sigma_f^2 \sigma_g^2} {\sigma_f^2 + \sigma_g^2}$

and:

$\mathcal N(x \mid \mu_f, \sigma_f^2) \cdot \mathcal N(x \mid \mu_g, \sigma_g^2) = \mathcal N(x \mid \mu_{fg}, \sigma_{fg}^2)$

becomes a true and consistent claim (as we have embraced normalisation into the definition of the function $\mathcal N$).
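
As a rough illustration of this reading, the following Python sketch (standard library only) normalises the pointwise product of two normal densities numerically and compares it with the density of $\mathcal N(x \mid \mu_{fg}, \sigma_{fg}^2)$. The parameter values and the helper names `normal_pdf`, `product_params` and `normalised_product` are my own assumptions for the example, not part of any library or of Labbe's text:

```python
import math

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def product_params(mu_f, var_f, mu_g, var_g):
    """Mean and variance (mu_fg, var_fg) from the formulas above."""
    var_sum = var_f + var_g
    mu_fg = (var_g * mu_f + var_f * mu_g) / var_sum
    var_fg = var_f * var_g / var_sum
    return mu_fg, var_fg

def normalised_product(mu_f, var_f, mu_g, var_g, lo=-50.0, hi=50.0, steps=100_000):
    """Return x -> f(x) g(x) / c, with c estimated by the trapezoidal rule."""
    def raw(x):
        return normal_pdf(x, mu_f, var_f) * normal_pdf(x, mu_g, var_g)

    h = (hi - lo) / steps
    inner = sum(raw(lo + i * h) for i in range(1, steps))
    c = h * (0.5 * (raw(lo) + raw(hi)) + inner)
    return lambda x: raw(x) / c

# Illustrative parameters (assumed for this example only)
mu_f, var_f = 1.0, 4.0
mu_g, var_g = 3.0, 1.0

mu_fg, var_fg = product_params(mu_f, var_f, mu_g, var_g)
normalised = normalised_product(mu_f, var_f, mu_g, var_g)

# Largest pointwise gap between the normalised product and N(x | mu_fg, var_fg)
sample_points = [-2.0, 0.0, 1.0, 2.6, 3.0, 5.0]
max_gap = max(abs(normalised(x) - normal_pdf(x, mu_fg, var_fg)) for x in sample_points)
print(mu_fg, var_fg, max_gap)
```

With these values the gap should be negligible at every sample point, which is what the claim predicts once normalisation is treated as part of the notation.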

It would be prudent to note at the same time the difference between a frequency distribution (Gaussian) and a probability density function (PDF).

The Gaussian function (a frequency distribution) is:

$f(x) = a e^{- \frac{(x-b)^2 }{ 2 c^2} }$ or $f(x) = a e^{- \frac{(x-\mu)^2 }{ 2 \sigma^2} }$

See: https://en.wikipedia.org/wiki/Gaussian_function

And the Normal distribution (a probability density function, PDF) is:

$\mathcal N(x \mid \mu, \sigma^2) = \| e^{-\frac{(x-\mu)^2}{2 \sigma^2}} \|$

or

$\mathcal N(x \mid \mu, \sigma^2) = \frac {1} {c} e^{-\frac{(x-\mu)^2}{2 \sigma^2}}$

where $c$ is the normalizing constant (see: https://en.wikipedia.org/wiki/Normalizing_constant) that makes $\mathcal N$ a PDF by ensuring that its integral over its domain is 1, in other words that the total probability over all possible outcomes is 1 (certainty).

Thus by explicitly defining $\mathcal N$ the original claim is true and consistent, as is Roger Labbe's text - and this definition would seem acceptably in line with the current loose usage of the notation $\mathcal N(x \mid \mu, \sigma^2)$ encountered in the literature.

If a conflicting formal definition of $\mathcal N(x \mid \mu, \sigma^2)$ exists, speak up and identify it.


On

My (subjective) opinion:

Do write

$$ \mathcal{N}(x \mid \mu_1,\sigma_1^2)\cdot\mathcal{N}(x \mid \mu_2,\sigma_2^2) $$

Don't write

$$ \mathcal{N}(\mu_1,\sigma_1^2)\mathcal{N}(\mu_2, \sigma_2^2) $$


The notation $\mathcal{N}(x \mid \mu_1,\sigma_1^2)$ makes it clear that this is a function. This particular function can have a particular significance in a statistical or probability setting, but that need not be the case: it is simply a function, and the product of functions is a well-defined and familiar operation with little ambiguity.

On the other hand, the notation $\mathcal{N}(\mu, \sigma^2)$ is to be used as shorthand, making $X \sim \mathcal{N}(\mu, \sigma^2)$ equivalent to "the random variable $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$".

In this setting $\mathcal{N}(\mu, \sigma^2)$ is not a function; it isn't even a distribution function. It is a symbolic representation of a particular statement, a signpost that allows us to retrieve certain information should we wish, but it is not a mathematical object for which we have a well-defined product.