Is the joint PDF of two Normally Distributed variables a PDF?


Say I have two (somewhat related) random variables as follows:

$$S \sim \mathcal{N}(s \mid \mu, \sigma^2) \implies \mathcal{P}(s) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(s-\mu)^2}{2\sigma^2}}$$

$$\begin{align*} P \mid S = s \sim \mathcal{N}(p\mid s,\beta^2) &\implies \mathcal{P}(p \mid s) = \frac{1}{\sqrt{2\pi\beta^2}} e^{-\frac{(p-s)^2}{2\beta^2}}\\ &\implies \mathcal{P}(p) = \frac{1}{\sqrt{2\pi(\sigma^2+\beta^2)}} e^{-\frac{(p-\mu)^2}{2 (\sigma^2 +\beta^2)}} \end{align*}$$

and I want to see what $\mathcal P(s, p)$ looks like.

From the standard definition of a joint distribution:

https://en.wikipedia.org/wiki/Joint_probability_distribution#Continuous_case

I can surmise that:

$\mathcal P(s, p) = \mathcal P(p\mid s)\mathcal P(s)$

which is the product of two Gaussian forms. Wikipedia and other references further tell me that the result $\mathcal P(s, p)$ is a PDF, and hence:

$$\int_{-\infty}^\infty \int_{-\infty}^\infty \mathcal P(s,p) \, ds \, dp = 1$$

But if I multiply the two Gaussians together and simplify, I end up with a result that is not a PDF but a scaled version of one. Bromiley does essentially the same algebra and summarises it neatly, with the same conclusion, here:

http://www.tina-vision.net/docs/memos/2003-003.pdf

Bromiley concludes that: "the product of two Gaussian PDFs $f(x)$ and $g(x)$ is a scaled Gaussian PDF and the scaling factor $S$ is itself a Gaussian PDF on both $\mu_f$ and $\mu_g$ and with standard deviation $\sqrt{\sigma_f^2+\sigma_g^2}$."

The key observation is that the integral is not 1, so the product is not a PDF: a scaling factor emerges from the multiplication.

So who is right? Or, better said, given that I trust both are right, what error am I making in equating Bromiley's result with the definition of a joint PDF?

If it's any help I have my whole conundrum summarised here:

https://drive.google.com/open?id=1f2ZevrUoPWmQegmiXYBxigkXDqoCIwFH

with the complete derivations, bar the last step, where I defer to Bromiley as it got messy fast.


There are 2 best solutions below

BEST ANSWER

The source of confusion has now been identified and resolved as follows. It centered on Bromiley's conclusion that:

"the product of two Gaussian PDFs $f(x)$ and $g(x)$ is a scaled Gaussian PDF and the scaling factor $S$ is itself a Gaussian PDF on both $\mu_f$ and $\mu_g$ and with standard deviation $\sqrt{\sigma_f^2+\sigma_g^2}$."

and on my misinterpretation of it. Essentially, he presents the product in a new form, which is itself a product of two Gaussians cast in new terms. Each of the two integrates to 1 over the range $-\infty$ to $\infty$, with respect to its own variable.

The key observation is that the scaling factor is not a constant, but itself a function. In the specific TrueSkill case I was working on this translates to:

$\mathcal P(s,p) = \mathcal P(p \mid s)\mathcal P(s)= \gamma \mathcal N (s \mid \mu_{sp}, \sigma_{sp}^2)$

Where $\gamma$ is the scaling factor and $\mathcal N (s \mid \mu_{sp}, \sigma_{sp}^2)$ is the Gaussian PDF and:

$\mu_{sp} = \frac { \beta^2\mu + \sigma^2 p } {\sigma^2 + \beta^2}$

$\sigma_{sp}^2 = \frac {\sigma^2 \beta^2} {\sigma^2 + \beta^2}$

$\gamma = \frac {1} { \sqrt{2\pi(\sigma^2+\beta^2) }} e^{ -\frac{(\mu-p)^2}{2(\sigma^2+\beta^2)} }$
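For reference, here is a sketch of where these three expressions come from. It is the standard completing-the-square step in $s$, written in the notation above rather than copied from Bromiley's memo. The exponents of the two Gaussians combine as

$$\frac{(p-s)^2}{\beta^2} + \frac{(s-\mu)^2}{\sigma^2} = \frac{(s-\mu_{sp})^2}{\sigma_{sp}^2} + \frac{(p-\mu)^2}{\sigma^2+\beta^2}$$

and the normalising constants regroup as

$$\frac{1}{\sqrt{2\pi\beta^2}} \cdot \frac{1}{\sqrt{2\pi\sigma^2}} = \frac{1}{\sqrt{2\pi\sigma_{sp}^2}} \cdot \frac{1}{\sqrt{2\pi(\sigma^2+\beta^2)}}$$

because $\sigma_{sp}^2(\sigma^2+\beta^2) = \sigma^2\beta^2$. The first factor on each right-hand side builds $\mathcal N (s \mid \mu_{sp}, \sigma_{sp}^2)$; the second builds $\gamma$, which contains $p$ but not $s$.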

And so we see that the scaling factor $\gamma$ is not a constant but rather a function of $p$. Integrating it with respect to $p$ over the range $-\infty$ to $\infty$ also yields 1, as it has the general form of a Normal PDF in $p$ with mean $\mu$ and variance $\sigma^2+\beta^2$.
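The factorisation can be sanity-checked numerically. The following is a minimal sketch in plain Python; the parameter values and the helper names (`normal_pdf`, `joint`, `factorised`, `max_relative_gap`) are my own illustrative assumptions, not anything from the question or from Bromiley.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a Normal with the given mean and variance, evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Illustrative values only (assumed for this sketch).
MU, SIGMA2, BETA2 = 25.0, 8.0 ** 2, 4.0 ** 2

def joint(s, p):
    """P(s, p) = P(p | s) * P(s), the plain product of the two Gaussians."""
    return normal_pdf(p, s, BETA2) * normal_pdf(s, MU, SIGMA2)

def factorised(s, p):
    """gamma(p) * N(s | mu_sp, sigma_sp^2), using the three formulas above."""
    mu_sp = (BETA2 * MU + SIGMA2 * p) / (SIGMA2 + BETA2)
    var_sp = SIGMA2 * BETA2 / (SIGMA2 + BETA2)
    gamma = normal_pdf(p, MU, SIGMA2 + BETA2)
    return gamma * normal_pdf(s, mu_sp, var_sp)

def max_relative_gap(points):
    """Largest relative difference between the two forms over the given (s, p) pairs."""
    return max(abs(joint(s, p) - factorised(s, p)) / joint(s, p) for s, p in points)

points = [(20.0, 22.0), (25.0, 25.0), (30.0, 18.0), (10.0, 35.0)]
print(max_relative_gap(points))  # should be at floating-point rounding level
```

The two forms should agree to rounding error at every $(s, p)$ pair, which is just the statement that Bromiley's rewrite changes the form of the product and not its value.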

To wit, there is no conflict between the general definition of a joint PDF and Bromiley's result for the product of Gaussians; the confusion arose from the premise that the scaling factor was a constant.

Also important in understanding this, I found, is to note that a joint PDF of two random variables describes a surface (it is a function of two variables) and must be integrated twice to yield the volume beneath the surface, which by definition is 1. An ordinary PDF, by comparison, is a function of one variable (a curve), and we need to integrate it only once, with respect to that variable, to obtain the area beneath the curve, which must be 1 over the range of possible values for it to be a PDF.

When we integrate Bromiley's result twice to remove both unknowns, we also see a result of 1, and all is good. Integrating it only once, over $s$, leaves us with a function of $p$ (another PDF), not with a number. And the scaling factor is in fact a function (and to my mind poorly named as a scaling factor for that reason, but that is an aside).
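To make the "once versus twice" point concrete, here is a rough numerical sketch in plain Python using a midpoint rule on a finite grid. The parameter values, the integration limits and the function names are assumptions chosen for illustration; the limits are simply wide enough that the truncated tails are negligible for these values.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a Normal with the given mean and variance, evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Illustrative values only (assumed for this sketch).
MU, SIGMA2, BETA2 = 25.0, 64.0, 16.0
LO, HI, N = -75.0, 125.0, 400  # finite grid standing in for (-inf, inf)

def joint(s, p):
    """P(s, p) = P(p | s) * P(s)."""
    return normal_pdf(p, s, BETA2) * normal_pdf(s, MU, SIGMA2)

def midpoints(lo, hi, n):
    """Midpoints of n equal cells on [lo, hi], plus the cell width."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)], step

def integrate_over_s(p):
    """Integrate the joint once, over s only; the result still depends on p."""
    grid, step = midpoints(LO, HI, N)
    return step * sum(joint(s, p) for s in grid)

def integrate_over_both():
    """Integrate the joint over s and then over p; the result is a single number."""
    grid, step = midpoints(LO, HI, N)
    return step * sum(integrate_over_s(p) for p in grid)

once = integrate_over_s(30.0)
twice = integrate_over_both()
print(once, normal_pdf(30.0, MU, SIGMA2 + BETA2))  # the single integral matches gamma at p = 30
print(twice)                                       # the double integral is close to 1
```

The single integral comes out as the value of the "scaling factor" at the chosen $p$, which is far from 1, while the double integral comes out as 1 up to numerical error.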

ANOTHER ANSWER

Your error is in applying Bromiley's result to a situation where $\mathcal P(p\mid s)$ is a function of both $p$ and $s$. In other words, $p$ is not constant when we assert that $\mathcal P(s, p) = \mathcal P(p\mid s)\mathcal P(s)$ has a joint Gaussian distribution; instead the joint pdf $\mathcal P(s, p)$ is jointly Gaussian as a function of two variables.

If you regard $\mathcal P(p\mid s)$ as a function of $s$ only, with $p$ constant, then yes, the product of $\mathcal P(p\mid s)$ with $\mathcal P(s)$, when regarded as a function of $s$ alone, is not a density, as Bromiley shows. But nobody is claiming that $\mathcal P(p\mid s)\mathcal P(s)$ is a density when regarded as a function of $s$ only.

Bromiley remarks that the scaling constant $\gamma$ does involve both $\mu_f$ and $\mu_g$. This means that $p$ is hiding in the scaling constant, since your summarization document renames $p$ as $\mu_g$. So the fact that $\mathcal P(p\mid s)\mathcal P(s)$ is not a density when viewed as a function of $s$ alone does not contradict the fact that it is a density in two variables $s$ and $p$.

(To obtain the marginal density of $P$, you must integrate out $s$ via ${\mathcal P}(p) = \int_s \mathcal P(p\mid s)\mathcal P(s)\,ds$. The result of this integration will be a density. The marginal density of $S$ is ${\mathcal P}(s)$ as given.)
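A simulation gives an independent check on that marginal. The sketch below (plain Python, with parameter values, seed and function names assumed purely for illustration) draws $s$ from $\mathcal P(s)$, then $p$ from $\mathcal P(p\mid s)$, and looks at the sample mean and variance of $p$, which should land near $\mu$ and $\sigma^2+\beta^2$.

```python
import random

# Illustrative values only (assumed for this sketch); these are standard deviations.
MU, SIGMA, BETA = 25.0, 8.0, 4.0

def sample_p(rng):
    """Draw s from N(MU, SIGMA^2), then p from N(s, BETA^2); return p."""
    s = rng.gauss(MU, SIGMA)
    return rng.gauss(s, BETA)

def simulate(n=100_000, seed=0):
    """Return the sample mean and (population) variance of n draws of p."""
    rng = random.Random(seed)
    draws = [sample_p(rng) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((x - mean) ** 2 for x in draws) / n
    return mean, var

mean_p, var_p = simulate()
print(mean_p, var_p)  # compare with MU and SIGMA**2 + BETA**2
```

Matching the first two moments does not by itself prove the marginal is Gaussian, but it is consistent with the density obtained by integrating out $s$.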