In the context of Maximum Likelihood PCA, we have a normally distributed variable:
\begin{align} p(x) &= \mathcal{N}(x | \mu, \Lambda^{-1}) \end{align}
that we use to create another Normal variable:
\begin{align} p(y | x) &= \mathcal{N}(y | Ax + b, L^{-1}) \end{align}
We are interested in the marginal distribution
$$ p(y) = \int_x p(y| x) p (x) dx = \mathcal{N}(y | A\mu + b, L^{-1} + A\Lambda^{-1}A^T) $$
While there are some simple ways to demonstrate this result, I'm attempting an explicit derivation. I multiply both distributions and try to manipulate the terms in the exponent, completing the square and so on. After many attempts, I've given up.
Can someone give a derivation or give some reference where it is explicitly done?
(The above equations are taken from Bishop. Pattern Recognition and Machine Learning, page 689)
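(As a sanity check before deriving anything: the claimed marginal is easy to verify numerically by Monte Carlo. The parameters below are arbitrary toy values, chosen only so that $\Lambda$ and $L$ are valid precision matrices.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary toy parameters; Lam and L are symmetric positive definite precisions.
mu = np.array([1.0, -2.0])
Lam = np.array([[2.0, 0.5], [0.5, 1.5]])   # precision of x
A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([0.5, -1.0])
L = np.array([[3.0, 0.2], [0.2, 2.0]])     # precision of y given x

# Sample x ~ N(mu, Lam^{-1}), then y | x ~ N(Ax + b, L^{-1}).
n = 500_000
x = rng.multivariate_normal(mu, np.linalg.inv(Lam), size=n)
y = x @ A.T + b + rng.multivariate_normal(np.zeros(2), np.linalg.inv(L), size=n)

# Claimed marginal moments: mean A mu + b, covariance L^{-1} + A Lam^{-1} A^T.
mean_pred = A @ mu + b
cov_pred = np.linalg.inv(L) + A @ np.linalg.inv(Lam) @ A.T

print(np.allclose(y.mean(axis=0), mean_pred, atol=0.02))   # True
print(np.allclose(np.cov(y.T), cov_pred, atol=0.05))       # True
```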
$$p(y) = \int_x p(y| x) p (x) dx = \int_x \mathcal{N}(y | Ax + b, L^{-1})\,\mathcal{N}(x | \mu, \Lambda^{-1})\,dx$$ After expanding and factorizing the exponent (dropping factors that depend on neither $x$ nor $y$), you can show $$p(y) \propto \exp \Big( {-\frac{1}{2}} (y^TLy - 2y^TLb) \Big) \int_x \exp \Big( {- \frac{1}{2}} x^T \Gamma x + c^Tx \Big) dx$$ where $\Gamma = A^TLA + \Lambda$ and $c = A^TL(y-b) + \Lambda\mu$.
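This expansion can itself be double-checked numerically: the $x$-dependent part of the log of the integrand must equal $-\frac{1}{2} x^T \Gamma x + c^T x$ up to an additive constant, so differences of the exponent cancel that constant. A small sketch with arbitrary random parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_spd(d):
    # Random symmetric positive definite matrix (a valid precision).
    M = rng.normal(size=(d, d))
    return M @ M.T + d * np.eye(d)

d_x, d_y = 3, 2
Lam = rand_spd(d_x)              # precision of x
L = rand_spd(d_y)                # precision of y given x
A = rng.normal(size=(d_y, d_x))
b = rng.normal(size=d_y)
mu = rng.normal(size=d_x)
y = rng.normal(size=d_y)         # fixed, arbitrary point

Gamma = A.T @ L @ A + Lam
c = A.T @ L @ (y - b) + Lam @ mu

def exponent(x):
    # Sum of the two Gaussian exponents as a function of x.
    r = y - A @ x - b
    return -0.5 * r @ L @ r - 0.5 * (x - mu) @ Lam @ (x - mu)

x = rng.normal(size=d_x)
lhs = exponent(x) - exponent(np.zeros(d_x))       # constant in x cancels
rhs = -0.5 * x @ Gamma @ x + c @ x
print(np.isclose(lhs, rhs))                       # True
```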
Referring to the Matrix Cookbook [https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf] for the Gaussian integral, the above integral gives $$p(y) \propto \exp \big({-\frac{1}{2}} (y^TLy - 2y^TLb) \big)\cdot \exp \big(\frac{1}{2} c^T \Gamma^{-1} c \big)$$ (note that $\Gamma = A^TLA + \Lambda$ is symmetric, so $\Gamma^{-T} = \Gamma^{-1}$).
So \begin{equation} \begin{split} \log p(y) & \propto -\frac{1}{2} (y^TLy - 2y^TLb -c^T \Gamma^{-1} c) \end{split} \end{equation} Keeping terms that depend on $y$, you get \begin{equation} \begin{split} \log p(y) &\propto -\frac{1}{2}\Big[y^T \Big(L - LA\Gamma^{-1}A^TL \Big) y - 2y^T\Big(Lb - LA\Gamma^{-1}(A^TLb - \Lambda\mu)\Big) \Big] \\&= -\frac{1}{2}\Big[y^T \Big(L - LA\Gamma^{-1}A^TL \Big) y - 2y^T\Big( \big(L - LA\Gamma^{-1}A^TL \big)b + LA\Gamma^{-1}\Lambda\mu\Big)\Big] \end{split} \end{equation} Note that by the Matrix Inversion Lemma: $$(*)\quad L - LA\Gamma^{-1}A^TL = \Big( L^{-1} + A\Lambda^{-1}A^T \Big)^{-1} = \Omega^{-1}$$ Also note \begin{equation}\label{eq3.1.1} \begin{split} (**)\quad LA\Gamma^{-1}\Lambda\mu &= LA(A^TLA + \Lambda)^{-1}\Lambda\mu \\&= LA(\Lambda^{-1} - \Lambda^{-1}A^{T}\Omega^{-1}A\Lambda^{-1})\Lambda\mu \qquad \text{(by the Matrix Inversion Lemma)} \\&=(L - LA\Lambda^{-1}A^{T}\Omega^{-1})A\mu \\&=(L\Omega - LA\Lambda^{-1}A^{T})\Omega^{-1}A\mu \\&=\Big(L(L^{-1} + A\Lambda^{-1}A^T ) - LA\Lambda^{-1}A^{T}\Big)\Omega^{-1}A\mu \\&=\Big(I + LA\Lambda^{-1}A^T - LA\Lambda^{-1}A^{T} \Big)\Omega^{-1}A\mu \\&=\Omega^{-1}A\mu \end{split} \end{equation} Plugging $(*)$ and $(**)$ into $\log p(y)$, you get \begin{equation} \begin{split} \log p(y) &\propto -\frac{1}{2}\Big[y^T \Big(L - LA\Gamma^{-1}A^TL \Big) y - 2y^T\Big(Lb - LA\Gamma^{-1}(A^TLb - \Lambda\mu)\Big) \Big] \\&= -\frac{1}{2}\Big[y^T \Omega^{-1} y - 2y^T\Big( \Omega^{-1}b + \Omega^{-1}A\mu \Big)\Big] \\&= -\frac{1}{2}\Big[y^T \Omega^{-1} y - 2y^T\Omega^{-1}\Big( A\mu + b \Big)\Big] \\&\propto -\frac{1}{2}\Big( \big(y - (A\mu + b)\big)^T\Omega^{-1} \big(y - (A\mu + b)\big)\Big) \end{split} \end{equation} And you're done.
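(If you don't trust the two Matrix Inversion Lemma steps, the identities $(*)$ and $(**)$ are easy to check numerically on random well-conditioned precisions; the parameters below are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_spd(d):
    # Random symmetric positive definite matrix (a valid precision).
    M = rng.normal(size=(d, d))
    return M @ M.T + d * np.eye(d)

d_x, d_y = 3, 2
Lam = rand_spd(d_x)              # Lambda: precision of x
L = rand_spd(d_y)                # L: precision of y given x
A = rng.normal(size=(d_y, d_x))
mu = rng.normal(size=d_x)

Gamma = A.T @ L @ A + Lam
Omega = np.linalg.inv(L) + A @ np.linalg.inv(Lam) @ A.T
Omega_inv = np.linalg.inv(Omega)

# (*):  L - L A Gamma^{-1} A^T L  ==  Omega^{-1}
lhs_star = L - L @ A @ np.linalg.solve(Gamma, A.T @ L)
# (**): L A Gamma^{-1} Lambda mu  ==  Omega^{-1} A mu
lhs_2star = L @ A @ np.linalg.solve(Gamma, Lam @ mu)

print(np.allclose(lhs_star, Omega_inv))           # True
print(np.allclose(lhs_2star, Omega_inv @ A @ mu)) # True
```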