Marginalization of Gaussian canonical form

I'm struggling with deriving the marginalization of Gaussian canonical form.

Suppose a joint Gaussian over $x = [x_1 \ x_2]^T$ is given in the moment form \begin{align} p(x;\mu,\Sigma) = \frac{1}{\sqrt{(2\pi)^d|\Sigma|}} \exp \left( -\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu) \right), \end{align} where \begin{align} \mu=\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \end{align} \begin{align} \Sigma=\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}. \end{align} Its canonical form is \begin{align} p(x;\eta,\Lambda) = \frac{\sqrt{|\Lambda|}}{\sqrt{(2\pi)^d}} \exp \left( -\frac{1}{2}x^T \Lambda x + \eta^T x - \frac{1}{2} \eta^T \Lambda^{-1} \eta \right), \end{align} where \begin{align} \eta = \Sigma^{-1} \mu = \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}, \end{align} \begin{align} \Lambda = \Sigma^{-1} = \begin{bmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{bmatrix}. \end{align} We can use elementary block transformations to derive \begin{align} \Lambda = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}^{-1} = \begin{bmatrix} \Sigma_{11}^{-1}+\Sigma_{11}^{-1}\Sigma_{12} (\Sigma/\Sigma_{11})^{-1} \Sigma_{21}\Sigma_{11}^{-1} & -\Sigma_{11}^{-1}\Sigma_{12}(\Sigma/\Sigma_{11})^{-1} \\ -(\Sigma/\Sigma_{11})^{-1}\Sigma_{21}\Sigma_{11}^{-1} & (\Sigma/\Sigma_{11})^{-1} \end{bmatrix}, \end{align} where $(\Sigma/\Sigma_{11})=\Sigma_{22}-\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$ is the Schur complement of $\Sigma_{11}$ in $\Sigma$.

With the above notation, we can write the marginal and conditional in the moment form as follows. \begin{align} \mu_1^\text{Marg} &= \mu_1 \\ \Sigma_1^\text{Marg} &= \Sigma_{11} \end{align} \begin{align} \mu_{2|1}^\text{Cond} &= \mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1-\mu_1)\\ \Sigma_{2|1}^\text{Cond} &= (\Sigma/\Sigma_{11}) = \Sigma_{22}-\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}. \end{align}

I understand that with the moment form, this can be done by manipulating the quadratic polynomial in the exponential into $p(X_1)$ and $p(X_2 \mid X_1)$. But I'm stuck at deriving the marginal and conditional in the canonical form \begin{align} \Lambda_{2|1}^\text{Cond} &= \Lambda_{22} \\ \eta_{2|1}^\text{Cond} &= \eta_2 - \Lambda_{21} x_1 \end{align} \begin{align} \Lambda_{1}^\text{Marg} &= \Lambda_{11} - \Lambda_{12} \Lambda_{22}^{-1} \Lambda_{21} \\ \eta_{1}^\text{Marg} &= \eta_1 - \Lambda_{12} \Lambda_{22}^{-1} \eta_2. \end{align}
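As a quick sanity check of the four canonical-form formulas above, here is a minimal sketch for the special case where $x_1$ and $x_2$ are scalars, so every block is a single number. The function names and the numeric values are hypothetical, chosen only for illustration; the check converts the canonical results back to moments and compares them with the moment-form marginal and conditional.

```python
import math

def to_canonical_2d(mu, sigma):
    """Moment form (mu, Sigma) -> canonical form (eta, Lambda) for a 2-D Gaussian."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    # Closed-form inverse of a 2x2 matrix: Lambda = Sigma^{-1}
    lam = [[s22 / det, -s12 / det], [-s21 / det, s11 / det]]
    # eta = Lambda * mu
    eta = [lam[0][0] * mu[0] + lam[0][1] * mu[1],
           lam[1][0] * mu[0] + lam[1][1] * mu[1]]
    return eta, lam

def marginal_x1(eta, lam):
    """Canonical parameters of p(x1): Schur complement of Lambda_22 in Lambda."""
    lam_m = lam[0][0] - lam[0][1] * lam[1][0] / lam[1][1]
    eta_m = eta[0] - lam[0][1] * eta[1] / lam[1][1]
    return eta_m, lam_m

def conditional_x2_given_x1(eta, lam, x1):
    """Canonical parameters of p(x2 | x1): keep Lambda_22, shift eta_2."""
    return eta[1] - lam[1][0] * x1, lam[1][1]

# Hypothetical example values (assumptions, not from the question)
mu = [1.0, -2.0]
sigma = [[2.0, 0.6], [0.6, 1.0]]
x1 = 0.5

eta, lam = to_canonical_2d(mu, sigma)
eta_m, lam_m = marginal_x1(eta, lam)
eta_c, lam_c = conditional_x2_given_x1(eta, lam, x1)

# Back to moments: mean = eta / Lambda, variance = 1 / Lambda (scalar case)
marg_mean, marg_var = eta_m / lam_m, 1.0 / lam_m
cond_mean, cond_var = eta_c / lam_c, 1.0 / lam_c

# Moment-form reference values from the formulas in the question
ref_cond_mean = mu[1] + sigma[1][0] / sigma[0][0] * (x1 - mu[0])
ref_cond_var = sigma[1][1] - sigma[1][0] * sigma[0][1] / sigma[0][0]

print("marginal   :", marg_mean, marg_var, "vs", mu[0], sigma[0][0])
print("conditional:", cond_mean, cond_var, "vs", ref_cond_mean, ref_cond_var)
```

The two printed pairs should agree up to floating-point error. The same check carries over to vector-valued blocks if the scalar divisions are replaced by matrix inverses.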

I guess I should also begin with the joint distribution and do some manipulation to obtain $p(X_1)$ and $p(X_2 \mid X_1)$: \begin{align} p(x;\eta,\Lambda) = \frac{\sqrt{|\Lambda|}}{\sqrt{(2\pi)^d}} \exp \left( -\frac{1}{2} \begin{bmatrix}x_1 \\ x_2\end{bmatrix}^T \begin{bmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{bmatrix}\begin{bmatrix}x_1 \\ x_2\end{bmatrix} + \begin{bmatrix}\eta_1 \\ \eta_2\end{bmatrix}^T \begin{bmatrix}x_1 \\ x_2\end{bmatrix} - \frac{1}{2} \eta^T \Lambda^{-1} \eta \right) \end{align}

But I cannot split this thing into the correct marginal and conditional. Any help would be appreciated!

Reference: https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/other-readings/chapter13.pdf

There is 1 best solution below

Since the vector $X=[X_1,...,X_d]^T$ is normal, its elements are jointly normal, i.e., any linear combination of them is normal. Let $a=[1, 0, ...,0]^T$, then $X_1 = a^T X$ is normal with mean and variance as follows.

$\mu_{X_1}=E\{X_1\}=a^T E\{X\} = a^T \mu = \mu_1$, and

$\sigma_{X_1}^2=E\{(X_1-\mu_1)^2\} = a^TE\{(X-\mu)(X-\mu)^T\}a=a^T\Sigma a=\Sigma_{11}$.

Choosing $a\in \{0,1\}^d$ accordingly, you may derive all marginal distributions. As for the conditional distributions, applying $a$ results in the omission of the rows and columns of $\Sigma$ corresponding to the non-zero elements of $a$. Since the determinant of a product is the product of determinants, the normalizing factor also checks out.
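
For the canonical-form expressions asked about in the question, one possible route is to complete the square in $x_2$ directly in the exponent. What follows is only a sketch, assuming $\Lambda$ is symmetric (so $\Lambda_{12}=\Lambda_{21}^T$) and $\Lambda_{22}$ is invertible; the shorthand $h$ and the dimension $d_2$ of $x_2$ are introduced here for illustration. The $x$-dependent part of the joint exponent is \begin{align} -\frac{1}{2}x_1^T\Lambda_{11}x_1 + \eta_1^T x_1 - \frac{1}{2}x_2^T\Lambda_{22}x_2 + (\eta_2-\Lambda_{21}x_1)^T x_2. \end{align} For fixed $x_1$, the last two terms are a canonical form in $x_2$, which gives $\Lambda_{2|1}^\text{Cond}=\Lambda_{22}$ and $\eta_{2|1}^\text{Cond}=\eta_2-\Lambda_{21}x_1$. For the marginal, write $h=\eta_2-\Lambda_{21}x_1$ and integrate out $x_2$ with the Gaussian integral \begin{align} \int \exp\left(-\frac{1}{2}x_2^T\Lambda_{22}x_2 + h^T x_2\right) dx_2 = \sqrt{\frac{(2\pi)^{d_2}}{|\Lambda_{22}|}} \exp\left(\frac{1}{2}h^T\Lambda_{22}^{-1}h\right). \end{align} Expanding \begin{align} \frac{1}{2}h^T\Lambda_{22}^{-1}h = \frac{1}{2}x_1^T\Lambda_{12}\Lambda_{22}^{-1}\Lambda_{21}x_1 - (\Lambda_{12}\Lambda_{22}^{-1}\eta_2)^T x_1 + \frac{1}{2}\eta_2^T\Lambda_{22}^{-1}\eta_2 \end{align} and adding the remaining $x_1$ terms leaves \begin{align} -\frac{1}{2}x_1^T\left(\Lambda_{11}-\Lambda_{12}\Lambda_{22}^{-1}\Lambda_{21}\right)x_1 + \left(\eta_1-\Lambda_{12}\Lambda_{22}^{-1}\eta_2\right)^T x_1 + \text{const}, \end{align} from which $\Lambda_{1}^\text{Marg}=\Lambda_{11}-\Lambda_{12}\Lambda_{22}^{-1}\Lambda_{21}$ and $\eta_{1}^\text{Marg}=\eta_1-\Lambda_{12}\Lambda_{22}^{-1}\eta_2$ can be read off.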