Normal distribution conditional on an affine subspace


This is related to this problem. For the time being, let's set aside the unorthodox "probabilistic" statements and focus only on the canonical deterministic part. Suppose $w\sim N(\mu, \Sigma)$ is multivariate normal, let $P$ be a known matrix, not necessarily square or invertible, and let $q$ be a known vector such that $Pw=q$ has solutions. What is the conditional distribution $w\mid \{Pw=q\}$, or at least the conditional mean $\Bbb E(w\mid Pw=q)$?

In the original Black-Litterman paper (p. 35), the authors claim that the conditional distribution is again normal, that $$\Bbb E(w\mid Pw=q) = \mu + \Sigma P^T (P\Sigma P^T)^{-1} (q - P\mu),$$ and that this mean can be obtained by solving the following optimisation problem: $$ \begin{align} \min \quad & (x - \mu)^T\Sigma^{-1}(x-\mu)\\ \text{s.t.}\quad & Px=q \end{align} $$ Is their claim valid? And would you mind elaborating a bit on why it works? Thanks!

Best answer

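Let $W \sim \mathcal{N}(\mu, \Sigma)$ with $\Sigma$ positive definite, and write $W = \mu + \Sigma^{1/2}Z$, where $\Sigma^{1/2}$ is the unique symmetric positive-definite square root of $\Sigma$. Then $Z = \Sigma^{-1/2}(W - \mu) \sim \mathcal{N}(0, I)$. Assume also that $P$ has full row rank, so that the matrix $AA^{T} = P\Sigma P^{T}$ appearing below is invertible. Now define $A, B, Q$ as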

$$ A = P\Sigma^{1/2}, \qquad B = A^{T}(AA^{T})^{-1}, \qquad Q = BA = A^{T}(AA^{T})^{-1}A. $$

Notice that

  1. $Q$ is the orthogonal projection onto $\ker(A)^{\perp}$,
  2. $I-Q$ is the orthogonal projection onto $\ker(A)$.
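
Both claims can be checked directly from the definitions. The following is a sketch of that verification, assuming $AA^{T}$ is invertible:

\begin{align*} Q^{T} &= A^{T}(AA^{T})^{-1}A = Q, \\ Q^{2} &= A^{T}(AA^{T})^{-1}(AA^{T})(AA^{T})^{-1}A = Q, \\ AQ &= (AA^{T})(AA^{T})^{-1}A = A. \end{align*}

The first two identities say that $Q$ is symmetric and idempotent, i.e. an orthogonal projection. Its range lies in the range of $A^{T}$, which equals $\ker(A)^{\perp}$, and transposing the third identity gives $QA^{T} = A^{T}$, so the range is all of $\ker(A)^{\perp}$. The third identity also gives $A(I-Q) = 0$, so $I-Q$ maps into $\ker(A)$.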

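Now decompose $Z$ into the sum of $Z_{\perp} = QZ$ and $Z_{||} = (I-Q)Z$. These are jointly normal and uncorrelated, since $\operatorname{Cov}(Z_{\perp}, Z_{||}) = Q(I-Q)^{T} = Q - Q^2 = 0$, and hence independent. Because $Z_{||}$ lies in $\ker(A)$, we have $AZ = AZ_{\perp}$, so the conditioning equation $q = PW$ becomes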

$$ q = PW = P\mu + A Z = P\mu + A Z_{\perp}. $$

Multiplying both sides on the left by $B$ and using $Q^2 = Q$ (which holds because $Q$ is an orthogonal projection), we obtain

$$ B(q-P\mu) = BAZ_{\perp} = Q^2 Z = Q Z = Z_{\perp}$$

and hence the condition $PW = q$ determines the value of $Z_{\perp}$. So

\begin{align*} (W \mid PW=q) &\stackrel{d}{=} (\mu + \Sigma^{1/2}(Z_{\perp} + Z_{||}) \mid Z_{\perp} = B(q-P\mu)) \\ &\stackrel{d}{=} \mu + \Sigma^{1/2}B(q-P\mu) + \Sigma^{1/2}(I-Q)Z. \tag{*} \end{align*}

The last line $\text{(*)}$ has several implications:

  1. $\text{(*)}$ is an affine transformation of $Z \sim \mathcal{N}(0, I)$, hence it is again normal with

    $$ (W \mid PW=q) \sim \mathcal{N}( \mu + \Sigma^{1/2}B(q-P\mu), \Sigma^{1/2}(I-Q)\Sigma^{1/2}). $$

    Plugging in the definitions, the mean of the conditional distribution $(W \mid PW=q)$ simplifies to

    \begin{align*} \mathbb{E}[W \mid PW=q] &= \mu + \Sigma^{1/2}B(q-P\mu) \\ &= \mu + \Sigma P^{T} (P\Sigma P^{T})^{-1}(q-P\mu). \end{align*}

  2. If we write $S = \Sigma^{1/2}B = \Sigma P^{T} (P\Sigma P^{T})^{-1}$, then $\Sigma^{1/2}(I-Q)Z = (I - SP)(W - \mu)$, so $\text{(*)}$ simplifies to a formula that involves only known quantities (with the unconditioned $W$ on the right-hand side):

    $$ (W \mid PW=q) \stackrel{d}{=} Sq + (I - SP) W. $$
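
As a numerical sanity check, the formulas above can be sketched in NumPy. The dimensions and the randomly generated $\mu$, $\Sigma$, $P$, $q$ below are illustrative assumptions, not values from the question. The sketch assumes $\Sigma$ is positive definite and $P$ has full row rank. It also solves the optimisation problem from the question through its KKT system, which is one way (among others) to compare the minimiser with the conditional mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 2                                # hypothetical sizes: dim of W, number of constraints

# Illustrative inputs (not from the original post)
G = rng.standard_normal((n, n))
Sigma = G @ G.T + n * np.eye(n)            # symmetric positive definite
mu = rng.standard_normal(n)
P = rng.standard_normal((k, n))            # full row rank with probability 1
q = rng.standard_normal(k)

# S = Sigma P^T (P Sigma P^T)^{-1}, via a linear solve rather than an explicit inverse
M = P @ Sigma @ P.T
S = np.linalg.solve(M, P @ Sigma).T

mean = mu + S @ (q - P @ mu)               # E[W | PW = q]
cov = Sigma - S @ P @ Sigma                # Sigma^{1/2} (I - Q) Sigma^{1/2} written with Sigma and P

# Minimiser of (x - mu)^T Sigma^{-1} (x - mu) subject to Px = q, via the KKT system
Sinv = np.linalg.inv(Sigma)
K = np.block([[2 * Sinv, P.T], [P, np.zeros((k, k))]])
rhs = np.concatenate([2 * Sinv @ mu, q])
x_opt = np.linalg.solve(K, rhs)[:n]

# Representation Sq + (I - SP) W applied to unconditioned samples of W (rows are samples)
W = rng.multivariate_normal(mu, Sigma, size=1000)
W_cond = S @ q + W @ (np.eye(n) - S @ P).T

print(np.allclose(P @ mean, q))            # does the conditional mean satisfy the constraint?
print(np.allclose(P @ cov @ P.T, 0))       # is there no variance left along the rows of P?
print(np.allclose(x_opt, mean))            # does the minimiser coincide with the conditional mean?
print(np.allclose(W_cond @ P.T, q))        # does every transformed sample satisfy Px = q?
```

The covariance is computed as $\Sigma - SP\Sigma$, which is $\Sigma^{1/2}(I-Q)\Sigma^{1/2}$ with the definitions of $Q$ and $S$ substituted in.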