Proof of the affine property of normal distribution for a landscape matrix


The widely used affine property of multivariate normal distributions states:

Given a random vector $x \in \mathbb{R}^N$ with a multivariate normal distribution, $x \sim N_x(\mu_x, \Sigma_x)$, the random vector $y = Ax + b$ obtained by applying an affine transformation to $x$ is also normally distributed: $y \sim N_y(A\mu_x+b, A\Sigma_x A^T)$.

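The above property is easy to prove if $A$ is an invertible $N \times N$ matrix, by writing $x = A^{-1}(y-b)$ and substituting it into the density $N_x(\mu_x, \Sigma_x)$ as shown below (the Jacobian factor $|\det A^{-1}|$ is constant in $y$ and is absorbed by $\propto$):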

\begin{aligned} p_y(y) & \propto p_x(A^{-1}(y-b)) \\ & \propto \exp\{-0.5 \times (A^{-1}(y-b)-\mu_x)^T\Sigma_x^{-1}(A^{-1}(y-b)-\mu_x)\}\\ & = \exp\{-0.5 \times (y - (A\mu_x + b))^T A^{-T}\Sigma_x ^{-1}A^{-1}(y - (A\mu_x + b))\}\\ & = \exp\{-0.5 \times (y - (A\mu_x + b))^T (A \Sigma_x A^T)^{-1}(y - (A\mu_x + b))\}\\ & \implies y \sim N_y(A\mu_x+b, A \Sigma_x A^T) \end{aligned}
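
The key algebraic step in the derivation is the identity $A^{-T}\Sigma_x^{-1}A^{-1} = (A\Sigma_x A^T)^{-1}$, which holds only when $A$ is square and invertible. A minimal numerical sanity check with NumPy might look like the sketch below. The matrices are made-up test data, and the diagonal shifts are only there to keep them well conditioned; this is an illustration, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# Hypothetical test data (not from the question):
# a square A shifted by N*I so it is comfortably invertible,
# and a symmetric positive-definite covariance Sigma_x.
A = rng.standard_normal((N, N)) + N * np.eye(N)
B = rng.standard_normal((N, N))
Sigma_x = B @ B.T + N * np.eye(N)

A_inv = np.linalg.inv(A)
lhs = A_inv.T @ np.linalg.inv(Sigma_x) @ A_inv  # A^{-T} Sigma_x^{-1} A^{-1}
rhs = np.linalg.inv(A @ Sigma_x @ A.T)          # (A Sigma_x A^T)^{-1}

# True when the two sides agree up to floating-point tolerance.
identity_holds = np.allclose(lhs, rhs)
print(identity_holds)
```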

My questions are the following:

  1. Does the affine property hold even if $A$ is a landscape $M \times N$ matrix with $M < N$? (Most textbooks and lecture notes say so, and many papers assume it before deriving other results.)
  2. If the affine property does hold, how do you prove it? When $A$ is a landscape $M \times N$ matrix with $M < N$, $A^{-1}$ does not exist, so you cannot express the random vector $x$ as $x = A^{-1}(y-b)$.
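
For what it is worth, the claim in question 1 can at least be probed empirically. The sketch below draws samples of $x$, maps them through a landscape $A$, and compares the sample mean and covariance of $y$ with $A\mu_x + b$ and $A\Sigma_x A^T$. All dimensions, seeds, and parameter values are arbitrary assumptions chosen for illustration. Note that matching the first two moments says nothing about normality on its own, so the kurtosis of each marginal of $y$ is also compared with the Gaussian value of 3 as a weak extra check. None of this is a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 2          # M < N, so A is a "landscape" (wide) matrix
n_samples = 200_000

# Hypothetical parameters, chosen at random for illustration only.
mu_x = rng.standard_normal(N)
B = rng.standard_normal((N, N))
Sigma_x = B @ B.T + np.eye(N)      # symmetric positive definite
A = rng.standard_normal((M, N))
b = rng.standard_normal(M)

# Draw x ~ N(mu_x, Sigma_x) and apply y = A x + b row by row.
x = rng.multivariate_normal(mu_x, Sigma_x, size=n_samples)  # (n_samples, N)
y = x @ A.T + b                                             # (n_samples, M)

# Empirical moments of y versus the values the affine property predicts.
mean_emp = y.mean(axis=0)
cov_emp = np.cov(y, rowvar=False)
mean_pred = A @ mu_x + b
cov_pred = A @ Sigma_x @ A.T

mean_err = np.max(np.abs(mean_emp - mean_pred))
cov_rel_err = np.linalg.norm(cov_emp - cov_pred) / np.linalg.norm(cov_pred)

# Weak normality check: a Gaussian marginal has kurtosis 3.
z = (y - mean_emp) / y.std(axis=0)
kurtosis = (z ** 4).mean(axis=0)

print("max abs mean error:", mean_err)
print("relative covariance error:", cov_rel_err)
print("marginal kurtosis:", kurtosis)
```

With this many samples, the errors should be small and the kurtosis values close to 3, which is consistent with the property but does not replace a proof that avoids $A^{-1}$.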