I was studying factor analysis model using a lecture note by Prof. Andrew Ng (http://cs229.stanford.edu/notes/cs229-notes9.pdf).
It says
$z \sim N(0,I) \\ \epsilon \sim N(0, \psi) \\ x = \mu + \Lambda z + \epsilon$
where $z \in \mathbf{R}^k, \mu \in \mathbf{R}^n, \Lambda \in \mathbf{R}^{n \times k}, \psi \in \mathbf{R}^{n \times n}$.
$z$ is the factor vector drawn in the latent space, and $x$ is an affine transformation of $z$ plus Gaussian noise.
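To make the generative model concrete, here is a minimal simulation sketch in NumPy. The dimensions ($k=2$, $n=5$) and the diagonal form of $\psi$ are illustrative choices, not from the lecture note:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 2, 5  # illustrative latent and observed dimensions

mu = rng.normal(size=n)                  # mean vector, mu in R^n
Lam = rng.normal(size=(n, k))            # loading matrix, Lambda in R^{n x k}
Psi = np.diag(rng.uniform(0.1, 1.0, n))  # noise covariance Psi (diagonal here)

# Draw one sample: z ~ N(0, I), eps ~ N(0, Psi), x = mu + Lambda z + eps
z = rng.normal(size=k)
eps = rng.multivariate_normal(np.zeros(n), Psi)
x = mu + Lam @ z + eps

print(x.shape)  # (5,)
```

Each draw of $x$ lives in $\mathbf{R}^n$ even though the randomness in $z$ is only $k$-dimensional; the noise $\epsilon$ is what gives $x$ a full-rank covariance.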
The lecture note then says that their joint distribution is also a Gaussian. That is,
$\begin{pmatrix} z\\x\end{pmatrix} \sim N(\mu_{zx}, \Sigma)$
Can somebody tell me why the joint distribution of two normals is also Gaussian? Does this hold in general, or only in special cases?
It is not generally true that if two or more random variables are separately (or "marginally") normally distributed, then they are jointly normally distributed.
For example, suppose $X\sim N(0,1)$ and $$ Y = \begin{cases} \phantom{-}X & \text{if }|X|<1, \\ -X & \text{if }|X|\ge 1. \end{cases} $$ Then $Y\sim N(0,1)$ as well, but the distribution of the pair $(X,Y)$ is not a $2$-dimensional normal distribution. Notice that the pair is constrained to lie within a union of three line segments that are not all collinear. But a $2$-dimensional normal distribution is constrained to lie on a line only when its covariance matrix is singular, and otherwise is supported on the entire plane.
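The counterexample above is easy to check numerically. If $(X,Y)$ were jointly normal, any linear combination such as $X+Y$ would be normal too; here $X+Y$ equals $2X$ when $|X|<1$ and exactly $0$ when $|X|\ge 1$, so it has a point mass at zero with probability $P(|X|\ge 1)\approx 0.3173$, which no normal distribution can have. A quick Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=1_000_000)
# Flip the sign of X outside (-1, 1); Y is still marginally N(0, 1)
Y = np.where(np.abs(X) < 1, X, -X)

print(Y.mean(), Y.std())  # both close to 0 and 1: Y is marginally standard normal

# But X + Y is not normal: it has an atom at exactly zero,
# since X + (-X) == 0.0 exactly in floating point.
frac_zero = np.mean(X + Y == 0.0)
print(frac_zero)  # close to 0.3173 = P(|X| >= 1)
```

So both marginals are standard normal, yet the pair fails joint normality: normal marginals plus a Gaussian *conditional* (as in the factor analysis model, where $x \mid z$ is Gaussian) is what gives joint normality, not normal marginals alone.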