Probabilistic notation for the multivariate normal distributions and joint probability distribution functions

Suppose I have two Random Variables $X$ and $Y$:

  • $X \sim \mathcal{N}(\mu_1, \sigma_1^2)$
  • $Y \sim \mathcal{N}(\mu_2, \sigma_2^2)$

Provided that $X$ and $Y$ are independent, I have often seen these two Random Variables combined and written as:

$\begin{bmatrix}X \\ Y\end{bmatrix} \sim \mathcal{N}\left(\boldsymbol{\mu} = \begin{bmatrix}\mu_1 \\ \mu_2\end{bmatrix}, \boldsymbol{\Sigma} = \begin{bmatrix}\sigma_1^2 & 0 \\ 0 & \sigma_2^2\end{bmatrix}\right)$

I always thought that the above expression was referring to the Joint Probability Distribution of $X$ and $Y$, i.e. $P(XY)$. Since both of these Random Variables are independent, I should be able to multiply the Marginal Distributions of both of these variables together to obtain the Joint Probability Distribution:

$P(XY) = N(\mu_1, \sigma_1^2) \cdot N(\mu_2, \sigma_2^2) = \frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right) \cdot \frac{1}{\sqrt{2\pi\sigma_2^2}}\exp\left(-\frac{(y-\mu_2)^2}{2\sigma_2^2}\right)$

And I think that this can be condensed into:

$P(XY) = \begin{bmatrix}X \\ Y\end{bmatrix} \sim \mathcal{N}\left(\boldsymbol{\mu} = \begin{bmatrix}\mu_1 \\ \mu_2\end{bmatrix}, \boldsymbol{\Sigma} = \begin{bmatrix}\sigma_1^2 & 0 \\ 0 & \sigma_2^2\end{bmatrix}\right)$

My Question: Have I understood this correctly? When $X$ and $Y$ are independent, does the following relationship hold?

$$P(XY) = N(\mu_1, \sigma_1^2) \cdot N(\mu_2, \sigma_2^2) = \frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right) \cdot \frac{1}{\sqrt{2\pi\sigma_2^2}}\exp\left(-\frac{(y-\mu_2)^2}{2\sigma_2^2}\right) = \mathcal{N}\left(\boldsymbol{\mu} = \begin{bmatrix}\mu_1 \\ \mu_2\end{bmatrix}, \boldsymbol{\Sigma} = \begin{bmatrix}\sigma_1^2 & 0 \\ 0 & \sigma_2^2\end{bmatrix}\right)$$

When both of these Random Variables are independent, can I multiply the Marginal Distributions of both of these variables together to obtain the Joint Probability Distribution?

Thanks!

Your notation is slightly unclear, so let's clarify a few things here. I will use $N(\mu, \sigma^2)$ to represent the (univariate) Normal Distribution and $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ to represent the Multivariate Normal Distribution.

If $X \sim N(\mu_1, \sigma_1^2)$ is independent of $Y \sim N(\mu_2, \sigma_2^2)$, then we can define their joint distribution as follows:

Joint distribution of $\bf X$ and $\bf Y$: Let $Z=(X,Y)$; then $Z \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ where $\boldsymbol\mu = (\mu_1, \mu_2)^T$ and $\boldsymbol\Sigma = \begin{bmatrix}\sigma_1^2 & 0\\0 & \sigma_2^2\end{bmatrix}$. Here $\boldsymbol\mu$ is called the "mean vector" and $\boldsymbol\Sigma$ the "Covariance Matrix". The off-diagonal entries of $\boldsymbol\Sigma$ represent the covariance between $X$ and $Y$, which is $0$ by the independence assumption.

Now that we have clarified our notation, we can calculate the joint density for the above distribution. Here we make use of the following theorem:

Theorem: If $X,Y$ are absolutely continuous random variables, then they are independent if and only if their joint probability density function $f_Z = f_{(X,Y)}$ factorises as the product of their marginal densities $f_X$ and $f_Y$. In other words, the random variables are independent if and only if $f_Z = f_X \cdot f_Y$.

Therefore, you are correct that you can multiply the marginal densities of $X$ and $Y$ to obtain the joint density function. However, note the additional constraint of absolute continuity of both random variables. That holds in this case, but it is an important condition to check when dealing with random variables following other distributions.
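As a quick sanity check, the factorisation can be verified numerically with SciPy: the product of the two marginal densities agrees with the bivariate normal density built from the diagonal covariance matrix. The parameter values and the evaluation point below are arbitrary, chosen purely for illustration.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Arbitrary example parameters (assumed for illustration only)
mu1, sigma1 = 0.0, 1.0
mu2, sigma2 = 2.0, 0.5

x, y = 0.3, 1.7  # an arbitrary evaluation point

# Product of the marginal densities f_X(x) * f_Y(y)
product = norm.pdf(x, loc=mu1, scale=sigma1) * norm.pdf(y, loc=mu2, scale=sigma2)

# Joint density of Z = (X, Y) with a diagonal covariance matrix
mean = [mu1, mu2]
cov = [[sigma1**2, 0.0], [0.0, sigma2**2]]
joint = multivariate_normal.pdf([x, y], mean=mean, cov=cov)

print(product, joint)  # the two values coincide up to floating-point error
```

Repeating this at any other $(x, y)$ point gives the same agreement, since the identity $f_Z = f_X \cdot f_Y$ holds pointwise.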

In fact, you don't need to manually compute the joint density every time for the multivariate normal distribution. You can make use of the general form for the joint density of $k$ jointly normal random variables $X_1, \dots, X_k$:

$$ f_{(X_1, \dots, X_k)}(x) = \frac{1}{\sqrt{(2 \pi)^k \det (\boldsymbol\Sigma)}} \exp \Big( - \frac{1}{2}(x- \boldsymbol\mu )^T \boldsymbol{\Sigma}^{-1} (x- \boldsymbol\mu) \Big)$$

This should remind you of the probability density function for the univariate normal distribution: when $k = 1$, the formula reduces to exactly that.
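To make the general formula concrete, here is a minimal sketch that implements it directly with NumPy and checks it against SciPy's built-in `multivariate_normal.pdf`. The helper name `mvn_pdf` and the example numbers are my own choices, not anything standard.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_pdf(x, mu, Sigma):
    """Density of a k-dimensional normal, computed from the general formula."""
    k = len(mu)
    diff = x - mu
    norm_const = 1.0 / np.sqrt((2 * np.pi) ** k * np.linalg.det(Sigma))
    return norm_const * np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff)

# Arbitrary 2-dimensional example (k = 2)
mu = np.array([0.0, 2.0])
Sigma = np.array([[1.0, 0.0], [0.0, 0.25]])
x = np.array([0.3, 1.7])

manual = mvn_pdf(x, mu, Sigma)
reference = multivariate_normal.pdf(x, mean=mu, cov=Sigma)
print(manual, reference)  # equal up to floating-point error
```

With a diagonal $\boldsymbol\Sigma$ as here, this value also matches the product of the two univariate marginal densities, which is the factorisation discussed above.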

One additional point worth mentioning is your notation. Even though the general thinking behind what you have done is correct, your notation is generally considered bad practice.

$P(XY)$ is not commonly used to denote the joint density function of $X$ and $Y$, as it could easily be mistaken for the density function of the product $XY$. Equally, $P(A)$ usually denotes the probability $\mathbb{P}(A)$ of an event, so it could also lead to confusion where one may think this is a probability when what you are really trying to define is a density function.

Similarly, we do not usually write $N(\mu_1, \sigma_1^2) \cdot N(\mu_2, \sigma_2^2)$, multiplying distributions together like this. Instead, we define probability density functions $f_X$ and $f_Y$ (possibly using different notation, but this is a common way to denote these functions) and then multiply these together.