Write random variable vector in the form $\vec{X} = \mu +B\vec{Z}$


I have a random column vector $\vec{X}$ of three independent random variables $X_1, X_2, X_3$ with normal distribution (and known mean and variance). I need to write $\vec{X}$ in the form $\vec{X} = \mu +B\vec{Z}$.

How do I find $B$? I'm guessing it has something to do with the variance.

There are 2 answers below.

Answer 1:

Let $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ for $i=1,2,3$. You can generate these random variables from three i.i.d. standard normal random variables: letting $Z_i \sim \mathcal{N}(0,1)$ for $i=1,2,3$ be i.i.d., you have \begin{align*} \begin{bmatrix}X_1 \\ X_2 \\ X_3 \end{bmatrix} = \begin{bmatrix}\mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix} + \begin{bmatrix}\sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{bmatrix} \begin{bmatrix}Z_1 \\ Z_2 \\ Z_3 \end{bmatrix} \end{align*}

The high-level intuition is that the $\sigma_i$ factors stretch the standard normal distribution to the desired variance, and the $\mu_i$ offsets shift it to center it at the desired mean.
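A quick numerical check of this construction (a NumPy sketch; the values of $\mu_i$ and $\sigma_i$ below are hypothetical):

```python
import numpy as np

# Hypothetical means and standard deviations for X_1, X_2, X_3.
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([2.0, 0.5, 3.0])

rng = np.random.default_rng(0)
n = 200_000

# Draw i.i.d. standard normals Z and form X = mu + B Z with B = diag(sigma).
Z = rng.standard_normal((3, n))
B = np.diag(sigma)
X = mu[:, None] + B @ Z

# Sample means and standard deviations should match mu and sigma.
print(X.mean(axis=1))
print(X.std(axis=1))
```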

Answer 2:

Notation: matrices are bolded (like ${\bf B}$) and column vectors have an arrow (like $\vec{a}$).

You have stated the problem assuming $\vec{X} = (X_1,X_2,X_3)$ consists of independent normal random variables. I will first solve a more general problem: for $\vec{X}\sim N(\vec{\mu},{\bf\Sigma})$ with arbitrary mean $\vec{\mu}$ and covariance ${\bf\Sigma}$, show that $\vec{X} = \vec{a}+{\bf B}\vec{Z}$ where $\vec{Z}$ is a vector of IID standard normals, $\vec{a}$ is a constant vector, and ${\bf B}$ is a constant matrix. Then I will specialize to your case of independence. Note that $\vec{Z}$ being IID standard normals means that $\vec{Z}\sim N(\vec{0},{\bf I})$ where $\vec{0}$ is a column vector of zeros and ${\bf I}$ is the identity matrix.

${\bf \Sigma}$ is symmetric: By definition, ${\bf \Sigma}=Cov(\vec{X},\vec{X})$ and so the $(i,j)$ entry of ${\bf \Sigma}$ is $\Sigma_{ij}=Cov(X_i,X_j)=Cov(X_j,X_i)=\Sigma_{ji}$, showing symmetry.

${\bf \Sigma}$ is non-negative definite: By definition, we have $E[\vec{X}]=\vec{\mu}$. And, we have for any $\vec{m}$: \begin{eqnarray*} 0 \leq Var[\vec{m}^{\top}\vec{X}] &=& E\Big[\Big(\vec{m}^{\top}\vec{X}-\vec{m}^{\top}E[\vec{X}]\Big)^2\Big] \\ &=& E\Big[\Big(\vec{m}^{\top}\vec{X}-\vec{m}^{\top}E[\vec{X}]\Big)\Big(\vec{m}^{\top}\vec{X}-\vec{m}^{\top}E[\vec{X}]\Big)^{\top}\Big]\\ &=& E\Big[\Big(\vec{m}^{\top}\vec{X}-\vec{m}^{\top}E[\vec{X}]\Big)\Big(\vec{X}^{\top}\vec{m}-E[\vec{X}^{\top}]\vec{m}\Big)\Big]\\ &=& E\Big[\vec{m}^{\top}\vec{X}\vec{X}^{\top}\vec{m} - \vec{m}^{\top}\vec{X}E[\vec{X}^{\top}]\vec{m} - \vec{m}^{\top}E[\vec{X}]\vec{X}^{\top}\vec{m} + \vec{m}^{\top}E[\vec{X}]E[\vec{X}^{\top}]\vec{m}\Big]\\ &=& \vec{m}^{\top}E[\vec{X}\vec{X}^{\top}]\vec{m} - \vec{m}^{\top}E[\vec{X}]E[\vec{X}^{\top}]\vec{m} - \vec{m}^{\top}E[\vec{X}]E[\vec{X}^{\top}]\vec{m} + \vec{m}^{\top}E[\vec{X}]E[\vec{X}^{\top}]\vec{m}\\ &=& \vec{m}^{\top}E[\vec{X}\vec{X}^{\top}]\vec{m} - \vec{m}^{\top}E[\vec{X}]E[\vec{X}^{\top}]\vec{m}\\ &=& \vec{m}^{\top}\Big(E[\vec{X}\vec{X}^{\top}] - E[\vec{X}]E[\vec{X}^{\top}]\Big)\vec{m}\\ &=& \vec{m}^{\top}{\bf \Sigma}\vec{m} \end{eqnarray*} So, for all $\vec{m}$ we have $\vec{m}^{\top}{\bf \Sigma}\vec{m}\geq 0$, showing ${\bf\Sigma}$ is non-negative definite.
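These two properties can be illustrated numerically (an illustration, not a proof): estimate a sample covariance matrix from arbitrary data and check that it is symmetric and that $\vec{m}^{\top}{\bf\Sigma}\vec{m}\geq 0$ in many random directions $\vec{m}$. The data below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary (non-normal) random data: 3 variables, many samples.
data = rng.exponential(scale=[[1.0], [2.0], [0.5]], size=(3, 10_000))
Sigma_hat = np.cov(data)  # sample covariance matrix

# Symmetry of the covariance matrix:
print(np.allclose(Sigma_hat, Sigma_hat.T))

# m^T Sigma m >= 0 for many random directions m:
quad_forms = [m @ Sigma_hat @ m for m in rng.standard_normal((1000, 3))]
print(min(quad_forms) >= 0.0)
```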

Now, by a standard result in linear algebra, since ${\bf\Sigma}$ is non-negative definite and symmetric we can write ${\bf\Sigma}={\bf P}{\bf\Lambda}{\bf P}^{\top}$ where ${\bf P}^{\top}={\bf P}^{-1}$ and ${\bf\Lambda}=diag(\lambda_1,\dots,\lambda_n)$ is a diagonal matrix with the non-negative eigenvalues $\lambda_i$ of ${\bf\Sigma}$ on the diagonal and zeros elsewhere. Moreover, if we have the eigenvectors $\vec{p}_i$ satisfying ${\bf\Sigma}\vec{p}_i=\lambda_i\vec{p}_i$ then ${\bf P}$ can be constructed using the $\vec{p}_i$ as the columns of ${\bf P}$.

Next, define ${\bf \Lambda}^{\frac{1}{2}}= diag(\lambda_1^{\frac{1}{2}},\dots,\lambda_n^{\frac{1}{2}})$. Also, define ${\bf\Sigma}^{\frac{1}{2}}={\bf P}{\bf \Lambda}^{\frac{1}{2}}{\bf P}^{\top}$. Note that we have $\Big({\bf\Sigma}^{\frac{1}{2}}\Big)^{\top}={\bf\Sigma}^{\frac{1}{2}}$ and ${\bf\Sigma}^{\frac{1}{2}}{\bf\Sigma}^{\frac{1}{2}}={\bf P}{\bf \Lambda}^{\frac{1}{2}}{\bf P}^{\top}{\bf P}{\bf \Lambda}^{\frac{1}{2}}{\bf P}^{\top}={\bf P}{\bf \Lambda}^{\frac{1}{2}}{\bf \Lambda}^{\frac{1}{2}}{\bf P}^{\top}={\bf P}{\bf\Lambda}{\bf P}^{\top}={\bf\Sigma}$
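This eigendecomposition construction of ${\bf\Sigma}^{\frac{1}{2}}$ can be sketched in NumPy (the covariance matrix below is a hypothetical positive definite example):

```python
import numpy as np

# A hypothetical symmetric non-negative definite covariance matrix.
Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 0.5],
                  [0.0, 0.5, 2.0]])

# Eigendecomposition Sigma = P Lambda P^T (eigh is for symmetric matrices;
# it returns the eigenvalues and the eigenvectors as the columns of P).
eigvals, P = np.linalg.eigh(Sigma)

# Sigma^{1/2} = P Lambda^{1/2} P^T.
sqrt_Sigma = P @ np.diag(np.sqrt(eigvals)) @ P.T

# Verify: sqrt_Sigma is symmetric and squares back to Sigma.
print(np.allclose(sqrt_Sigma, sqrt_Sigma.T))        # True
print(np.allclose(sqrt_Sigma @ sqrt_Sigma, Sigma))  # True
```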

Now, if $\vec{X}=\vec{a}+{\bf B}\vec{Z}$ for some $\vec{a}$ and ${\bf B}$ then we have $$ \vec{\mu} = E[\vec{X}]=E[\vec{a}+{\bf B}\vec{Z}]=\vec{a}+{\bf B}E[\vec{Z}]=\vec{a}+{\bf B}\vec{0}=\vec{a} $$ and $$ {\bf\Sigma} = Cov(\vec{X},\vec{X}) = E\Big[\Big(\vec{X}-\vec{a}\Big)\Big(\vec{X}-\vec{a}\Big)^{\top}\Big] =E\Big[\Big({\bf B}\vec{Z}\Big)\Big({\bf B}\vec{Z}\Big)^{\top}\Big] =E\Big[{\bf B}\vec{Z}\vec{Z}^{\top}{\bf B}^{\top}\Big] ={\bf B}E\Big[\vec{Z}\vec{Z}^{\top}\Big]{\bf B}^{\top} ={\bf B}{\bf I}{\bf B}^{\top} ={\bf B}{\bf B}^{\top} $$ And, a solution to ${\bf\Sigma}={\bf B}{\bf B}^{\top}$ is to take ${\bf B}={\bf\Sigma}^{\frac{1}{2}}$. Thus $$ \vec{X} = \vec{\mu}+{\bf\Sigma}^{\frac{1}{2}}\vec{Z} $$ gives us the mean and covariance we are after. But, we need to verify that $\vec{Z}$ is $N(\vec{0},{\bf I})$. To do this, we look at $\vec{Z}={\bf\Sigma}^{-\frac{1}{2}}(\vec{X}-\vec{\mu})$ where ${\bf\Sigma}^{-\frac{1}{2}}$ is defined to be $({\bf\Sigma}^{\frac{1}{2}})^{-1} =({\bf P}{\bf\Lambda}^{\frac{1}{2}}{\bf P}^{\top})^{-1} ={\bf P}({\bf\Lambda}^{\frac{1}{2}})^{-1}{\bf P}^{\top}={\bf P}{\bf\Lambda}^{-\frac{1}{2}}{\bf P}^{\top}$ where ${\bf \Lambda}^{-\frac{1}{2}}=diag(\lambda_1^{-\frac{1}{2}},\dots,\lambda_n^{-\frac{1}{2}})$ (this requires all $\lambda_i>0$, i.e. ${\bf\Sigma}$ positive definite, so that these inverses exist). We know that the characteristic function of $\vec{X}$ is $\phi_X(\vec{t}) =\exp\left[i\vec{t}^{\top}\vec{\mu}-\frac{1}{2}\vec{t}^\top{\bf\Sigma}\vec{t}\right]$.
We now compute the characteristic function of $\vec{Z}$: \begin{eqnarray*} \phi_Z(\vec{t}) &=& E\left[\exp\left(i\vec{t}^{\top}\vec{Z}\right)\right]\\ &=& E\left[\exp\left(i\vec{t}^{\top}{\bf\Sigma}^{-\frac{1}{2}}(\vec{X}-\vec{\mu})\right)\right]\\ &=& E\left[\exp\left(i({\bf\Sigma}^{-\frac{1}{2}}\vec{t})^{\top}(\vec{X}-\vec{\mu})\right)\right]\\ &=& E\left[\exp\left(i({\bf\Sigma}^{-\frac{1}{2}}\vec{t})^{\top}\vec{X} \right)\right]\exp\left(-i ({\bf\Sigma}^{-\frac{1}{2}}\vec{t})^{\top}\vec{\mu}\right)\\ &=& \phi_X\left({\bf\Sigma}^{-\frac{1}{2}}\vec{t}\right) \exp\left(-i ({\bf\Sigma}^{-\frac{1}{2}}\vec{t})^{\top}\vec{\mu}\right)\\ &=& \exp\left(i({\bf\Sigma}^{-\frac{1}{2}}\vec{t})^{\top}\vec{\mu} -\frac{1}{2}({\bf\Sigma}^{-\frac{1}{2}}\vec{t})^{\top}{\bf\Sigma} ({\bf\Sigma}^{-\frac{1}{2}}\vec{t}) \right) \exp\left(-i({\bf\Sigma}^{-\frac{1}{2}}\vec{t})^{\top}\vec{\mu}\right)\\ &=& \exp\left(-\frac{1}{2}\vec{t}^{\top}{\bf I}\vec{t}\right) \end{eqnarray*} which is the characteristic function of a $N(\vec{0},{\bf I})$ random variable, hence $\vec{Z} \sim N(\vec{0},{\bf I})$ and we are done.

You have $\vec{\mu}$ so all you need to do is (i) find the eigenvalues and eigenvectors of ${\bf\Sigma}$ (ii) construct ${\bf P}$ using the eigenvectors as the columns (iii) construct ${\bf\Lambda}$ (iv) compute ${\bf\Lambda}^{\frac{1}{2}}$ and (v) compute ${\bf\Sigma}^{\frac{1}{2}}={\bf P}{\bf\Lambda}^{\frac{1}{2}}{\bf P}^{\top}$.
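Steps (i)-(v) can be sketched in NumPy; the mean vector and covariance matrix below are hypothetical, and we verify the result empirically by sampling:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical mean vector and covariance matrix.
mu = np.array([1.0, 0.0, -1.0])
Sigma = np.array([[2.0, 0.6, 0.0],
                  [0.6, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# (i)-(iii): eigenvalues (diagonal of Lambda) and eigenvectors (columns of P).
lam, P = np.linalg.eigh(Sigma)

# (iv)-(v): Sigma^{1/2} = P Lambda^{1/2} P^T.
B = P @ np.diag(np.sqrt(lam)) @ P.T

# Sample X = mu + B Z with Z ~ N(0, I) and check the empirical moments.
n = 300_000
Z = rng.standard_normal((3, n))
X = mu[:, None] + B @ Z

print(np.allclose(X.mean(axis=1), mu, atol=0.05))
print(np.allclose(np.cov(X), Sigma, atol=0.05))
```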

Special case:

Suppose $X_i\sim N(\mu_i,\sigma_i^2)$ for $i=1,2,3$, and suppose they are independent. Then $\vec{X}=(X_1,X_2,X_3)^{\top}$ has mean $\vec{\mu}=(\mu_1,\mu_2,\mu_3)^{\top}$. The covariance matrix has $(i,j)$ entries $Cov(X_i,X_j)=0$ when $i\neq j$ and $Cov(X_i,X_i)=\sigma_i^2$. So, the covariance matrix is already diagonal and we can just take ${\bf P}={\bf I}$ and ${\bf\Sigma}={\bf\Lambda}=diag(\sigma_1^2,\sigma_2^2,\sigma_3^2)$ so that ${\bf\Sigma}^{\frac{1}{2}}=diag(\sigma_1,\sigma_2,\sigma_3)$. Thus, $$ \vec{X} = \vec{\mu}+{\bf\Sigma}^{\frac{1}{2}}\vec{Z} $$ which can be written more explicitly as $$ \begin{bmatrix} X_1\\ X_2\\ X_3 \end{bmatrix} = \begin{bmatrix} \mu_1\\ \mu_2 \\ \mu_3 \end{bmatrix} + \begin{bmatrix} \sigma_1 & 0 & 0\\ 0 & \sigma_2 & 0\\ 0 & 0 & \sigma_3 \end{bmatrix} \begin{bmatrix} Z_1\\ Z_2\\ Z_3 \end{bmatrix} $$