Why are $X\sim\mathcal N(0,1)$ and $|X|$ uncorrelated?


I am taking a course called "Multivariate Statistics". A question was raised in class that I cannot solve. It is as follows:

There are two random variables $X$ and $Y$ with $Y=|X|$, where $X$ follows the standard normal distribution $\mathcal N(0,1)$.

Question: Why are $X$ and $Y$ uncorrelated?

My thoughts so far:

  1. To demonstrate this proposition, I should prove that the covariance between $X$ and $Y$ is 0, i.e. $\text{Cov}(X,Y)=0$.
  2. Because $\text{Cov}(X,Y)=E[(X-E[X])(Y-E[Y])]$ and $E[X]=0$, we get $\text{Cov}(X,Y)=E[XY-X\cdot E[Y]]=E[XY]-E[X\cdot E[Y]]=E[XY]-E[X]E[Y]=E[XY]$. So I should prove that $E[XY]=0$.
  3. Denote the probability density function of $XY$ by $f(x,y)$. Then $$E[XY]=\iint xyf(x,y)\,dx\,dy$$ But I cannot evaluate this integral. What should I do?

There are 3 best solutions below

1
On BEST ANSWER

I agree that the goal is to prove $\mathbb{E}[X\cdot|X|]=0$ and I also think that a reasonable way of doing so is to look at the corresponding integral.

However, you do not need the density function of $X\cdot |X|$; the "law of the unconscious statistician" is enough. Let $f$ be the density function of $\mathcal{N}(0,1)$. Then

\begin{align}
\mathbb{E}[X\cdot |X|]&=\int_{\mathbb{R}}x\cdot |x|f(x)~\mathrm{d}x\\
&=-\int_{-\infty}^0 x^2f(x)~\mathrm{d}x+\int_{0}^{\infty} x^2f(x)~\mathrm{d}x\\
&=-\int_{0}^\infty x^2f(x)~\mathrm{d}x+\int_{0}^{\infty} x^2f(x)~\mathrm{d}x\\
&=0,
\end{align}

since the function $x^2f(x)$ is even, i.e. symmetric about $0$. Symmetry of the density is therefore the only property of the Gaussian distribution we need here (apart from the integrals being finite).
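If you want to convince yourself numerically that only the symmetry matters, here is a minimal sketch using nothing but the Python standard library. The helper name `expect_x_absx`, the truncation interval $[-20,20]$ and the midpoint rule are my own choices for illustration, not part of the argument above. The Laplace density is included only as a second example of a symmetric density.

```python
import math

def normal_pdf(x):
    # Density of N(0, 1).
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def laplace_pdf(x):
    # Density of the standard Laplace distribution, also symmetric about 0.
    return 0.5 * math.exp(-abs(x))

def expect_x_absx(pdf, lo=-20.0, hi=20.0, n=100_000):
    # Midpoint-rule approximation of the integral of x * |x| * pdf(x)
    # over [lo, hi] (an assumed truncation of the real line).
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += x * abs(x) * pdf(x)
    return total * h

if __name__ == "__main__":
    print(expect_x_absx(normal_pdf))   # very close to 0
    print(expect_x_absx(laplace_pdf))  # very close to 0
```

Both values come out as zero up to floating-point error, because the contributions from $x<0$ and $x>0$ cancel exactly as in the derivation.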

4
On

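Let $R$ have the Rademacher distribution, i.e. $P(R=1)=P(R=-1)=\tfrac12$, and let $R$ and $X$ be independent.

Then $(R|X|,|X|)$ and $(X,|X|)$ have the same joint distribution, so $\mathbb E\left[X\cdot|X|\right]=\mathbb E\left[R|X|\cdot|X|\right]$.

Observe that, by independence, $\mathbb E\left[R|X|\cdot|X|\right]=\mathbb ER\cdot\mathbb EX^2=0\cdot1=0$. Together with $\mathbb EX=0$ this gives $\text{Cov}(X,|X|)=0$.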
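A quick Monte Carlo sanity check of this construction, as a rough sketch only: the function name `simulate`, the sample size and the seed are arbitrary choices of mine. It draws $X$ and $R$ independently and compares the sample means of $X\cdot|X|$ and $R|X|\cdot|X|$.

```python
import random

def simulate(n=200_000, seed=0):
    # Returns the sample means of X*|X| and (R*|X|)*|X|,
    # where X ~ N(0, 1) and R is an independent Rademacher sign.
    rng = random.Random(seed)
    sum_x_absx = 0.0
    sum_r_x2 = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        r = rng.choice((-1, 1))  # drawn independently of x
        sum_x_absx += x * abs(x)
        sum_r_x2 += (r * abs(x)) * abs(x)
    return sum_x_absx / n, sum_r_x2 / n

if __name__ == "__main__":
    mean_x_absx, mean_r_x2 = simulate()
    print(mean_x_absx, mean_r_x2)  # both should be close to 0
```

Both sample means should be small, on the order of the Monte Carlo error $\sqrt{3/n}$, since $\operatorname{Var}(X|X|)=\mathbb EX^4=3$.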

1
On

Intuitively: imagine taking a bunch of samples and plotting them on a scatter plot. For each sample $(x,y)$, either $x=y$ (when $x\ge 0$) or $x=-y$ (when $x<0$), and these happen with equal probability. Since $y=|x|\ge 0$, your graph looks like a big V shape. You get one line of positive correlation and one of negative correlation, and the two cancel each other out to give no correlation. Remember that measuring correlation in this way is about fitting your data to a line, and as these samples do not fit on a line in any reasonable way, they do not correlate.
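To see the cancellation in numbers rather than in a picture, here is a small sketch in plain Python. The helper names `pearson` and `sample_pairs`, the sample size and the seed are my own illustrative choices. It computes the sample correlation on the branch $x>0$, on the branch $x<0$, and on all samples together.

```python
import math
import random

def pearson(xs, ys):
    # Sample Pearson correlation coefficient of two equal-length lists.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def sample_pairs(n=200_000, seed=0):
    # Draw n samples (x, |x|) with x ~ N(0, 1).
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return xs, [abs(x) for x in xs]

if __name__ == "__main__":
    xs, ys = sample_pairs()
    pos = [(x, y) for x, y in zip(xs, ys) if x > 0]
    neg = [(x, y) for x, y in zip(xs, ys) if x < 0]
    print(pearson(*zip(*pos)))  # about +1 on the branch y = x
    print(pearson(*zip(*neg)))  # about -1 on the branch y = -x
    print(pearson(xs, ys))      # close to 0 overall
```

Each branch on its own is perfectly correlated, but with opposite signs, and the overall coefficient is close to zero.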