The intuition behind the chi-square distribution or the square of a random variable


Let's say $X\sim N(0,1)$.

I can't understand why squaring a normal random variable yields a chi-square variable with one degree of freedom.

The problem might be that I lack the intuition behind multiplying two dependent random variables ($X\cdot X$, i.e. $X$ times itself).

I tried to build that intuition by programming in Python, but I can't obtain the chi-squared pdf by squaring the $N(0,1)$ pdf:

import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

domain = np.arange(-10, 10, 0.001)
X = stats.norm.pdf(domain, 0, 1)  # pdf values, not samples

plt.plot(domain**2, X)      # pdf against squared domain
plt.plot(domain, X**2)      # squared pdf against domain
plt.plot(domain**2, X**2)   # both squared

plt.show()

I'm starting to think I don't understand the essence of a random variable. If anyone could help me, I'd appreciate it. Thanks in advance.

There are 4 answers below.


Let $P(x)$ be the probability density of a random variable $x$. This means that the probability that $x$ lies in the interval $(a,b)$ is equal to $\int_a^bP(x)dx$. Ok?

Now let $y(x)$ be a function of $x$, and we want to know the probability distribution of $y$. Let $Q(y)$ be this function. Let us assume for simplicity that $y(x)$ is injective: there is only one $x$ for each $y$.

So intuitively we expect that the chance of $y$ having the value $y_0$ should be the same as the chance of $x$ having the value $x_0$ such that $y_0=y(x_0)$. Yes?

More concretely, the important identity is $$\int_a^bP(x)dx=\int_{y(a)}^{y(b)}Q(y)dy$$ Now let $b\to a$ to get $$Q(y)\lim_{b\to a}\frac{y(b)-y(a)}{b-a}=Q(y)\frac{dy}{dx}=P(x).$$

This will tell you $Q(y)$ from $P(x)$.

If $x$ has a normal distribution, what is the distribution of $y=x^2$?
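The identity $Q(y)\,\frac{dy}{dx}=P(x)$ can be checked numerically. A small sketch, using the injective map $y=e^x$ as an illustrative assumption (since $y=x^2$ is not injective on all of $\mathbb{R}$, while $e^x$ is):

```python
import numpy as np
import scipy.stats as stats

# Change of variables Q(y) = P(x) / (dy/dx) for the injective map y = e^x.
y = np.linspace(0.5, 5.0, 100)
x = np.log(y)                              # invert y(x) = e^x
Q = stats.norm.pdf(x, 0, 1) / np.exp(x)    # P(x) divided by dy/dx = e^x

# scipy's lognorm(s=1) is the distribution of e^X for X ~ N(0, 1),
# so it should agree with the change-of-variables result:
print(np.allclose(Q, stats.lognorm.pdf(y, s=1)))   # True
```

The same recipe, applied separately to each injective branch of $y=x^2$, produces the chi-square density derived in the next answer.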


I can't understand why squaring a normal random variable yields a chi-square variable with one degree of freedom

The easiest way is to compute the density of $Y=X^2$ analytically:

$$y=x^2$$

$$x=\pm\sqrt{y}$$

$$F_Y(y)=F_X(\sqrt{y})-F_X(-\sqrt{y})$$

Differentiating,

$$f_Y(y)=f_X(\sqrt{y})\frac{1}{2\sqrt{y}}-f_X(-\sqrt{y})\left( -\frac{1}{2\sqrt{y} }\right)=\frac{1}{\sqrt{y}}f_X(\sqrt{y})$$

that is

$$f_Y(y)=\frac{1}{\sqrt{y}}\frac{1}{\sqrt{2}\cdot\sqrt{\pi}}e^{-y/2}$$

now we can rewrite $f_Y(y)$ in the following way

$$f_Y(y)=\frac{\left( \frac{1}{2} \right)^{1/2}}{\Gamma\left(\frac{1}{2} \right)}y^{1/2-1}e^{-y/2}$$

We immediately recognize a Gamma density; that is,

$$Y\sim \text{Gamma}\left( \frac{1}{2};\frac{1}{2} \right)=\chi_{(1)}^2$$
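This also resolves the Python confusion above: one squares the *samples* of $X$, not the pdf. A simulation sketch comparing the empirical density of $X^2$ with scipy's chi-square(1) pdf:

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)
samples = rng.standard_normal(200_000) ** 2   # square the samples, not the pdf

# Empirical density of X^2 on (0.2, 5), normalized by the *total* sample count
hist, edges = np.histogram(samples, bins=40, range=(0.2, 5.0))
centers = (edges[:-1] + edges[1:]) / 2
width = edges[1] - edges[0]
empirical = hist / (len(samples) * width)

# The histogram tracks the chi-square(1) pdf closely
max_err = np.max(np.abs(empirical - stats.chi2.pdf(centers, df=1)))
print(max_err < 0.08)   # True
```

The range starts away from $0$ because the $\chi^2_{(1)}$ density diverges as $y\to 0^+$, which a finite-bin histogram cannot resolve.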


In general, for any random variable $X$, you don't get the pdf of $X^2$ by squaring the pdf of $X.$

Even more generally, if $X$ is a random variable with pdf $f_X(x)$ and $g$ is a function, you don't get the pdf of $g(X)$ by working out $g(f_X(x)).$

Example: $g(x) = x+2$ and $X$ is uniform on the interval $[0,1]$, that is, $$ f_X(x) = \begin{cases} 1 & 0 \leq x \leq 1, \\ 0 & \text{otherwise},\end{cases} $$

then whatever number in the interval $[0,1]$ turns out to be the value of $X$, the value of $g(X)$ will be that number plus $2,$ which is now a number between $2$ and $3.$ That is, $g(X)$ has uniform distribution on $[2,3]$ and its pdf $f_{g(X)}$ is

$$ f_{g(X)}(x) = \begin{cases} 1 & 2 \leq x \leq 3, \\ 0 & \text{otherwise}.\end{cases} $$

If instead we just assumed the pdf of $g(X)$ were $f_{g(X)}(x) = g(f_X(x)),$ we would get

$$ f_{g(X)}(x) = f_X(x) + 2 = \begin{cases} 3 & 0 \leq x \leq 1, \\ 2 & \text{otherwise},\end{cases} $$

which isn't even a probability distribution.
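The distinction is easy to see in a simulation sketch: $g$ is applied to the samples of $X$, never to the pdf. (The seed and sample size are arbitrary choices for illustration.)

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 100_000)
y = x + 2                     # apply g to the samples, not to the pdf

# Every value of g(X) lies in [2, 3] ...
print(y.min() >= 2.0 and y.max() <= 3.0)    # True

# ... and is uniform there: the estimated density is ~1 on each of 10 bins
hist, _ = np.histogram(y, bins=10, range=(2.0, 3.0), density=True)
print(np.allclose(hist, 1.0, atol=0.05))    # True
```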

The correct way to think about computing the pdf of $g(X)$ is in the earlier answers.


$$\mathbb P(X^2<y)=\mathbb P(|X|<\sqrt y)=2\int_0^{\sqrt y}N(t;0,1)\,dt$$ which implies

$$\dfrac d{dy}\mathbb P(X^2<y)=\frac1{\sqrt y}N(\sqrt y;0,1)=\frac{e^{-y/2}}{\sqrt{2\pi y}}.$$
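A quick numerical sketch confirming both steps: the closed form equals scipy's chi-square(1) pdf, and it matches a central-difference derivative of $\mathbb{P}(X^2<y)=2\Phi(\sqrt y)-1$:

```python
import numpy as np
import scipy.stats as stats

y = np.linspace(0.5, 4.0, 50)

# The closed form derived above is exactly the chi-square(1) density
closed_form = np.exp(-y / 2) / np.sqrt(2 * np.pi * y)
print(np.allclose(closed_form, stats.chi2.pdf(y, df=1)))   # True

# It also matches a numerical derivative of the CDF 2*Phi(sqrt(y)) - 1
h = 1e-6
cdf = lambda t: 2 * stats.norm.cdf(np.sqrt(t)) - 1
numeric = (cdf(y + h) - cdf(y - h)) / (2 * h)
print(np.allclose(closed_form, numeric, atol=1e-5))        # True
```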