Why does $\mathbb{E}[X^2]=\int_{-\infty}^{\infty} x^2 f(x) d x$ follow from the definition of $\mathbb{E}[X]$ for continuous random variables?


I have learnt that the expectation of a continuous random variable, $X$, is $\mathbb{E}[X]=\int_{-\infty}^{\infty} x f(x) \, d x$ where $f(x)$ is the p.d.f. of $X$. Why does it follow from this definition that $\mathbb{E}[X^2]=\int_{-\infty}^{\infty} x^2 f(x) \, d x$ where $f(x)$ is the p.d.f. of $X^2$? I do not understand how we choose $x^2$ to be a factor in the integrand, or how such a choice is made in general.

For context: after defining the expectation of a continuous random variable, my textbook defines the variance, which uses this fact:

Definition 21.3 (Variance). The variance of a continuous r.v. $X$ with probability density function $f$ is $$ \operatorname{Var}(X)=\mathbb{E}\left[(X-\mathbb{E}[X])^2\right]=\mathbb{E}\left[X^2\right]-\mathbb{E}[X]^2=\int_{-\infty}^{\infty} x^2 f(x) d x-\left(\int_{-\infty}^{\infty} x f(x) d x\right)^2 $$
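
As a quick sanity check of the formula in Definition 21.3, here is a small numerical sketch (my own illustration, not from the textbook) for $X \sim \mathrm{Uniform}(0,1)$, whose variance is exactly $1/12$. Both integrals are approximated by plain Riemann sums:

```python
# Numerically verify Var(X) = E[X^2] - E[X]^2 for X ~ Uniform(0, 1),
# whose p.d.f. is f(x) = 1 on [0, 1]. Exact answer: 1/3 - 1/4 = 1/12.

def riemann(h, a, b, n=100_000):
    """Left-endpoint Riemann sum of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + k * dx) for k in range(n)) * dx

f = lambda x: 1.0                                  # p.d.f. of Uniform(0, 1)
e_x  = riemann(lambda x: x * f(x), 0.0, 1.0)       # E[X]   ~ 1/2
e_x2 = riemann(lambda x: x * x * f(x), 0.0, 1.0)   # E[X^2] ~ 1/3
var = e_x2 - e_x ** 2

print(var)  # ~ 1/12 ~ 0.0833
```

The same two-integral recipe works for any density $f$; only the integration limits and the lambda for $f$ change.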

Best answer:

LOTUS (the law of the unconscious statistician) tells us that, for any function $g$ such that $g(X)$ has a well-defined expectation, we can write

$$ \mathbf{E}[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, \mathrm{d}x $$

where $f(x)$ is the density of $X$. (Note that $f$ here is not the density of $g(X)$, but that of $X$.) Mathematically, this is more or less a change-of-variables formula, but one established in a more abstract setting than in calculus.
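
One concrete way to see LOTUS in action (a small illustrative sketch, with the distribution and $g$ chosen by me): sample $X$, apply $g$, and average. The result matches $\int_{-\infty}^{\infty} g(x) f(x)\,\mathrm{d}x$, computed directly from the density of $X$ without ever finding the density of $g(X)$:

```python
import random

# LOTUS sanity check for X ~ Uniform(0, 1) and g(x) = x^2:
# E[g(X)] should equal the integral of g(x) f(x) dx = 1/3,
# computed from the density of X, never from that of X^2.
random.seed(0)

N = 200_000
samples = [random.random() for _ in range(N)]      # draws of X
mc_mean = sum(x * x for x in samples) / N          # Monte Carlo E[X^2]

n = 100_000                                        # Riemann sum of x^2 * f(x) over [0, 1]
integral = sum((k / n) ** 2 for k in range(n)) / n

print(mc_mean, integral)   # both ~ 1/3
```

The Monte Carlo average converges to the integral as $N$ grows, which is exactly the equality LOTUS asserts.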

Although a fully rigorous and general proof of this requires some abstract measure theory, we can argue this at least heuristically as follows:

  1. If $g(x)$ is an indicator function of the form $\mathbf{1}_{B}(x) = \begin{cases} 1, & x \in B \\ 0, & x \notin B \end{cases}$, then $\mathbf{1}_{B}(X)$ is a Bernoulli random variable with success probability $p = \mathbf{P}(X \in B)$, hence

    $$ \mathbf{E}[\mathbf{1}_{B}(X)] = \mathbf{P}(X \in B) = \int_B f(x) \, \mathrm{d}x = \int_{-\infty}^{\infty} \mathbf{1}_{B}(x) f(x) \, \mathrm{d}x. $$

  2. Now if $g$ is "nice", then it can be approximated by a sum of indicator functions: for large $n$, $g$ is nearly constant on each interval $(\frac{k}{n}, \frac{k+1}{n}]$, with value approximately $g(\frac{k}{n})$, hence

    $$ g(x) \approx \sum_{k} g(\tfrac{k}{n}) \mathbf{1}_{(\frac{k}{n}, \frac{k+1}{n}]}(x). $$

    From this, we obtain the following approximate relation

    \begin{align*} \mathbf{E}[g(X)] &\approx \mathbf{E}\biggl[ \sum_{k} g(\tfrac{k}{n}) \mathbf{1}_{(\frac{k}{n}, \frac{k+1}{n}]}(X) \biggr] \\ &= \sum_{k} g(\tfrac{k}{n}) \mathbf{E}[\mathbf{1}_{(\frac{k}{n}, \frac{k+1}{n}]}(X) ] \\ &= \sum_{k} g(\tfrac{k}{n}) \int_{-\infty}^{\infty} \mathbf{1}_{(\frac{k}{n}, \frac{k+1}{n}]}(x) f(x) \, \mathrm{d}x \\ &= \int_{-\infty}^{\infty} \biggl( \sum_{k} g(\tfrac{k}{n}) \mathbf{1}_{(\frac{k}{n}, \frac{k+1}{n}]}(x) \biggr) f(x) \, \mathrm{d}x \\ &\approx \int_{-\infty}^{\infty} g(x) f(x) \, \mathrm{d}x. \end{align*}

    As $n \to \infty$ we expect this relation to become an exact equality.
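
The indicator-sum heuristic above can be carried out numerically. Here is a sketch (parameters of my choosing) for $X \sim \mathrm{Exp}(1)$ and $g(x) = x^2$, where $\mathbf{E}[X^2] = 2$ exactly and $\mathbf{P}(X \in (a, b]) = e^{-a} - e^{-b}$:

```python
import math

# Approximate E[g(X)] by the step-function sum
#     sum_k g(k/n) * P(X in (k/n, (k+1)/n])
# for X ~ Exp(1), whose support is [0, inf) and where
# P(X in (a, b]) = e^{-a} - e^{-b}.  With g(x) = x^2, E[g(X)] = 2.

def lotus_step_approx(g, n=1000, cutoff=50.0):
    """Sum g(k/n) * P(X in (k/n, (k+1)/n]) over a grid truncated at `cutoff`."""
    total = 0.0
    k = 0
    while k / n <= cutoff:
        a, b = k / n, (k + 1) / n
        total += g(a) * (math.exp(-a) - math.exp(-b))
        k += 1
    return total

approx = lotus_step_approx(lambda x: x * x)
print(approx)   # ~ 2, with error on the order of 1/n
```

Increasing `n` shrinks the error, mirroring the claim that the approximate relation becomes an exact equality as $n \to \infty$; the `cutoff` is only needed because a computer cannot sum over infinitely many intervals, and the exponential tail beyond it is negligible.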