It's been over 25 years since my last course in probability, so this may be obvious or elementary. However, I've unexpectedly had to think deeply about this material for a research project, and I cannot resolve the following issue.
For the sake of simplicity/common ground, let's assume Riemann integrals always suffice, there are no convergence issues, and my random variables take values over all of $\mathbb{R}$. All integrals below are $\int_\mathbb{R} \ dx $ whether or not the domain is explicitly written.
Given a r.v. $X$ with density function $f(x)$, we define $E(X) = \int x f(x) \ dx$. Linearity is then proven, $E(aX+b)=aE(X)+b$, and this is completely sensible. In deriving the alternative formula for the variance, we bump into $E(X^2)$. In that derivation, we define $$ E(X^2) = \int x^2 f(x) \ dx $$ and proceed. This is fine, but here's my issue.
$X^2$ is itself a random variable, so we could ask for its expected value in its own right (in reference to "itself," not $X$). To be clearer, we could set $Y=X^2$ and ask for $E(Y)$. This requires knowing the density function of $Y$, which to me is not clear at all and is non-trivial to get your hands on. So, setting $Y=X^2$ with density $h(y)$, why is it true that
$$
\int y h(y) \ dy = \int x^2 f(x) \ dx
$$
so that $E(Y) = E(X^2)$? Why do these two very different interpretations agree? I do not think this is as simple as a $u$-substitution.
For example, I can see that this works out fine if $X$ is standard normal. There, $E(X)=0$ and $Y=X^2$ is $\chi^2$ distributed with $1$ degree of freedom. Since $\sigma_X^2=1$ it is clear that $E(X^2)=1$. Chasing the calculations I also see that $E(Y)=E(\chi_1^2)=1$, so they are in fact in agreement. I can even follow the derivations of the $\chi_1^2$ distribution in terms of the $\Gamma$-function and see the connection to the standard normal, but I see no reason for this to play out as nicely no matter the density of $X$. It also seems to get worse when considering $E(X^\alpha)$ in general.
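For what it's worth, here is a quick Monte Carlo sanity check of the standard normal case (my own illustration, using NumPy's samplers for the normal and $\chi^2_1$ distributions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# E(X^2) estimated from samples of a standard normal X
x = rng.standard_normal(n)
e_x2 = (x**2).mean()

# E(Y) estimated from *independent* samples of Y ~ chi^2 with 1 d.o.f.,
# which is the distribution of X^2 when X is standard normal
y = rng.chisquare(df=1, size=n)
e_y = y.mean()

print(e_x2, e_y)  # both should be close to 1
```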
Are these two viewpoints in potential disagreement, or is there a piece of theory that says that there is no ambiguity?
Pretty sure this can be reconciled with a bit of measure theory (the general result you want is sometimes called the law of the unconscious statistician), but here's a concrete verification that the two integrals agree in this particular case.
Let $F$ be the cdf for $X$, so that the probability that $a \le X \le b$ is $F(b)-F(a)$. Then $f=F'$. Now can we use $F$ to work out the cdf for $Y$?
Yes we can! The cdf of $Y$, say $H$, is $H(y) = P(Y\le y)=P(X^2\le y)$. Thus if $y\le 0$, $H(y)=0$. However, if $y > 0$, then $P(X^2 \le y) = P(-\sqrt{y}\le X \le \sqrt{y})=F(\sqrt{y})-F(-\sqrt{y})$.
Taking the derivative, we see that $$h(y) = \begin{cases} 0 & y \le 0 \\ \frac{f(\sqrt{y})+f(-\sqrt{y})}{2\sqrt{y}} & y > 0\end{cases}.$$
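As a sanity check on this formula: when $X$ is standard normal, $h$ should reduce to the known $\chi^2_1$ density $e^{-y/2}/\sqrt{2\pi y}$. A short numerical comparison (my own check, plain `math` only):

```python
import math

def f(x):
    """Standard normal pdf."""
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def h(y):
    """Density of Y = X^2 from the case formula above, for y > 0."""
    s = math.sqrt(y)
    return (f(s) + f(-s)) / (2 * s)

def chi2_1_pdf(y):
    """Known chi-squared density with 1 degree of freedom."""
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

for y in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(y, h(y), chi2_1_pdf(y))  # the two columns agree
```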
Then $$\int y h(y) \,dy = \int_0^\infty y(f(\sqrt{y})+f(-\sqrt{y}))\frac{1}{2\sqrt{y}}\,dy.$$ Letting $u=\sqrt{y}$, we have $du=\frac{1}{2\sqrt{y}}\,dy$ so $$\int y h(y) \,dy = \int_0^\infty u^2 (f(u)+f(-u))\,du=\int_{-\infty}^\infty u^2 f(u)\,du=E[X^2]$$
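The same equality can be checked numerically for a density that is *not* symmetric, so nothing cancels by accident. Here is a crude midpoint Riemann sum check (my own illustration) with $X\sim N(0.5,\,1)$, where $E(X^2)=\mu^2+\sigma^2=1.25$:

```python
import math

MU, SIGMA = 0.5, 1.0  # an asymmetric example: X ~ N(0.5, 1)

def f(x):
    """Pdf of X ~ N(MU, SIGMA^2)."""
    return math.exp(-((x - MU) / SIGMA)**2 / 2) / (SIGMA * math.sqrt(2 * math.pi))

def h(y):
    """Density of Y = X^2 from the case formula derived above."""
    if y <= 0:
        return 0.0
    s = math.sqrt(y)
    return (f(s) + f(-s)) / (2 * s)

def riemann(g, a, b, n=500_000):
    """Midpoint Riemann sum for the integral of g over [a, b]."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

lhs = riemann(lambda y: y * h(y), 0.0, 60.0)       # int y h(y) dy
rhs = riemann(lambda x: x**2 * f(x), -10.0, 10.0)  # int x^2 f(x) dx
print(lhs, rhs)  # both approximately E(X^2) = 1.25
```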
And here is the abstract approach.
Let $(A,\Omega,\mu)$ be a probability space, with sample space $A$, $\sigma$-algebra $\Omega$, and probability measure $\mu$. A random variable on $A$ is a measurable function $A\to \Bbb{R}$; say $X$ is given by the measurable function $f:A\to \Bbb{R}$. The random variable $Y=X^2$ on $A$ is then simply the measurable function $a\mapsto f(a)^2$, the composite of $x\mapsto x^2$ with $f$. However, let's be a little more general: let $g:\Bbb{R}\to \Bbb{R}$ be any continuous function, and consider the random variable $Y=g(X)$, which is the function $g\circ f : A\to \Bbb{R}$.
Now we can take the pushforward of $\mu$ along $f$ to get a measure $f_*\mu$ on $\Bbb{R}$ defined by $f_*\mu(E) = \mu(f^{-1}(E))$. If $f_*\mu$ is absolutely continuous with respect to Lebesgue measure, then its Radon-Nikodym derivative is the pdf of $X$, but let's not think about distribution functions for now.
We now have a new measure space, $(\Bbb{R},\mathcal{B},f_*\mu)$. Since we obtained this space by pushing $\mu$ forward along $f$, the identity function on this new space determines the same random variable $X$: the probability that $X$ ends up in some set $B\subseteq\Bbb{R}$ was originally defined to be $\mu(f^{-1}(B))$, and this equals $f_*\mu(\mathrm{id}^{-1}(B))$.
Similarly, $Y=g(X)$ will be described on this new space by simply the function $g:\Bbb{R}\to\Bbb{R}$.
The expected value of $Y$ is then $$\int g(x)\,f_*\mu(dx).$$ However, we can also push $f_*\mu$ forward along $g$ to get $g_*f_*\mu$, which expresses the expected value of $Y$ as $$\int y\, g_*f_*\mu(dy).$$
Thus, rephrased in abstract language, the statement that you want is that these two values are equal, i.e. $$\int g(x)\,f_*\mu(dx)=\int y\,g_*f_*\mu(dy).$$
This is, however, just a special case of change of variables, which says that $$\int j \circ k \,d\nu = \int j \,d k_*\nu,$$ with $j=\textrm{id}$, $k=g$, and $\nu=f_*\mu$.
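To see this change-of-variables identity in the simplest possible setting, here is a toy computation (my own illustration) on a finite sample space, where a pushforward is just a regrouping of point masses:

```python
from collections import defaultdict

# A finite probability space: points of A with their masses mu({a})
mu = {'a1': 0.2, 'a2': 0.3, 'a3': 0.5}

f = {'a1': -1.0, 'a2': 2.0, 'a3': -1.0}  # the random variable X
g = lambda x: x**2                        # so Y = g(X) = X^2

def pushforward(measure, func):
    """Pushforward of a discrete measure along func: regroup the masses."""
    out = defaultdict(float)
    for point, mass in measure.items():
        out[func(point)] += mass
    return dict(out)

f_mu = pushforward(mu, lambda a: f[a])  # f_* mu, a measure on R
gf_mu = pushforward(f_mu, g)            # g_* f_* mu

lhs = sum(g(x) * mass for x, mass in f_mu.items())  # int g d(f_* mu)
rhs = sum(y * mass for y, mass in gf_mu.items())    # int y d(g_* f_* mu)
print(lhs, rhs)  # equal: 0.2*1 + 0.3*4 + 0.5*1 = 1.9
```

The two sums range over the same point masses, just labelled before and after applying $g$, which is exactly why the identity holds.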