I am interested in understanding the meaning of using powers in expected values. Both mathematically and intuitively, what does "the power of a random variable" even mean?
Thanks!
Here's a simple example in the discrete case.
Suppose we have a random variable $X$ for which $$\Pr[X = -1] = 1/2, \\ \Pr[X = 1] = 1/3, \\ \Pr[X = 2] = 1/6.$$ Think of the set $\{-1, 1, 2\}$ as the set of possible outcomes of $X$, and the probability of observing one of these outcomes is described as above.
Now, if I want to ascertain the expected value of $X$, which in the frequentist viewpoint is in a sense the "long-run average" of many observations of $X$, then $$\operatorname{E}[X] = (-1)\Pr[X = -1] + (1)\Pr[X = 1] + (2)\Pr[X = 2] = -\frac{1}{2} + \frac{1}{3} + \frac{2}{6} = \frac{1}{6}.$$ Note that the expected value need not be one of the outcomes of $X$; in fact, $\Pr[X = 1/6] = 0$ as implied by the probability distribution I've provided. What $1/6$ means here is that if we were to observe many realizations of $X$, and average them together, the result would tend toward $1/6$ as the number of observations increases.
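As a sanity check on this "long-run average" interpretation, here is a short Python sketch (variable names are my own) that computes $\operatorname{E}[X]$ from the definition and compares it against the average of many simulated observations of $X$:

```python
import random

# pmf of X from the example above: outcomes and their probabilities
outcomes = [-1, 1, 2]
probs = [1/2, 1/3, 1/6]

# Exact expected value from the definition: sum of outcome * probability
exact = sum(x * p for x, p in zip(outcomes, probs))

# Monte Carlo: average many simulated observations of X
random.seed(0)
n = 100_000
samples = random.choices(outcomes, weights=probs, k=n)
empirical = sum(samples) / n

print(exact)      # close to 1/6
print(empirical)  # also close to 1/6, by the law of large numbers
```

Increasing `n` makes `empirical` concentrate ever more tightly around $1/6$, which is exactly the frequentist reading of the expected value.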
So, what does something like $\operatorname{E}[X^2]$ mean? The expression $X^2$ is another random variable calculated from $X$: whatever outcome $X$ takes, $X^2$ is its square. Since $X \in \{-1, 1, 2\}$, then $X^2 \in \{1, 4\}$. We get this by squaring each outcome and collecting the distinct values. The probability mass function of $X^2$ is $$\Pr[X^2 = 1] = \Pr[X = -1] + \Pr[X = 1] = \frac{1}{2} + \frac{1}{3} = \frac{5}{6}, \\ \Pr[X^2 = 4] = \Pr[X = 2] = \frac{1}{6}.$$ Then $$\operatorname{E}[X^2] = (1)\Pr[X^2 = 1] + (4)\Pr[X^2 = 4] = \frac{5}{6} + \frac{4}{6} = \frac{3}{2}.$$
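The same bookkeeping can be done mechanically: outcomes of $X$ that share a square pool their probability mass. A small Python sketch (my own names, using exact fractions to avoid rounding):

```python
from fractions import Fraction
from collections import defaultdict

# pmf of X from the example
pmf_X = {-1: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

# Build the pmf of X^2: outcomes of X that share a square pool their probability
pmf_X2 = defaultdict(Fraction)  # Fraction() defaults to 0
for x, p in pmf_X.items():
    pmf_X2[x**2] += p

# E[X^2] = sum over outcomes y of X^2 of y * Pr[X^2 = y]
E_X2 = sum(y * p for y, p in pmf_X2.items())

print(dict(pmf_X2))  # {1: Fraction(5, 6), 4: Fraction(1, 6)}
print(E_X2)          # 3/2
```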
Of course, you can similarly calculate things like $\operatorname{E}[X^3]$, $\operatorname{E}[X^4]$, and so forth.
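All of these are instances of one formula, $\operatorname{E}[X^n] = \sum_x x^n \Pr[X = x]$, so a single helper (hypothetical name `moment`) covers every power at once:

```python
from fractions import Fraction

pmf_X = {-1: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

def moment(pmf, n):
    # E[X^n] = sum over outcomes x of x^n * Pr[X = x]
    return sum(x**n * p for x, p in pmf.items())

print([moment(pmf_X, n) for n in (1, 2, 3, 4)])
# [Fraction(1, 6), Fraction(3, 2), Fraction(7, 6), Fraction(7, 2)]
```

These quantities $\operatorname{E}[X^n]$ are known as the moments of $X$.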
The idea here is that $X^n$ is another random variable whose set of outcomes (its support) and probability distribution are determined by those of $X$ in a way that naturally extends the familiar rules of algebra. You can do this for continuous random variables too, or for random variables with more complicated definitions. Note that $\operatorname{E}[X]$ may be well-defined while $\operatorname{E}[X^n]$ is not for certain $n$; a trivial example is when $\Pr[X = 0] > 0$ and $n < 0$, in which case you'd have division by $0$. There are also examples where $\operatorname{E}[X^2]$ fails to exist even though $\operatorname{E}[X]$ does; for instance, a Pareto distribution with shape parameter $\alpha = 3/2$ has a finite mean but an infinite second moment.