Distributing expectation over a quadratic function


I saw this proof in the MIT Probability courseware:

enter image description here

I understand the linearity of expectation and have gone through its proof as well. But how is the expectation distributed over a quadratic function in the second step of this proof?


To clarify, I want to understand what allows us to distribute

\begin{align}E(X^2 + a)&=E(X^2) + E(a)\end{align}

The argument is not linear in $X$, so shouldn't linearity of expectation fail to apply here?


Further clarification:

The course (Slide 2) also tells me not to assume that

$E[g(X)] = g(E[X])$ is true in general.

If I could do a change of variables, as some answers suggest, couldn't the above always be made true?


There are 3 best solutions below

2
On

Remember that for a constant $k$, $E(kX)=kE(X)$. Here $-2\mu$ and $\mu^2$ are constants.

\begin{align}E(X^2-2\mu X+\mu^2)&=E(X^2)-E(2\mu X)+E(\mu^2)\\&=E(X^2)-2\mu E( X)+\mu^2 E(1)\\&=E(X^2)-2\mu E( X)+\mu^2 \end{align}

Edit:

Let $Y=X^2$; then $Y$ is a random variable. Hence the problem becomes $E(Y-2\mu X+\mu^2)$, which is linear in the two random variables $Y$ and $X$.

Edit $2$:

If you let $Y=g(X)$, then $E(g(X))=E(Y)$, but $g$ does not come out of the expectation in general. In the case $g(X)=X^2$, note that the square stays inside the expectation. In general, $E(X^2) \ne E(X)^2$.
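As a quick numerical check of both points, here is a minimal sketch that assumes a fair six-sided die for $X$ (an illustrative choice, not part of the original question). The helper `expect` is a hypothetical name; it computes $E[g(X)]$ directly as $\sum_x g(x)\,p(x)$ with exact fractions.

```python
from fractions import Fraction

# Assumption for illustration: X is a fair six-sided die.
outcomes = range(1, 7)
p = Fraction(1, 6)  # probability of each outcome


def expect(g):
    """Return E[g(X)] as the sum of g(x) * p(x) over all outcomes."""
    return sum(g(x) * p for x in outcomes)


mu = expect(lambda x: x)        # E[X]
e_x2 = expect(lambda x: x * x)  # E[X^2]

# Expectation of the full quadratic, computed directly ...
lhs = expect(lambda x: x * x - 2 * mu * x + mu ** 2)
# ... and term by term, using linearity with Y = X^2 treated as its own variable.
rhs = e_x2 - 2 * mu * mu + mu ** 2

print(lhs == rhs)         # linearity across the three terms
print(e_x2 == mu ** 2)    # whether the square can be pulled outside E
```

With this die, the first comparison holds while the second does not, which matches the distinction made above: the expectation splits across the sum, but the square is never moved outside $E$.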

0
On

The expected value of the sum of random variables is the sum of their expected values. This is true for any two random variables for which expected values exist, not just for a single variable and linear functions of itself.

That is, in general, if $Y$ and $Z$ are random variables and if $E(Y)$ and $E(Z)$ both exist, $$ E (Y + Z) = E(Y) + E(Z). $$

But if $X$ is a random variable, then $Y = X^2$ is a random variable and so is $Z = -2\mu X + \mu^2.$ Moreover, $Y + Z = X^2 - 2\mu X + \mu^2.$ Therefore \begin{align} E(X^2 - 2\mu X + \mu^2) &= E(Y + Z)\\ &= E(Y) + E(Z)\\ &= E(X^2) + E(- 2\mu X + \mu^2). \end{align}

I think you can work out the rest of it.
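For reference, one way the remaining steps could go, assuming as in the question that $\mu = E(X)$ is a constant:

\begin{align} E(-2\mu X + \mu^2) &= -2\mu E(X) + \mu^2 = -2\mu^2 + \mu^2 = -\mu^2,\\ E(X^2 - 2\mu X + \mu^2) &= E(X^2) + E(-2\mu X + \mu^2) = E(X^2) - \mu^2. \end{align}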

0
On

I think your question has to do with this step

$$E((X-\mu)^{2}) = E(X^{2} -2\mu X + \mu^{2}) $$ $$ = E(X^{2}) - 2\mu E(X) + \mu^{2}$$

$$E(X^{2} -2\mu X + \mu^{2}) = E(X^{2}) -E(2\mu X) +E(\mu^{2})$$ $$ = E(X^{2}) -2\mu E(X) + E(\mu^{2}) $$

You should note that $\mu = E(X)$ is simply a constant, so we can pull it out of the expectation, and $E(\mu^{2}) = \mu^{2}$.

Then, substituting $E(X) = \mu$, we have $$E(X^{2}) -2 \mu \mu + \mu^{2} = E(X^{2}) - 2 \mu^{2}+ \mu^{2} = E(X^{2}) - \mu^{2}$$

It may also be useful to remember what the expectation is.

For discrete random variables it is

$$ E(X) = \sum_{i} x_{i} p(x_{i}) $$

and for continuous random variables we have

$$ E(X) = \int_{-\infty}^{\infty} x f(x) dx $$

So your question is really about linearity of sums and integrals.

Since $ \mu $ is a constant, consider why pulling it out works with summations and integrals.
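As a sketch of that argument in the discrete case, assume the sums converge and use the standard fact that $E(g(X)) = \sum_i g(x_i) p(x_i)$ (not stated in the answer above, but it is the usual way to compute the expectation of a function of $X$):

\begin{align} E(X^{2} - 2\mu X + \mu^{2}) &= \sum_{i} \left(x_{i}^{2} - 2\mu x_{i} + \mu^{2}\right) p(x_{i})\\ &= \sum_{i} x_{i}^{2} p(x_{i}) - 2\mu \sum_{i} x_{i} p(x_{i}) + \mu^{2} \sum_{i} p(x_{i})\\ &= E(X^{2}) - 2\mu E(X) + \mu^{2}. \end{align}

The last step uses $\sum_{i} p(x_{i}) = 1$. The sum splits term by term regardless of whether each term is linear in $x_{i}$; the continuous case is the same with integrals in place of sums.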