Why is it bad to multiply two expectations of the same variable?

In Sutton & Barto's book Reinforcement Learning (Section 11.5), they say that it is bad to multiply two expectations of the same variable, as otherwise the sample of the product will be biased.

Why is that the case?

Excerpt from the book: (part about bias missing)

[Image: excerpt from the book]

Best answer:

Your question is more general than the excerpt from the book. If I understand it correctly, you are asking about the claim that "it is bad to multiply two expectations of the same variable, as otherwise the sample of the product [of the variable with itself] will be biased."

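This amounts to saying $E[x^2] \ne E[x] \cdot E[x]$, which holds whenever $x$ has nonzero variance, because $E[x^2] - E[x] \cdot E[x] = \mathrm{Var}(x)$. For example, if $x$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then $E[x^2] = \mu^2 + \sigma^2 \ne E[x] \cdot E[x] = \mu^2$. So multiplying a single sample by itself estimates $E[x] \cdot E[x]$ with a constant bias of $\sigma^2$. Two independent samples $x_1$ and $x_2$ avoid this, since $E[x_1 x_2] = E[x_1] \cdot E[x_2] = \mu^2$.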
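If it helps, the bias can be checked numerically. The following is only a minimal sketch, not anything taken from the book: the function name `estimate_products` and the values $\mu = 1$, $\sigma = 2$ are my own assumptions for illustration. It averages $x \cdot x$ (one sample used for both factors) and $x_1 \cdot x_2$ (two independent samples) over many draws.

```python
import random


def estimate_products(mu=1.0, sigma=2.0, n=200_000, seed=0):
    """Monte Carlo estimates of the product of expectations E[x] * E[x].

    Hypothetical helper for illustration. Returns a pair:
      same_sample  average of x1 * x1 (one draw reused for both factors)
      independent  average of x1 * x2 (two independent draws)
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    same_total = 0.0
    indep_total = 0.0
    for _ in range(n):
        x1 = rng.gauss(mu, sigma)
        x2 = rng.gauss(mu, sigma)  # drawn independently of x1
        same_total += x1 * x1      # estimates E[x^2] = mu^2 + sigma^2
        indep_total += x1 * x2     # estimates E[x1] * E[x2] = mu^2
    return same_total / n, indep_total / n


same, indep = estimate_products()
print("target mu^2:", 1.0 ** 2)
print("same sample reused:", same)
print("independent samples:", indep)
```

With these assumed parameters the first estimate should land near $\mu^2 + \sigma^2 = 5$ and the second near $\mu^2 = 1$, so the gap between them is roughly $\sigma^2 = 4$.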

Is that what you were asking?