What is the intuition behind linearity of expectations not requiring independence?


I am confused about the intuition behind linearity of expectation not requiring the random variables to be independent. Why is this true? I have read that, since the proof that expected values are linear uses nothing about independence, independence is not a requirement. I don't quite follow that step: why don't we need to show the property separately for independent and dependent random variables?

This also leaves me confused about questions involving this property. For example: suppose you toss a fair coin 12 times, producing a sequence of heads (H) and tails (T). Let $N$ be the number of times the pattern HTHT appears; for example, HTHT appears twice in HTHTHTTTTTTT. Find $E(N)$. The answer is $9/16$, which comes from the fact that HTHT occurs starting at index $n$ with probability $1/16$ for each $1 \le n \le 9$, giving $E(N) = 9 \cdot \frac{1}{16} = \frac{9}{16}$.

Why is it that we can simply add the probabilities that HTHT starts at each index? I ask because, if HTHT appears in the first four flips, then the probability that HTHT starts at the second index is zero, since the second flip came up T.
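As a sanity check on the claimed answer (this simulation is my own illustration, not part of the original question), the expected count can be estimated by brute force. Even though the events "HTHT starts at index $i$" are far from independent, the average count still comes out near $9/16$:

```python
import random

def count_htht(seq):
    """Count (possibly overlapping) occurrences of 'HTHT' in a flip sequence."""
    return sum(1 for i in range(len(seq) - 3) if seq[i:i + 4] == "HTHT")

def estimate_expected_count(trials=200_000, flips=12, seed=0):
    """Monte Carlo estimate of E(N), where N = number of HTHT occurrences."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seq = "".join(rng.choice("HT") for _ in range(flips))
        total += count_htht(seq)
    return total / trials

# The estimate should land close to 9/16 = 0.5625, despite the
# overlapping (dependent) starting positions.
```

The key point the simulation illustrates: linearity of expectation averages over *all* outcomes at once, so the dependence between starting positions never has to be untangled.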

An explanation of the intuition of this property would be appreciated.

There are 3 answers below.

Best answer:

Here's an intuitive argument. Imagine that you repeat your random experiment $N$ times, each time observing a new value of a random variable $X$ and a new value of a random variable $Y$. Let's denote the observed values of $X$ and $Y$ as $X_1, \ldots, X_N$ and $Y_1, \ldots, Y_N$. If $N$ is large, then $$ \tag{1} E(X) \approx \frac{1}{N} \sum_{i=1}^N X_i \quad \text{and} \quad E(Y) \approx \frac{1}{N} \sum_{i=1}^N Y_i. $$ But $$ \tag{2} E(X+Y) \approx \frac{1}{N} \sum_{i=1}^N (X_i + Y_i). $$ Comparing equations (1) and (2) shows that $E(X+Y) \approx E(X) + E(Y)$. And we can make the approximation as good as we like by taking $N$ to be sufficiently large. So we conclude that $E(X+Y) = E(X) + E(Y)$. Notice that in this argument we never assumed $X$ and $Y$ are independent.

Another answer:

It's because summation and integration are linear operations: $$ \sum_j (a x_j + b y_j) = a \sum_j x_j + b \sum_j y_j$$ $$ \int (a f(x) + b g(x))\; dx = a \int f(x)\; dx + b \int g(x)\; dx$$ and expected value is defined by an integral (or a sum for the discrete case).
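Spelled out for the discrete case, with joint pmf $p(x,y)$ and no independence assumption anywhere: $$ E(X+Y) = \sum_{x,y} (x+y)\, p(x,y) = \sum_{x,y} x\, p(x,y) + \sum_{x,y} y\, p(x,y) = \sum_x x\, p_X(x) + \sum_y y\, p_Y(y) = E(X) + E(Y), $$ since summing the joint pmf over one variable yields the marginal of the other. Independence would only matter if we needed $p(x,y)$ to factor, which this calculation never requires.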

Another answer:

If we bury a lot of the machinery, I'd suggest the intuition comes from (i) linearity with respect to constants and (ii) our ability to condition.

(i) tells us
$E\Big[y + X\Big]= E\Big[y\Big]+ E\Big[X\Big]=y+ E\Big[X\Big] $

(ii) tells us
$E\Big[Y + X\Big] $
$= E\Big[E\big[Y + X\big \vert Y=y\big] \Big] $
$= E\Big[E\big[y + X\big \vert Y=y\big] \Big] $
$= E\Big[E\big[y\big \vert Y=y\big] +E\big[ X\big \vert Y=y\big] \Big]$
$= E\Big[E\big[y\big \vert Y=y\big] \Big] +E\Big[E\big[ X\big \vert Y=y\big] \Big]$
$= E\Big[E\big[Y\big \vert Y=y\big] \Big] +E\Big[E\big[ X\big \vert Y=y\big] \Big]$
$ =E\Big[Y\Big] + E\Big[X\Big]$
where (i) is applied in the fourth- and third-to-last lines: once the r.v. $Y$ is conditioned to equal some scalar $y$, it behaves just like the constant in (i).
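The tower-property route above can be verified exactly on a small made-up joint distribution in which $X$ and $Y$ are dependent (the pmf values below are illustrative assumptions, chosen so that $P(X=1, Y=1) \ne P(X=1)P(Y=1)$):

```python
from fractions import Fraction

# A small joint pmf for dependent (X, Y) -- hypothetical numbers for illustration.
joint = {
    (0, 0): Fraction(1, 2),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

def marginal_expectation(idx):
    """E(X) for idx=0, E(Y) for idx=1, straight from the joint pmf."""
    return sum(xy[idx] * p for xy, p in joint.items())

def expectation_by_conditioning():
    """E(X+Y) computed as E[E(X+Y | Y=y)] -- the tower-property route."""
    total = Fraction(0)
    for y in {xy[1] for xy in joint}:
        p_y = sum(p for xy, p in joint.items() if xy[1] == y)
        # Conditional expectation E(X+Y | Y=y).
        cond = sum((xy[0] + xy[1]) * p for xy, p in joint.items() if xy[1] == y) / p_y
        total += p_y * cond
    return total

# Both routes give 3/4 = E(X) + E(Y) = 1/2 + 1/4, with no independence anywhere.
```

Using exact rational arithmetic makes the agreement an identity rather than an approximation, mirroring the step-by-step derivation above.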