Plugging a random variable into a probability function?


I have attempted to "prove" some basic results in probability, and I would like to know whether my "proofs" are sound without getting more involved than undergraduate multivariate calculus. I attempted to demonstrate that $\mathbb{E}(X + Y | Z) = \mathbb{E}(X|Z) + \mathbb{E}(Y|Z)$ as follows:

\begin{align*} \mathbb{E}(X + Y | Z) &= \sum_{i}\sum_{j} (x_i + y_j) f_{X,Y|Z}(x_i,y_j|Z) \\ &= \sum_{i}\sum_{j} x_i f_{X,Y|Z}(x_i,y_j|Z) + \sum_{i}\sum_{j} y_j f_{X,Y|Z}(x_i,y_j|Z) \\ &= \sum_{i} x_i f_{X|Z}(x_i|Z) + \sum_{j} y_j f_{Y|Z}(y_j|Z) \\ &= \mathbb{E}(X|Z) + \mathbb{E}(Y|Z) \end{align*}

where $X, Y$ and $Z$ are discrete random variables, and the third equality sums out the other variable to get the marginal conditional pmfs. I am particularly unsure about plugging a random variable into a probability mass function; should I be summing over all possible values of $Z$ instead?
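As a sanity check, the identity can be verified numerically on a small example. The joint pmf below and the helper `cond_exp` are made up purely for illustration:

```python
# Made-up joint pmf of (X, Y, Z) on a small support; illustration only.
pmf = {
    (0, 0, 0): 0.10, (0, 1, 0): 0.15, (1, 0, 0): 0.20, (1, 1, 0): 0.05,
    (0, 0, 1): 0.05, (0, 1, 1): 0.10, (1, 0, 1): 0.15, (1, 1, 1): 0.20,
}

def cond_exp(g, z):
    """E[g(X, Y) | Z = z], computed from the joint pmf."""
    p_z = sum(p for (x, y, zz), p in pmf.items() if zz == z)
    return sum(g(x, y) * p for (x, y, zz), p in pmf.items() if zz == z) / p_z

for z in (0, 1):
    lhs = cond_exp(lambda x, y: x + y, z)
    rhs = cond_exp(lambda x, y: x, z) + cond_exp(lambda x, y: y, z)
    assert abs(lhs - rhs) < 1e-12  # linearity holds for each value of Z
```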


Best answer:

If $A\in\sigma(Z)$, then $1_A$ is $\sigma(Z)$-measurable, so by linearity of expectation and iterated expectation

\begin{align}\mathbb E[X1_A] + \mathbb E[Y1_A] &= \mathbb E[(X+Y)1_A]\\ &= \mathbb E[\mathbb E[(X+Y)1_A\mid Z]]\\ &= \mathbb E[\mathbb E[X+Y\mid Z]1_A], \end{align} and \begin{align}\mathbb E[X1_A] + \mathbb E[Y1_A] &= \mathbb E[\mathbb E[X1_A\mid Z]] + \mathbb E[\mathbb E[Y1_A\mid Z]]\\ &= \mathbb E[\mathbb E[X\mid Z]1_A] + \mathbb E[\mathbb E[Y\mid Z]1_A].\end{align}

Hence

$$\mathbb E[\mathbb E[X+Y\mid Z]1_A] = \mathbb E[\mathbb E[X\mid Z]1_A] + \mathbb E[\mathbb E[Y\mid Z]1_A], $$

and since this holds for every $A\in\sigma(Z)$ and both sides of the desired identity are $\sigma(Z)$-measurable, the almost-sure uniqueness in the definition of conditional expectation gives

$$\mathbb E[X+Y\mid Z] = \mathbb E[X\mid Z] + \mathbb E[Y\mid Z]. $$
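In the discrete case, the key equality $\mathbb E[(X+Y)1_A] = \mathbb E[\mathbb E[X\mid Z]1_A] + \mathbb E[\mathbb E[Y\mid Z]1_A]$ for $A=\{Z=z\}$ can be checked numerically; the joint pmf and helpers below are invented for illustration:

```python
# Invented joint pmf of (X, Y, Z); illustration only.
pmf = {
    (0, 0, 0): 0.10, (0, 1, 0): 0.15, (1, 0, 0): 0.20, (1, 1, 0): 0.05,
    (0, 0, 1): 0.05, (0, 1, 1): 0.10, (1, 0, 1): 0.15, (1, 1, 1): 0.20,
}

def p_Z(z):
    return sum(p for (x, y, zz), p in pmf.items() if zz == z)

def cond_exp(g, z):
    """E[g(X, Y) | Z = z]."""
    return sum(g(x, y) * p for (x, y, zz), p in pmf.items() if zz == z) / p_Z(z)

for z in (0, 1):  # the events A = {Z = z} generate sigma(Z) here
    # E[(X + Y) 1_A], computed directly from the joint pmf
    lhs = sum((x + y) * p for (x, y, zz), p in pmf.items() if zz == z)
    # E[E[X|Z] 1_A] + E[E[Y|Z] 1_A]; each conditional expectation is
    # constant on A, so each expectation is that value times P(A)
    rhs = (cond_exp(lambda x, y: x, z) + cond_exp(lambda x, y: y, z)) * p_Z(z)
    assert abs(lhs - rhs) < 1e-12
```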

As @user190080 pointed out, here I used two properties of conditional expectation. The first is that if $S$ and $T$ are random variables and $S$ has finite mean, then $$\mathbb E[S] = \mathbb E[\mathbb E[S\mid T]].$$

This is what I meant by "iterated expectation." In the case where $S$ and $T$ are discrete random variables, this follows from

\begin{align} \mathbb E[\mathbb E[S\mid T]] &= \mathbb E\left[\sum_s s\mathbb P(S=s\mid T) \right]\\ &= \sum_t\left(\sum_s s\mathbb P(S=s\mid T=t)\right)\mathbb P(T=t)\\ &= \sum_t\sum_s s \mathbb P(S=s\mid T=t)\mathbb P(T=t)\\ &= \sum_t\sum_s s \mathbb P(S=s, T=t)\\ &= \sum_s s\sum_t \mathbb P(S=s, T=t)\\ &= \sum_s s\mathbb P(S=s)\\ &= \mathbb E[S], \end{align} where the interchange in order of summation is justified by Fubini's theorem (as by definition of conditional expectation, $\mathbb E[|\mathbb E[S\mid T]|]<\infty$).
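This computation can also be checked numerically for a concrete discrete pair $(S, T)$; the joint pmf below is invented for illustration:

```python
# Invented joint pmf of (S, T); illustration only.
joint = {
    (1, 0): 0.2, (2, 0): 0.1, (3, 0): 0.1,
    (1, 1): 0.1, (2, 1): 0.3, (3, 1): 0.2,
}

def p_T(t):
    return sum(p for (s, tt), p in joint.items() if tt == t)

def e_S_given_T(t):
    """E[S | T = t] = sum_s s * P(S = s | T = t)."""
    return sum(s * p for (s, tt), p in joint.items() if tt == t) / p_T(t)

# Iterated expectation: E[E[S|T]] = sum_t E[S|T=t] P(T=t)
iterated = sum(e_S_given_T(t) * p_T(t) for t in {tt for _, tt in joint})
# E[S] computed directly from the joint pmf
direct = sum(s * p for (s, _), p in joint.items())
assert abs(iterated - direct) < 1e-12
```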

The second is that if $A\in\sigma(T)$, for example $A=T^{-1}(\{t\})$ for some $t$ with $\mathbb P(T=t)>0$ where $$T^{-1}(\{t\}) = \{\omega:T(\omega)=t\}, $$ then $$\mathbb E[S1_A\mid T] = \mathbb E[S\mid T]1_A, $$ where $$1_A(\omega)= \begin{cases}1,&\omega\in A\\0,&\omega\notin A.\end{cases}$$ This follows from $$\mathbb E[S1_A] = \mathbb E[\mathbb E[S\mid T]1_A] $$ (the defining property of conditional expectation).
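This last identity can likewise be verified numerically for discrete $S$ and $T$, taking $A=\{T=t_0\}$; the joint pmf below is invented for illustration:

```python
# Invented joint pmf of (S, T); illustration only.
joint = {(1, 0): 0.25, (2, 0): 0.25, (1, 1): 0.30, (2, 1): 0.20}

def p_T(t):
    return sum(p for (s, tt), p in joint.items() if tt == t)

def e_S_given_T(t):
    """E[S | T = t]."""
    return sum(s * p for (s, tt), p in joint.items() if tt == t) / p_T(t)

t0 = 1  # A = {T = t0}
# E[S 1_A]: restrict the sum to outcomes with T = t0
lhs = sum(s * p for (s, tt), p in joint.items() if tt == t0)
# E[E[S|T] 1_A]: E[S|T] is constant on A, equal to E[S|T=t0]
rhs = e_S_given_T(t0) * p_T(t0)
assert abs(lhs - rhs) < 1e-12
```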