Show that: \[\mathbb{E}[X \mid Y=y]=\sum_{x \in X(\Omega)} x \mathbb{P}(X=x \mid Y=y)\]


Let $X, Y \in \mathcal{L}^{2}$ be two discrete random variables. Show that $$\mathbb{E}[X \mid Y=y]=\sum_{x \in X(\Omega)} x \mathbb{P}(X=x \mid Y=y)$$ for all $y \in Y(\Omega)$ such that $\mathbb{P}(Y=y) > 0$.

Attempt 2 (I think this is the solution):

Let $X, Y \in \mathcal{L}^{2}$ be two discrete random variables. We want to show that $ \mathbb{E}[X \mid Y=y]=\sum_{x \in X(\Omega)} x \cdot \mathbb{P}(X=x \mid Y=y) $ for all $y \in Y(\Omega)$ such that $\mathbb{P}(Y=y)>0$.

The conditional expectation of $X$ given $Y=y$ is the expectation of $X$ under the condition that the event $\{Y=y\}$ occurs. That is, $ \mathbb{E}[X \mid Y=y] = \frac{\mathbb{E}[X \cdot \mathbbm{1}_{\{Y=y\}}]}{\mathbb{P}(Y=y)} $ where $\mathbbm{1}_{\{Y=y\}}$ is the indicator function that takes value 1 when $Y=y$ and 0 otherwise.

Since $X \cdot \mathbbm{1}_{\{Y=y\}}$ takes the value $x$ exactly on the event $\{X=x\} \cap \{Y=y\}$ and the value $0$ elsewhere, the numerator can be written as a sum over the possible values of $X$: $ \mathbb{E}[X \cdot \mathbbm{1}_{\{Y=y\}}] = \sum_{x \in X(\Omega)} x \cdot \mathbb{P}(\{X=x\} \cap \{Y=y\}) $, where the sum converges absolutely because $X \in \mathcal{L}^{2} \subset \mathcal{L}^{1}$. By the definition of conditional probability, the joint probability can be expressed as $\mathbb{P}(\{X=x\} \cap \{Y=y\}) = \mathbb{P}(X=x \mid Y=y) \cdot \mathbb{P}(Y=y)$.

Substituting this into the numerator and cancelling the factor $\mathbb{P}(Y=y) > 0$, we obtain $\mathbb{E}[X \mid Y=y] = \frac{\sum_{x \in X(\Omega)} x \cdot \mathbb{P}(X=x \mid Y=y) \cdot \mathbb{P}(Y=y)}{\mathbb{P}(Y=y)} = \sum_{x \in X(\Omega)} x \cdot \mathbb{P}(X=x \mid Y=y).$

This proves the desired result.
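As a numerical sanity check (not part of the proof), the identity $\mathbb{E}[X \cdot \mathbbm{1}_{\{Y=y\}}]/\mathbb{P}(Y=y) = \sum_x x\,\mathbb{P}(X=x \mid Y=y)$ can be verified on a small joint pmf; the distribution below is made up purely for illustration:

```python
from fractions import Fraction as F

# Hypothetical joint pmf p[(x, y)] = P(X = x, Y = y); values are illustrative.
p = {(0, 0): F(1, 8), (0, 1): F(1, 8),
     (1, 0): F(1, 4), (1, 1): F(1, 8),
     (2, 0): F(1, 8), (2, 1): F(1, 4)}

y = 1
p_y = sum(q for (xv, yv), q in p.items() if yv == y)  # P(Y = y)

# Left side: E[X * 1_{Y=y}] / P(Y = y)
lhs = sum(xv * q for (xv, yv), q in p.items() if yv == y) / p_y

# Right side: sum over x of x * P(X = x | Y = y)
rhs = sum(xv * (q / p_y) for (xv, yv), q in p.items() if yv == y)

print(lhs, rhs)  # → 5/4 5/4
```

Exact rationals (`fractions.Fraction`) are used so the two sides match without floating-point noise.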

Attempt 1 (old version):

We can use the definition of conditional expectation to show that $\mathbb{E}[X \mid Y=y]=\sum_{x \in X(\Omega)} x \mathbb{P}(X=x \mid Y=y)$. Recall that the conditional expectation of $X$ given $Y$ is defined as $\mathbb{E}[X \mid Y]=\mathbb{E}[X \mid \sigma_Y]$, where $\sigma_Y=Y^{-1}(\mathcal{B}_\mathbb{R})$ is the sigma algebra generated by $Y$. By the tower property of conditional expectation, we have $\mathbb{E}[\mathbb{E}[X \mid Y]]=\mathbb{E}[X]$.

By Lemma 1.4.11 (which accompanies Definition 1.4.10), there exists a unique Borel-measurable function $g:\mathbb{R} \to \mathbb{R}$ such that $\mathbb{E}[X \mid Y]=g(Y)$ almost surely, and we write $g(y)=\mathbb{E}[X \mid Y=y]$. Therefore, $\mathbb{E}[X]=\mathbb{E}[\mathbb{E}[X \mid Y]]=\mathbb{E}[g(Y)]$.

By definition of conditional expectation, we have $\mathbb{E}[X \mid Y=y]=\sum_{x \in X(\Omega)} x \mathbb{P}(X=x \mid Y=y)$ almost surely for each $y \in Y(\Omega)$. Thus, we have

\begin{align*}
\mathbb{E}[g(Y)] &= \sum_{y \in Y(\Omega)} \mathbb{E}[X \mid Y=y] \, \mathbb{P}(Y=y) \\
&= \sum_{y \in Y(\Omega)} \sum_{x \in X(\Omega)} x \, \mathbb{P}(X=x \mid Y=y) \, \mathbb{P}(Y=y) \\
&= \sum_{x \in X(\Omega)} x \sum_{y \in Y(\Omega)} \mathbb{P}(X=x, Y=y) \\
&= \sum_{x \in X(\Omega)} x \, \mathbb{P}(X=x) \\
&= \mathbb{E}[X].
\end{align*}

Therefore, $\mathbb{E}[X \mid Y=y]=\sum_{x \in X(\Omega)} x \mathbb{P}(X=x \mid Y=y)$ almost surely for each $y \in Y(\Omega)$.
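The chain of equalities in the display above, $\sum_y \big(\sum_x x\,\mathbb{P}(X=x \mid Y=y)\big)\mathbb{P}(Y=y) = \mathbb{E}[X]$, can likewise be checked numerically on an illustrative joint pmf (same caveat: the distribution is made up, not from the problem):

```python
from fractions import Fraction as F

# Hypothetical joint pmf, illustrative only.
p = {(0, 0): F(1, 8), (0, 1): F(1, 8),
     (1, 0): F(1, 4), (1, 1): F(1, 8),
     (2, 0): F(1, 8), (2, 1): F(1, 4)}

xs = sorted({xv for xv, _ in p})
ys = sorted({yv for _, yv in p})

def p_joint(xv, yv):
    return p.get((xv, yv), F(0))

# E[g(Y)] = sum_y E[X | Y=y] P(Y=y), with E[X | Y=y]
# expanded as sum_x x P(X=x | Y=y), exactly as in the display.
total = F(0)
for yv in ys:
    p_y = sum(p_joint(xv, yv) for xv in xs)       # P(Y = y)
    cond_exp = sum(xv * p_joint(xv, yv) / p_y for xv in xs)
    total += cond_exp * p_y

# E[X] computed directly from the marginal of X.
e_x = sum(xv * q for (xv, _), q in p.items())
print(total, e_x)  # → 9/8 9/8
```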

Mentioned Definition and Lemma:

  1. Conditional expectation with respect to a random variable $ Y $:

Definition 1.4.10: Let $ X $ and $ Y $ be two random variables defined on the probability space $ (\Omega, \mathcal{F}, P) $. The conditional expectation of $ X $ given $ Y $ is defined as $ \mathbb{E}(X \mid Y)=\mathbb{E}\left(X \mid \sigma_{Y}\right) $, where $ \sigma_{Y} $ is the sigma-algebra generated by $ Y $: $ \sigma_{Y}=Y^{-1}\left(\mathcal{B}_{\mathbb{R}}\right) $.

Lemma 1.4.11: There exists a unique Borel-measurable function $ g $ : $ \mathbb{R} \rightarrow \mathbb{R} $, such that $ \mathbb{E}(X \mid Y)=g(Y) $ almost surely.

Therefore, the notation $ \mathbb{E}(X \mid Y=y) $ is understood as $ g(y) $: $ \mathbb{E}(X \mid Y= y)=g(y) $ or $ \mathbb{E}(X \mid Y=y) $ is the value of $ \mathbb{E}(X \mid Y) $ on the set $ \{\omega \in \Omega: Y(\omega)=y\} $.