Problem seeing how $P(X_1=x_1|X_2=x_2) = E[P(X_1=x_1|X_3) | X_2 = x_2]$


I am reading a text that claims $$P(X_1=x_1|X_2=x_2) = \\ E[P(X_1=x_1|X_3) | X_2 = x_2] =\\ \sum_{x_3} P(X_1 = x_1|X_2 = x_2, X_3 = x_3)P(X_3=x_3|X_2=x_2) \quad (1.),$$ where $X_1, X_2, X_3$ are discrete random variables and $P(A, B)$ is notation for $P(A \land B)$.

I can see why $$P(X_1=x_1|X_2=x_2) = \sum_{x_3} P(X_1 = x_1|X_2 = x_2, X_3 = x_3)P(X_3=x_3|X_2=x_2)$$ since this follows from the fact that the events $X_3 = x_3$, for all $x_3$ in the range of $X_3$, partition the sample space, combined with the law of total probability and the definition of conditional probability. But I don't see why the first or second equality in $(1.)$ holds. It seems that I have the wrong interpretation of

$$E[P(X_1=x_1|X_3) | X_2 = x_2] \quad (2.) $$

I would like to understand where my interpretation of expression $(2.)$ goes wrong. This is how I read it:

The random variables are defined over some sample space $S$, so that, for example, $X_1 = x_1$ is just another way of writing $X_1(s) = x_1$ for some $s \in S$. Define the random variable $$X_4: S \to [0, 1], \quad X_4(s) = P(X_1=x_1|X_3 = X_3(s)) \quad (3.)$$ Then $(2.)$ can be written as $$ E[P(X_1=x_1|X_3) | X_2 = x_2] = E[X_4|X_2 = x_2] = \sum_{x_4} x_4 P(X_4=x_4 | X_2 = x_2) $$ Here I am stuck. One reason is that I don't understand how to go from summing over $x_4$ to summing over $x_3$, as is done in $(1.)$. Have I started correctly, and how can I proceed?


There are 4 best solutions below

6
On BEST ANSWER

I don't think we can equate $P(X_1=x_1|X_2=x_2)$ and $E\Big(P(X_1=x_1|X_3)|X_2=x_2\Big)$. To see an example of this, suppose that $(X_1,X_2,X_3)\sim p$ where $p$ is the pmf defined below: $$p(1,1,1)=0.2 \\ p(1,1,2)=0.1 \\ p(1,2,1)=0.01 \\ p(1,2,2)=0.13 \\ p(2,1,1)=0.06 \\ p(2,1,2)=0.11 \\ p(2,2,1)=0.09 \\ p(2,2,2)=0.3 $$ Assume $p(x,y,z)=0$ for all $(x,y,z)\notin \{1,2\}^3$.

It's not difficult to verify $$P(X_1=1|X_2=2)=\frac{14}{53}$$ On the other hand, we get with LOTUS that $$\begin{eqnarray*}E\Big(P(X_1=1|X_3)|X_2=2\Big) &=& \sum_{a,b\in \{1,2\}}P(X_1=1|X_3=b)P(X_1=a,X_3=b|X_2=2) \\ &=& \sum_{b\in \{1,2\}}P(X_1=1|X_3=b) \sum_{a\in \{1,2\}}P(X_1=a,X_3=b|X_2=2) \\ &=& \sum_{b\in \{1,2\}}P(X_1=1|X_3=b)P(X_3=b|X_2=2) \\ &=& \frac{7}{12}\cdot \frac{10}{53}+ \frac{23}{64} \cdot \frac{43}{53} \\ &\neq & \frac{14}{53} \end{eqnarray*}$$

If you wish to carry out this computation without the aid of LOTUS (as you started to do), we would first need to establish the conditional pmf of $P(X_1=1|X_3)$ given $X_2=2$. A brief calculator exercise reveals that the random variable $P(X_1=1|X_3)$ is supported on the set $\Big\{\frac{7}{12},\frac{23}{64}\Big\}$ and satisfies $$P\Big(P(X_1=1|X_3)=\frac{7}{12}\Big|X_2=2\Big)=P(X_3=1|X_2=2)=\frac{10}{53}$$ $$P\Big(P(X_1=1|X_3)=\frac{23}{64}\Big|X_2=2\Big)=P(X_3=2|X_2=2)=\frac{43}{53}$$

Finally, $$\begin{eqnarray*}E\Big(P(X_1=1|X_3)|X_2=2\Big)&=&\sum_{t\in\big\{\frac{7}{12},\frac{23}{64}\big\}}t P\Big(P(X_1=1|X_3)=t|X_2=2\Big) \\ &=& \frac{7}{12}\cdot \frac{10}{53}+ \frac{23}{64} \cdot \frac{43}{53} \\ &\neq& \frac{14}{53} \end{eqnarray*}$$

As I mentioned in the comments, we may certainly conclude that $$P(X_1=1|X_2=2)=E\Big(P(X_1=1|X_2,X_3)|X_2=2\Big)$$
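The arithmetic in this counterexample can be checked with exact rational arithmetic. What follows is only a minimal Python sketch: the helper names `prob` and `cond` are made up for illustration, and each outcome is stored as a tuple `(x1, x2, x3)`. It computes $P(X_1=1|X_2=2)$, the disputed middle expression, and the version that conditions on both $X_2$ and $X_3$.

```python
from fractions import Fraction as F

# Joint pmf p(x1, x2, x3) of the counterexample, as exact fractions.
p = {
    (1, 1, 1): F(20, 100), (1, 1, 2): F(10, 100),
    (1, 2, 1): F(1, 100),  (1, 2, 2): F(13, 100),
    (2, 1, 1): F(6, 100),  (2, 1, 2): F(11, 100),
    (2, 2, 1): F(9, 100),  (2, 2, 2): F(30, 100),
}

def prob(event):
    """Probability of an event given as a predicate on (x1, x2, x3)."""
    return sum(q for outcome, q in p.items() if event(outcome))

def cond(event_a, event_b):
    """Conditional probability P(A | B); assumes P(B) > 0."""
    return prob(lambda o: event_a(o) and event_b(o)) / prob(event_b)

# P(X1 = 1 | X2 = 2)
lhs = cond(lambda o: o[0] == 1, lambda o: o[1] == 2)

# E[ P(X1 = 1 | X3) | X2 = 2 ] = sum_b P(X1=1 | X3=b) * P(X3=b | X2=2)
middle = sum(
    cond(lambda o: o[0] == 1, lambda o: o[2] == b)
    * cond(lambda o: o[2] == b, lambda o: o[1] == 2)
    for b in (1, 2)
)

# E[ P(X1 = 1 | X2, X3) | X2 = 2 ]
#   = sum_b P(X1=1 | X2=2, X3=b) * P(X3=b | X2=2)
corrected = sum(
    cond(lambda o: o[0] == 1, lambda o: o[1] == 2 and o[2] == b)
    * cond(lambda o: o[2] == b, lambda o: o[1] == 2)
    for b in (1, 2)
)

print("P(X1=1 | X2=2)             =", lhs)
print("E[P(X1=1 | X3) | X2=2]     =", middle)
print("E[P(X1=1 | X2,X3) | X2=2]  =", corrected)
```

Using `Fraction` rather than floats keeps the comparison exact, so the inequality between the first two printed values is not a rounding artifact.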

0
On

Suppose the random variable $X_3$ comes from a joint distribution $p(x_1, x_2, x_3)$.

If you regard everything that happens (including $P(X_1=x_1|X_3)$) as always constrained by the condition $X_2 = x_2$, then the equalities hold.

In $(3.)$ you are assuming the function is not constrained by the conditioning.

For example, if the original sample space is $$\{(x_1,x_2,x_3)~|~x_k=0 ~\textrm{or}~ 1 \textrm{ for } ~k=1,2,3\}, $$then after conditioning on $x_2=1$, what you have is $$\{(x_1,1,0),(x_1,1,1)\}, ~x_1=0\textrm{ or }1.$$ The expectation in the second line of $(1.)$ works on this smaller sample space with the preassigned value $x_2=1.$

4
On

I agree with @MatthewPilling's comment. The text is wrong to assert that the middle entity $$E[P(X_1=x_1|X_3) | X_2 = x_2] $$ is equal to the outer two entities (which, as you've shown, are indeed equal to each other).

In fact, for events $A$, $B$ and a discrete random variable $Y$ taking values $y_1,\ldots,y_n$, $$E( P (A\mid Y) \mid B) = \sum_{i=1}^n P(A\mid Y=y_i) P(Y=y_i\mid B),\tag1$$ which in general doesn't equal $P(A\mid B)$. To prove (1), recall that $P(A\mid Y)$ is the random variable $h(Y)$, where $h(y):=P(A\mid Y=y)$. So $$ E(P(A\mid Y)\mid B)= E(h(Y)\mid B)=\sum_i h(y_i) P(Y=y_i\mid B) $$ and the result follows.
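The step from a sum over the values of $h(Y)$ to a sum over the values of $Y$, which is where the question got stuck, can be spelled out by grouping the $y_i$ according to the value that $h$ assigns to them. This is only a sketch; the symbol $t$, denoting a value taken by $h(Y)$, is introduced here for illustration: $$\begin{eqnarray*} E(h(Y)\mid B) &=& \sum_{t} t\, P(h(Y)=t\mid B) \\ &=& \sum_{t} t \sum_{i\,:\,h(y_i)=t} P(Y=y_i\mid B) \\ &=& \sum_{i=1}^n h(y_i)\, P(Y=y_i\mid B) \end{eqnarray*}$$ The second line uses that the event $h(Y)=t$ is the disjoint union of the events $Y=y_i$ with $h(y_i)=t$, and the third line uses that every $y_i$ appears in exactly one of these groups.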

0
On

Another way of viewing it is through the law of total expectation (the tower property): if $\mathcal{F}_1\subset \mathcal{F}_2$, then $E\big[E[X|\mathcal{F}_2]\,\big|\,\mathcal{F}_1\big]=E[X|\mathcal{F}_1]$. Since $\sigma(X_2)\subset \sigma(X_2,X_3)$, we know

$$E\big[E[X|\sigma(X_2,X_3)]\,\big|\,\sigma(X_2)\big]=E[X|\sigma(X_2)]$$

If we let $X=\mathbb{I}_{x_1}(X_1)$, then we get the desired equality, as noted in the other posts, when conditioning on both $X_2, X_3$ instead of just $X_3$.
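Written out for discrete variables, and evaluated on the event $X_2=x_2$, the identity reads as follows. This is a sketch that assumes the sum runs only over those $x_3$ with $P(X_3=x_3|X_2=x_2)>0$: $$\begin{eqnarray*} E\Big[E\big[\mathbb{I}_{x_1}(X_1)\,\big|\,X_2,X_3\big]\,\Big|\,X_2=x_2\Big] &=& \sum_{x_3} P(X_1=x_1|X_2=x_2,X_3=x_3)\,P(X_3=x_3|X_2=x_2) \\ &=& \sum_{x_3} P(X_1=x_1,X_3=x_3|X_2=x_2) \\ &=& P(X_1=x_1|X_2=x_2) \end{eqnarray*}$$ This recovers the equality between the two outer expressions in $(1.)$; the inner conditional probability has to be taken given both $X_2$ and $X_3$ for the chain to go through.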

In my first post I miscalculated the conditional expectations.