Understanding marginalizing over many variables using joint probabilities.


If we have a joint probability $p(x, y)$, where $x$ and $y$ are each discrete variables taking $K$ values, we can compute the marginal probability $p(y)$ as:

$$p(x, y) = p(y|x)p(x) \tag 1$$ $$p(y) = \sum_{x^{*}}p(y|x^{*})p(x^*) \tag 2$$

Now say I have many random variables, $x_1, x_2, \dots, x_N$, and I want to compute a particular marginal $p(x_a)$. How does one go about doing this naively? Would it be an iterative application of (1) and (2)?

So, say I have three discrete random variables, $x_1, x_2, x_3$, and I want to get the marginal of $x_2$:

$$p(x_1, x_2, x_3) = p(x_2|x_1, x_3)p(x_1, x_3) \tag 3$$ $$p(x_2) = \sum_{x_1}\sum_{x_3}p(x_2|x_1, x_3)p(x_1, x_3) \tag 4$$

If the above is correct, I have to admit that its "meaning" is a bit fuzzy; by meaning I mean the steps it is telling me to carry out. My initial thought is to apply (1) and (2) to the second term in (4), picking which variable I want to marginalize first:

$$p(x_2) = \sum_{x_3}\sum_{x_1}p(x_2 | x_1, x_3)p(x_3|x_1)p(x_1)$$ $$=\sum_{x_3}p(x_2|x_1, x_3)p(x_3) \tag 5$$

The last term in (5) should sum to 1 now, leaving $\sum_{x_3}p(x_2|x_1, x_3)$, but now I am confused as I still have $x_1, x_3$ being conditioned on.

Perhaps the crux of the problem is that I am unsure what happens to $x_3$ in something like $p(x_2|x_1, x_3)$ as I sum over all $x_3$: does it just drop out and become $p(x_2|x_1)$? If so, why?

EDIT:

New insight ... maybe. Say I write (4) as:

$$p(x_2) = \sum_{x_1}\sum_{x_3}p(x_2|x_3, x_1)p(x_1, x_3) \tag 4$$

now I can set $z = x_2|x_3$ and rewrite this as:

$$p(x_2) = \sum_{x_1}\sum_{x_3}p(z, x_1)p(x_1, x_3) \tag 4$$

now I apply (1) and (2) to the second term:

$$p(x_2) = \sum_{x_3}\sum_{x_1}p(z, x_3)p(x_3|x_1)p(x_1)$$

as I sum over $x_1$ I am left with:

$$p(x_2) = \sum_{x_3}p(z, x_1)p(x_3)$$

and I can now substitute $x_2|x_3$ back in place of $z$:

$$p(x_2) = \sum_{x_3}p(x_2|x_3, x_1)p(x_3)$$

Oh ... never mind, this leaves me with the same issue. So then, is the answer that summing over $x_1$ removes it from $p(x_2|x_3, x_1)$? In which case the result follows pretty straightforwardly.

Best answer

That last equation should be: $$\begin{align}p(x_2) &= \sum_{x_3}\sum_{x_1}p(x_2 \mid x_1, x_3)p(x_3\mid x_1)p(x_1) \\[1ex]&= \sum_{x_3}\sum_{x_1}p(x_2,x_3\mid x_1)p(x_1)\\[1ex] &=\sum_{x_3}p(x_2,x_3)\\[1ex]&=\sum_{x_3}p(x_2\mid x_3)p(x_3) \tag 5\end{align}$$
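To spell out the two steps used above, here is a sketch in the same notation. The product rule, applied conditionally on $x_1$, merges the first two factors; the sum over $x_1$ is then the ordinary marginalisation (2), with the pair $(x_2, x_3)$ playing the role of $y$:

$$\begin{align}p(x_2 \mid x_1, x_3)\,p(x_3\mid x_1) &= p(x_2,x_3\mid x_1)\\[1ex] \sum_{x_1}p(x_2,x_3\mid x_1)\,p(x_1) &= p(x_2,x_3)\end{align}$$

Note that a conditioning variable does not drop out just because it is summed over: in general $\sum_{x_3}p(x_2\mid x_1,x_3) \neq p(x_2\mid x_1)$. It only disappears when each term is weighted by that variable's own distribution, as in $\sum_{x_3}p(x_2\mid x_1,x_3)\,p(x_3\mid x_1) = p(x_2\mid x_1)$.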

Basically, you are marginalising over $x_1$ so it won't appear in the equation.
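If it helps to see this numerically, here is a minimal sketch in plain Python. It builds a random joint table $p(x_1, x_2, x_3)$ and checks that the naive double sum in (4) and the reduced form (5) both give the same marginal $p(x_2)$. The value `K = 3`, the random seed, and the helper names (`marginal`, `via_eq4`, `via_eq5`, `unweighted`) are all assumptions made for illustration; nothing here comes from the question itself.

```python
import itertools
import random

K = 3  # hypothetical number of values each variable can take
random.seed(0)

# Random joint table p(x1, x2, x3), normalised so it sums to 1.
states = list(itertools.product(range(K), repeat=3))
weights = [random.random() for _ in states]
total = sum(weights)
joint = {s: w / total for s, w in zip(states, weights)}


def marginal(joint, keep):
    """Sum the joint over every variable whose position is not in `keep`."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


p_x2 = marginal(joint, keep=(1,))      # naive marginal: sum over x1 and x3
p_x3 = marginal(joint, keep=(2,))
p_x1x3 = marginal(joint, keep=(0, 2))
p_x2x3 = marginal(joint, keep=(1, 2))


def via_eq4(x2):
    """Equation (4): sum over x1, x3 of p(x2 | x1, x3) * p(x1, x3)."""
    return sum(
        (joint[(x1, x2, x3)] / p_x1x3[(x1, x3)]) * p_x1x3[(x1, x3)]
        for x1 in range(K)
        for x3 in range(K)
    )


def via_eq5(x2):
    """Equation (5): sum over x3 of p(x2 | x3) * p(x3)."""
    return sum(
        (p_x2x3[(x2, x3)] / p_x3[(x3,)]) * p_x3[(x3,)]
        for x3 in range(K)
    )


def unweighted(x2, x1):
    """Sum over x3 of p(x2 | x1, x3) with no weights: not a probability."""
    return sum(joint[(x1, x2, x3)] / p_x1x3[(x1, x3)] for x3 in range(K))


for x2 in range(K):
    print(x2, p_x2[(x2,)], via_eq4(x2), via_eq5(x2))

# Summing the unweighted version over x2 gives K rather than 1.
print(sum(unweighted(x2, x1=0) for x2 in range(K)))
```

The last line is the point of the `unweighted` helper: summing $p(x_2 \mid x_1, x_3)$ over $x_3$ without the weights does not give a distribution over $x_2$, which is why $x_3$ cannot simply "drop out" of the conditional.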