Factorization of conditional probability in a Markov network


In Barber's book (Definition 4.6, p. 62) we have the following Markov network: http://www.fluxus-virus.com/fichiers/screenshot_322_1.png

(In the following we use the notation $p(1,7\mid 4)$ for $P(X_1,X_7\mid X_4)$ and $\phi(1,2,3)$ for $\phi(X_1,X_2,X_3)$.)

By definition we have $$p(1,2,3,4,5,6,7)=\frac1Z\phi(1,2,3)\phi(2,3,4)\phi(4,5,6)\phi(5,6,7)$$since $(1,2,3)$, etc. are the maximal cliques ($Z$ is a normalization term).
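As a concrete sanity check of this definition, here is a small Python sketch that builds such a joint from clique potentials over binary variables and verifies it normalizes to 1. The potential tables are made-up random values, not taken from the book; only the factorization structure matches the network above.

```python
import itertools
import random

random.seed(0)
B = [0, 1]

# Hypothetical positive clique potentials over binary variables, standing in
# for phi(1,2,3), phi(2,3,4), phi(4,5,6), phi(5,6,7) of the network above.
def rand_pot():
    return {s: random.uniform(0.5, 2.0) for s in itertools.product(B, repeat=3)}

phi123, phi234, phi456, phi567 = rand_pot(), rand_pot(), rand_pot(), rand_pot()

def unnormalized(x):
    # Product of the potentials on the maximal cliques.
    x1, x2, x3, x4, x5, x6, x7 = x
    return (phi123[(x1, x2, x3)] * phi234[(x2, x3, x4)]
            * phi456[(x4, x5, x6)] * phi567[(x5, x6, x7)])

# Normalization constant Z: sum of the potential product over all 2^7 states.
states = list(itertools.product(B, repeat=7))
Z = sum(unnormalized(x) for x in states)

def p_joint(x):
    return unnormalized(x) / Z

# The resulting joint sums to 1 by construction.
print(abs(sum(p_joint(x) for x in states) - 1.0) < 1e-12)
```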

Then the book says that

\begin{align} p(1,7\mid 4)&\propto \sum_{2,3,5,6}p(1,2,3,4,5,6,7)\\ &=\sum_{2,3,5,6}\phi(1,2,3)\phi(2,3,4)\phi(4,5,6)\phi(5,6,7)\\ &= \left(\sum_{2,3}\phi(1,2,3)\phi(2,3,4)\right)\left(\sum_{5,6}\phi(4,5,6)\phi(5,6,7)\right).\end{align}

All this is clear from marginalization, the definition of a Markov network, and rewriting a sum of products as a product of sums.
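The sum-of-products-to-product-of-sums step can be checked numerically. The sketch below uses made-up random potentials over binary variables (my own stand-ins, not the book's) and fixed values of $x_1$, $x_4$, $x_7$, and confirms that summing the full product over $x_2, x_3, x_5, x_6$ equals the product of the two partial sums.

```python
import itertools
import random

random.seed(1)
B = [0, 1]

# Hypothetical positive potentials, in the question's notation.
def rand_pot():
    return {s: random.uniform(0.5, 2.0) for s in itertools.product(B, repeat=3)}

phi123, phi234, phi456, phi567 = rand_pot(), rand_pot(), rand_pot(), rand_pot()

x1, x4, x7 = 0, 1, 0  # arbitrary fixed values of the remaining variables

# Left-hand side: sum over x2, x3, x5, x6 of the full product of potentials.
lhs = sum(phi123[(x1, x2, x3)] * phi234[(x2, x3, x4)]
          * phi456[(x4, x5, x6)] * phi567[(x5, x6, x7)]
          for x2 in B for x3 in B for x5 in B for x6 in B)

# Right-hand side: product of the two partial sums, as in the book's display.
rhs = (sum(phi123[(x1, x2, x3)] * phi234[(x2, x3, x4)] for x2 in B for x3 in B)
       * sum(phi456[(x4, x5, x6)] * phi567[(x5, x6, x7)] for x5 in B for x6 in B))

print(abs(lhs - rhs) < 1e-9)
```

The factorization works because the first two potentials involve only $x_2, x_3$ among the summed variables, and the last two only $x_5, x_6$.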

What is not clear to me, and this is my question, is the book's next claim: that this implies $p(1,7\mid 4)=p(1\mid 4)p(7\mid 4)$.

I can see that $p(1,7\mid 4)$ is a product of a function of $(1,4)$ and a function of $(7,4)$, but why exactly is it equal to $p(1\mid 4)p(7\mid 4)$?

In particular, if $p(X_1,X_2\mid X_3)=f(X_1,X_3)g(X_2,X_3)$ for some functions $f$ and $g$, does this imply that $X_1$ and $X_2$ are conditionally independent given $X_3$? (This is true in the unconditional case.)


There are 2 answers below.

Answer 1:

Yes, this is generally true.

If we assume $$p(X_1,X_2\mid X_3)=k\, f(X_1,X_3)g(X_2,X_3)$$ for some constant $k$ (which may depend on $X_3$) and positive functions $f$ and $g$, then the normalisation constraint requires that (for discrete variables)

$$k= \frac{1}{\sum_{X_1,X_2} f(X_1,X_3)g(X_2,X_3)}$$ Hence

$$p(X_1,X_2\mid X_3)= \frac{f(X_1,X_3)g(X_2,X_3)}{\sum_{X_1,X_2} f(X_1,X_3)g(X_2,X_3)}$$ and, since the double sum factorizes into a product of single sums,

$$p(X_1,X_2\mid X_3)= \frac{f(X_1,X_3)g(X_2,X_3)}{\left(\sum_{X_1} f(X_1,X_3)\right)\left(\sum_{X_2} g(X_2,X_3)\right)}$$ This can be written

$$p(X_1,X_2\mid X_3)= \frac{f(X_1,X_3)}{\sum_{X_1} f(X_1,X_3)}\,\frac{g(X_2,X_3)}{\sum_{X_2} g(X_2,X_3)}=p(X_1\mid X_3)\,p(X_2\mid X_3)$$
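This argument can be verified numerically. The sketch below uses hypothetical random positive tables for $f$ and $g$ over binary variables (my own stand-ins), normalizes their product for each value of $x_3$, and checks that the resulting conditional equals the product of its conditional marginals.

```python
import random

random.seed(2)
B = [0, 1]

# Hypothetical positive functions f(x1, x3) and g(x2, x3).
f = {(x1, x3): random.uniform(0.5, 2.0) for x1 in B for x3 in B}
g = {(x2, x3): random.uniform(0.5, 2.0) for x2 in B for x3 in B}

results = []
for x3 in B:
    # Normalize f*g to obtain p(x1, x2 | x3).
    Z = sum(f[(x1, x3)] * g[(x2, x3)] for x1 in B for x2 in B)
    p = {(x1, x2): f[(x1, x3)] * g[(x2, x3)] / Z for x1 in B for x2 in B}
    # Conditional marginals p(x1 | x3) and p(x2 | x3).
    p1 = {x1: sum(p[(x1, x2)] for x2 in B) for x1 in B}
    p2 = {x2: sum(p[(x1, x2)] for x1 in B) for x2 in B}
    # The joint conditional equals the product of its marginals.
    results.append(all(abs(p[(x1, x2)] - p1[x1] * p2[x2]) < 1e-12
                       for x1 in B for x2 in B))

print(all(results))
```

Because the check runs separately for each value of $x_3$, it mirrors the pointwise argument of the second answer as well.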

Answer 2:

You mentioned that this is true in the unconditional case.

I just want to mention how the conditional case easily follows from the unconditional case.

If we condition on the event $X_3 = x_3$ (where $x_3$ is one of the values that $X_3$ can take), we obtain $$p(X_1, X_2 \mid X_3 = x_3) = f(X_1, x_3)\, g(X_2, x_3)$$ In words, the conditional distribution of $(X_1, X_2)$ given the event $X_3 = x_3$ is a product of a function of $X_1$ and a function of $X_2$ (of course, these functions themselves depend on $x_3$). Therefore $X_1$ and $X_2$ are independent conditioned on the event $X_3 = x_3$. Since $x_3$ was arbitrary, the claim follows.