Likelihood function for a Bayesian network


I'm in a pre-rigorous phase, but as a programmer I need to understand certain things on a case-by-case basis in order to do very particular things.

Therefore, I was hoping for an explanation of why the following statement, found in a book about Bayesian networks, is true:

Let $\mathcal{G}$ be a Bayesian network over $X_1, ..., X_n$. We say that a distribution $P_\beta$ over the same space factorizes according to $\mathcal{G}$ if $P_\beta$ can be expressed as a product:

$$P_\beta (X_1, ..., X_n) = \prod\limits_{i=1}^n P(X_i \mid \mathbf{Pa}_{X_i})$$

where $\mathbf{Pa}_{X_i}$ denotes the parents of $X_i$ in $\mathcal{G}$.

I was wondering where the product formula for $P_\beta$ comes from. Does it follow directly from the chain rule for probability? If so, could you please show a derivation?


There are 2 answers below.


Pearl is saying that if the distribution factorizes in this way, then $\mathcal{G}$ is a possible structure for it.
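As for the derivation the question asks about: yes, it follows from the chain rule, assuming $X_1, \dots, X_n$ are listed in a topological order of $\mathcal{G}$ (every parent of $X_i$ appears before $X_i$). By the chain rule,

$$P(X_1, \dots, X_n) = \prod\limits_{i=1}^n P(X_i \mid X_1, \dots, X_{i-1}),$$

and the local Markov property of the network says each $X_i$ is conditionally independent of its non-descendants given $\mathbf{Pa}_{X_i}$, so each factor can drop the non-parent predecessors:

$$P(X_1, \dots, X_n) = \prod\limits_{i=1}^n P(X_i \mid \mathbf{Pa}_{X_i}).$$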


The decomposition into the product expresses that each $X_i$ is conditionally independent of its non-descendants (in particular, of its non-parent predecessors in a topological ordering) given the parents of $X_i$ in the Bayesian network.

As a basic example, consider the Bayesian network consisting of four nodes: $X_1 \rightarrow X_2 \rightarrow X_3 \rightarrow X_4.$

Then, $$P(X_1, X_2, X_3, X_4) = P(X_1)P(X_2 | X_1)P(X_3|X_1,X_2)P(X_4|X_1,X_2,X_3)$$ $$= P(X_1)P(X_2 | X_1)P(X_3|X_2)P(X_4|X_3).$$

Why is each node conditionally independent of its non-descendants given its parents? This is the local Markov property, which is part of the definition of a Bayesian network.
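The four-node chain above can be checked numerically. A minimal sketch with NumPy (the binary state space, random CPTs, and all variable names here are my own illustrative choices, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_cpt(shape):
    """Random conditional probability table, normalized over the last axis."""
    t = rng.random(shape)
    return t / t.sum(axis=-1, keepdims=True)

# CPTs for the chain X1 -> X2 -> X3 -> X4 (each variable binary)
p1 = random_cpt((2,))     # P(X1)
p2 = random_cpt((2, 2))   # P(X2 | X1), indexed [x1, x2]
p3 = random_cpt((2, 2))   # P(X3 | X2), indexed [x2, x3]
p4 = random_cpt((2, 2))   # P(X4 | X3), indexed [x3, x4]

# Joint via the factorization P(x1) P(x2|x1) P(x3|x2) P(x4|x3)
joint = np.einsum('a,ab,bc,cd->abcd', p1, p2, p3, p4)
assert np.isclose(joint.sum(), 1.0)  # it really is a distribution

# Check the conditional independence X3 ⊥ X1 | X2:
# P(x3 | x1, x2) should equal P(x3 | x2) for every (x1, x2, x3).
p123 = joint.sum(axis=3)               # P(x1, x2, x3)
p12 = p123.sum(axis=2, keepdims=True)  # P(x1, x2)
cond_x3_given_x1x2 = p123 / p12        # P(x3 | x1, x2)
print(np.allclose(cond_x3_given_x1x2, p3[np.newaxis, :, :]))  # True
```

Conditioning $X_3$ on both $X_1$ and $X_2$ recovers exactly the table $P(X_3 \mid X_2)$, which is the independence the factorization encodes.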