Bayes' Theorem and Independence


I'm deriving Probabilistic latent semantic analysis model. In the model, documents $d$ and words $w$ are observed. $$\begin{align} \Pr(d,w) &= \sum_{c} \Pr (d,w,c) \\ &= \sum_{c} \Pr(d) \Pr(w,c|d)\\ &= \Pr(d) \sum_{c} \frac{\Pr(w,c,d)}{\Pr(d)} \\ &= \Pr(d) \sum_{c} \Pr(w|c,d) \Pr(c|d) \end{align}$$

I used Bayes' theorem from the second line to the third. However, Wikipedia gives $\Pr(d) \sum_{c} \Pr(w|c) \Pr(c|d)$. Is this because $d$ and $w$ are independent? If so, how can I read off that independence from the graphical model (plate notation)?

There are 2 answers below.

Best answer:

You should try to provide more information about what exactly the random variables are (the Wikipedia article is also sparse on details). From what Wikipedia says, $d$ and $w$ are assumed to be conditionally independent given $c$. This means that $$\Pr(d,w,c) = \Pr(c)\Pr(d|c)\Pr(w|c).$$ Now applying Bayes' theorem, this quantity equals $\Pr(d)\Pr(c|d)\Pr(w|c)$.

You could also continue your line of thought, because conditional independence is equivalent to saying that $\Pr(w|c,d) = \Pr(w|c)$.
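The equivalence $\Pr(w|c,d) = \Pr(w|c)$ can be checked numerically. The sketch below (the topic/document/word probability tables are made-up toy values, not from the article) builds a joint $\Pr(d,w,c)$ under the conditional-independence assumption and verifies that conditioning on $d$ in addition to $c$ does not change the distribution of $w$:

```python
from itertools import product

# Toy distributions (assumed, for illustration): 2 topics c, 2 docs d, 2 words w.
p_c = {0: 0.4, 1: 0.6}
p_d_given_c = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_w_given_c = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}

# Joint built with the conditional-independence assumption:
#   Pr(d,w,c) = Pr(c) Pr(d|c) Pr(w|c)
joint = {(d, w, c): p_c[c] * p_d_given_c[c][d] * p_w_given_c[c][w]
         for d, w, c in product([0, 1], repeat=3)}

# Verify Pr(w|c,d) == Pr(w|c) for every combination.
for c in [0, 1]:
    for d in [0, 1]:
        p_dc = sum(joint[(d, w, c)] for w in [0, 1])  # Pr(d,c)
        for w in [0, 1]:
            p_w_given_cd = joint[(d, w, c)] / p_dc    # Pr(w|c,d)
            assert abs(p_w_given_cd - p_w_given_c[c][w]) < 1e-12
```

The assertions pass precisely because the joint was constructed from the factorization $\Pr(c)\Pr(d|c)\Pr(w|c)$; for a joint that does not factor this way, $\Pr(w|c,d)$ would generally differ from $\Pr(w|c)$.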

Another answer:

The third line is redundant (you have essentially just written $A = D(A/D)$) and confusing. Why bother with it?

Instead immediately skip to the last line by just applying the definition of conditional probability.

$$\begin{align} \Pr(d,w) &= \sum_{c} \Pr (d,w,c) \\ &= \sum_{c} \Pr(d) \Pr(w,c\mid d)\\ &= \Pr(d) \sum_{c} \Pr(w\mid c,d) \Pr(c\mid d) \end{align}$$

Anyhow, to say that $\Pr(w\mid c,d)=\Pr(w\mid c)$ is to assert that $w,d$ are conditionally independent for any given $c$. That is indeed the model, as the article stated:

Considering observations in the form of co-occurrences $(w,d)$ of words and documents, PLSA models the probability of each co-occurrence as a mixture of conditionally independent multinomial distributions:

Conditional independence of $w$ and $d$ given $c$ means: $\Pr(w,d\mid c)=\Pr(w\mid c)\Pr(d\mid c)$

So we have:

$$\begin{align} \Pr(d,w) &= \sum_{c} \Pr (d,w,c) && \text{Law of Total Probability} \\ &= \sum_{c} \Pr(c) \Pr(d,w\mid c) && \text{defn. Conditional Probability}\\ &= \sum_{c} \Pr(c) \Pr(d\mid c) \Pr(w\mid c) && \text{conditional independence} \\ &= \sum_c \Pr(c\mid d)\Pr(d)\Pr(w\mid c) && \text{Bayes' Rule}\\ &= \Pr(d)\sum_c \Pr(w\mid c)\Pr(c\mid d) \end{align}$$
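This chain of equalities can also be verified numerically. A minimal sketch (the table sizes and Dirichlet-sampled parameters are assumptions for illustration, not part of the PLSA article) computes $\Pr(d,w)$ both in the symmetric form $\sum_c \Pr(c)\Pr(d|c)\Pr(w|c)$ and the asymmetric form $\Pr(d)\sum_c \Pr(w|c)\Pr(c|d)$, and checks they agree:

```python
import numpy as np

rng = np.random.default_rng(0)
nc, nd, nw = 3, 4, 5  # number of latent topics c, documents d, words w

# Model parameters: Pr(c), Pr(d|c), Pr(w|c); each conditional is a column-stochastic table.
p_c = rng.dirichlet(np.ones(nc))                      # shape (nc,)
p_d_given_c = rng.dirichlet(np.ones(nd), size=nc).T   # shape (nd, nc)
p_w_given_c = rng.dirichlet(np.ones(nw), size=nc).T   # shape (nw, nc)

# Symmetric form: Pr(d,w) = sum_c Pr(c) Pr(d|c) Pr(w|c)
joint_sym = np.einsum('c,dc,wc->dw', p_c, p_d_given_c, p_w_given_c)

# Asymmetric form: Pr(d,w) = Pr(d) sum_c Pr(w|c) Pr(c|d)
p_d = p_d_given_c @ p_c                               # Pr(d) = sum_c Pr(d|c) Pr(c)
p_c_given_d = (p_d_given_c * p_c) / p_d[:, None]      # Bayes' rule, shape (nd, nc)
joint_asym = p_d[:, None] * (p_c_given_d @ p_w_given_c.T)

assert np.allclose(joint_sym, joint_asym)   # the two factorizations agree
assert np.isclose(joint_sym.sum(), 1.0)     # Pr(d,w) is a proper distribution
```

The agreement holds for any valid choice of $\Pr(c)$, $\Pr(d|c)$, $\Pr(w|c)$, since the two forms differ only by the Bayes'-rule substitution $\Pr(c)\Pr(d|c) = \Pr(d)\Pr(c|d)$.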

In this model, $c$ is a latent (unobserved) variable.