Applying the chain rule (probability) with three variables


We're currently implementing IBM Model 1 in my course on statistical machine translation, and I'm struggling with the following application of the chain rule. When applying the model to the data, we need to compute the probabilities of different alignments given a sentence pair in the data. In other words, we need to compute $\Pr(a\mid e,f)$, the probability of an alignment given the English and foreign sentences.

Why do I end up with

$$ \Pr(a\mid e,f) = \frac{\Pr( e,a \mid f )}{\Pr( e \mid f )} $$

when applying the chain rule, which would be

$$ \Pr(A,B,C) = \Pr(A)\Pr(B \mid A)\Pr (C \mid B,A) \, ? $$


There are 2 best solutions below

Best answer

As you mentioned, the chain rule states that

$$ \Pr(A,B,C) = \Pr(A)\Pr(B \mid A)\Pr(C \mid B,A). $$

Dividing both sides by $\Pr(A)$ (assuming $\Pr(A) > 0$) gives

$$ \frac{\Pr(A,B,C)}{\Pr(A)} = \Pr(B \mid A)\Pr(C \mid B,A). $$

The left-hand side is nothing other than the conditional probability of $B,C$ given $A$, by the definition of conditional probability:

$$ \frac{\Pr(A,B,C)}{\Pr(A)} = \Pr(B,C \mid A). $$

Replacing the left-hand side, we get

$$ \Pr(B,C \mid A) = \Pr(B \mid A)\Pr(C \mid B,A), $$

from which it follows that

$$ \Pr(C \mid B,A) = \frac{\Pr(B,C \mid A)}{\Pr(B \mid A)}. $$

Now change $A$ to $f$ for foreign, $B$ to $e$ for English, and $C$ to $a$ for alignment, and you get the desired formula:

$$ \Pr(a \mid e,f) = \frac{\Pr(e,a \mid f)}{\Pr(e \mid f)}. $$
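If it helps to see the identity hold numerically, here is a minimal Python sketch that checks it on a small random joint distribution. The outcome labels (`f1`, `e1`, `a1`, ...) and the helper names `prob` and `max_identity_gap` are made up for illustration and have nothing to do with a real translation model.

```python
import itertools
import random

random.seed(0)

# Hypothetical toy outcome spaces for f (foreign), e (English), a (alignment).
F = ["f1", "f2"]
E = ["e1", "e2", "e3"]
A = ["a1", "a2"]

# Build an arbitrary strictly positive joint distribution Pr(f, e, a).
weights = {k: random.random() + 0.01 for k in itertools.product(F, E, A)}
total = sum(weights.values())
joint = {k: w / total for k, w in weights.items()}


def prob(f=None, e=None, a=None):
    """Marginal probability of the given values; None means 'summed out'."""
    return sum(
        p
        for (fi, ei, ai), p in joint.items()
        if (f is None or fi == f)
        and (e is None or ei == e)
        and (a is None or ai == a)
    )


def max_identity_gap():
    """Largest |Pr(a|e,f) - Pr(e,a|f)/Pr(e|f)| over all outcomes."""
    gap = 0.0
    for f, e, a in joint:
        lhs = prob(f=f, e=e, a=a) / prob(f=f, e=e)        # Pr(a | e, f)
        p_ea_given_f = prob(f=f, e=e, a=a) / prob(f=f)    # Pr(e, a | f)
        p_e_given_f = prob(f=f, e=e) / prob(f=f)          # Pr(e | f)
        gap = max(gap, abs(lhs - p_ea_given_f / p_e_given_f))
    return gap


print(max_identity_gap())  # zero up to floating-point rounding
```

The factor $\Pr(f)$ cancels in the ratio on the right-hand side, which is why the two sides agree for every outcome.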


Hint: $\Pr(a\mid e,f) = \frac{\Pr( a,e, f )}{\Pr( e, f )}$ and $\frac{\Pr( e,a \mid f )}{\Pr( e \mid f )} = \frac{\Pr( e,a, f )}{\Pr( e, f )} \cdot \frac{\Pr( f )}{\Pr( f )}$. This is just the definition of conditional probability, the same identity that underlies Bayes' rule.
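For the Model 1 setting in the question, the formula can be used by brute force on a tiny sentence pair, as in the rough sketch below. It assumes the usual Model 1 form, where $\Pr(e,a\mid f)$ is proportional to $\prod_j t(e_j \mid f_{a_j})$, so the constant factor cancels, and it uses $\Pr(e \mid f) = \sum_a \Pr(e,a \mid f)$. The translation table `t`, its values, and the function name `alignment_posteriors` are hypothetical.

```python
import itertools
import math

# Hypothetical translation table t(e_word | f_word); the values are made up.
t = {
    ("the", "das"): 0.7, ("house", "das"): 0.1,
    ("the", "Haus"): 0.2, ("house", "Haus"): 0.8,
    ("the", "NULL"): 0.1, ("house", "NULL"): 0.1,
}


def alignment_posteriors(e_words, f_words):
    """Return Pr(a | e, f) for every alignment a, by enumeration.

    An alignment is a tuple giving, for each English word, the index of the
    foreign word it aligns to (index 0 is the NULL word).
    """
    f_with_null = ["NULL"] + f_words
    scores = {}
    for a in itertools.product(range(len(f_with_null)), repeat=len(e_words)):
        # Unnormalised Pr(e, a | f): product of word translation probabilities.
        scores[a] = math.prod(t[(e, f_with_null[i])] for e, i in zip(e_words, a))
    z = sum(scores.values())  # proportional to Pr(e | f), the sum over alignments
    return {a: s / z for a, s in scores.items()}


post = alignment_posteriors(["the", "house"], ["das", "Haus"])
best = max(post, key=post.get)
print(best, post[best])
```

Enumerating all alignments is only feasible for toy sentences. It is used here purely to make the numerator and denominator of the formula explicit.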