Difference between conditional and joint probability in Bayes' rule

I have the following conditional probability to work out (each variable is binary, i.e. either $0$ or $1$):

$P(h_2 = 1 | v=1,h_1=0)$

So if we define $A \equiv h_2=1 $ and $ B \equiv v=1,h_1=0 $, I'm looking at $P(A|B)$.

Using Bayes' theorem, we have

$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$

Plugging in the values, we have:

$P(h_2 = 1 | v=1,h_1=0) = \frac{P(v=1,h_1=0|h_2=1)P(h_2=1)}{P(v=1,h_1=0)}$

and if we expand the denominator over all possible values of $h_2$ (law of total probability):

$P(h_2 = 1 | v=1,h_1=0) = \frac{P(v=1,h_1=0|h_2=1)P(h_2=1)}{P(v=1,h_1=0|h_2=0)P(h_2=0) + P(v=1,h_1=0|h_2=1)P(h_2=1)}$ (equation 1)


The thing is, this is part of a Coursera course (Neural Networks w/ Geoff Hinton, Week 13 Quiz, ex 6), where I was told this:

$P(h_2 = 1 | v=1,h_1=0) = \frac{P(v=1|h_1=0,h_2=1)P(h_2=1)}{P(v=1|h_1=0,h_2=0)P(h_2=0) + P(v=1|h_1=0,h_2=1)P(h_2=1)}$ (equation 2)

which is not quite what I worked out myself: in equation 2, $h_1=0$ has been moved to the right of the conditioning bar "|" inside each likelihood term.
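To make the difference precise, the product rule (a standard identity, not something specific to the course) relates the two kinds of likelihood terms:

$$P(v=1, h_1=0 \mid h_2) = P(v=1 \mid h_1=0, h_2)\, P(h_1=0 \mid h_2)$$

so each term of equation 1 carries an extra factor $P(h_1=0 \mid h_2)$ compared to the corresponding term of equation 2, and that factor appears to depend on the value of $h_2$.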

What's the rationale behind this? Is there some equivalence rule relating equations 1 and 2 that I'm not aware of?
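As a quick numerical sanity check, the following sketch (my own, not from the course; the joint distribution is an arbitrary random table, purely illustrative) evaluates both equations and the conditional probability computed directly from its definition:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary joint distribution P(v, h1, h2) over three binary variables,
# stored as a 2x2x2 array indexed [v, h1, h2] and normalized to sum to 1.
joint = rng.random((2, 2, 2))
joint /= joint.sum()

def p(v=None, h1=None, h2=None):
    """Joint/marginal probability with any subset of the variables fixed."""
    idx = (slice(None) if v is None else v,
           slice(None) if h1 is None else h1,
           slice(None) if h2 is None else h2)
    return joint[idx].sum()

# Direct definition of the conditional we want.
direct = p(v=1, h1=0, h2=1) / p(v=1, h1=0)

# Equation 1: Bayes' rule with the joint likelihood P(v=1, h1=0 | h2).
lik1 = lambda k: p(v=1, h1=0, h2=k) / p(h2=k)
eq1 = lik1(1) * p(h2=1) / sum(lik1(k) * p(h2=k) for k in (0, 1))

# Equation 2: the quiz's form, with the likelihood P(v=1 | h1=0, h2).
lik2 = lambda k: p(v=1, h1=0, h2=k) / p(h1=0, h2=k)
eq2 = lik2(1) * p(h2=1) / sum(lik2(k) * p(h2=k) for k in (0, 1))

# Equation 1 reproduces the definition exactly (the P(h2=k) factors cancel);
# equation 2 need not agree for a generic joint distribution.
print(direct, eq1, eq2)
```

On this generic joint the two formulas give different numbers, which is why I suspect equation 2 relies on some structure of the model in the quiz rather than on a general identity.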

Thanks.