I have the following conditional probability to evaluate (each variable is binary, i.e. takes the value 0 or 1):
$P(h_2 = 1 | v=1,h_1=0)$
So if we define $A \equiv h_2=1$ and $B \equiv (v=1, h_1=0)$, I'm looking for $P(A|B)$.
Using Bayes' theorem, we have
$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$
Plugging in the values, we have:
$P(h_2 = 1 | v=1,h_1=0) = \frac{P(v=1,h_1=0|h_2=1)P(h_2=1)}{P(v=1,h_1=0)}$
and if we expand the denominator over all possible values of $h_2$ using the law of total probability:
$P(h_2 = 1 | v=1,h_1=0) = \frac{P(v=1,h_1=0|h_2=1)P(h_2=1)}{P(v=1,h_1=0|h_2=0)P(h_2=0) + P(v=1,h_1=0|h_2=1)P(h_2=1)}$ (equation 1)
The thing is, this is part of a Coursera course (Neural Networks w/ Geoff Hinton, Week 13 Quiz, ex 6), where I was told this:
$P(h_2 = 1 | v=1,h_1=0) = \frac{P(v=1|h_1=0,h_2=1)P(h_2=1)}{P(v=1|h_1=0,h_2=0)P(h_2=0) + P(v=1|h_1=0,h_2=1)P(h_2=1)}$ (equation 2)
which is not quite what I worked out myself: in each term of equation 2, $h_1=0$ sits to the right of the conditioning bar "|", whereas in my equation 1 it sits to the left.
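My own attempt at reconciling the two: by the chain rule, each conditional in my denominator factors as

$P(v=1, h_1=0 \mid h_2) = P(v=1 \mid h_1=0, h_2)\,P(h_1=0 \mid h_2)$

so equation 1 seems to carry an extra factor $P(h_1=0 \mid h_2)$ in each term compared to equation 2, and I don't see why that factor should disappear in general.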
What's the rationale behind this? Is there some sort of equivalence rule between equations 1 & 2 that I'm not aware of?
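As a sanity check, I compared the two formulas numerically on an arbitrary made-up joint distribution over $(v, h_1, h_2)$ (just weights 1 through 8, not the model from the quiz):

```python
from itertools import product

# Arbitrary made-up unnormalized joint over (v, h1, h2); NOT the quiz's model.
weights = dict(zip(product([0, 1], repeat=3),
                   [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
Z = sum(weights.values())
P = {k: w / Z for k, w in weights.items()}  # normalized joint P(v, h1, h2)

def p(v=None, h1=None, h2=None):
    """Marginal/joint probability, summing out any unspecified variable."""
    return sum(pr for (vv, hh1, hh2), pr in P.items()
               if (v is None or vv == v)
               and (h1 is None or hh1 == h1)
               and (h2 is None or hh2 == h2))

# Ground truth: exact conditional P(h2=1 | v=1, h1=0) from the joint.
exact = p(v=1, h1=0, h2=1) / p(v=1, h1=0)

# Equation 1: terms of the form P(v=1, h1=0 | h2).
def joint_given_h2(h2):
    return p(v=1, h1=0, h2=h2) / p(h2=h2)

eq1 = (joint_given_h2(1) * p(h2=1)) / (
    joint_given_h2(0) * p(h2=0) + joint_given_h2(1) * p(h2=1))

# Equation 2: terms of the form P(v=1 | h1=0, h2) instead.
def v_given(h2):
    return p(v=1, h1=0, h2=h2) / p(h1=0, h2=h2)

eq2 = (v_given(1) * p(h2=1)) / (
    v_given(0) * p(h2=0) + v_given(1) * p(h2=1))

print(exact, eq1, eq2)  # exact == eq1 = 6/11, but eq2 = 9/17
```

Equation 1 reproduces the exact conditional (it's just Bayes' theorem), but equation 2 gives a different number on this joint, so the two can't be equivalent for an arbitrary distribution — which makes me suspect the course is using some property of the specific model.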
Thanks.