I'm studying for an exam on Algorithms and Inference and I can't make sense of a mathematical step in a proof.
The theorem states:
Let $P(\mathbf{x}) = \frac{1}{Z}\prod_a \Psi_a (\mathbf{x}_a)$ be a factorized distribution, where the bipartite factor graph of factors $a$ and variables $i$ is a tree. Then
$$P(\mathbf{x}) = \prod_a P_a(\mathbf{x}_a) \prod_i P(x_i)^{1-n_i}= \prod_a \frac{P_a(\mathbf{x}_a)}{\prod_{i \in \partial a}P(x_i)} \prod_i P(x_i)$$
where $n_i$ is the degree of vertex i.
$P(x_i)$ is the marginal distribution of $x_i$
$P_a(\mathbf{x}_a)=\frac{1}{Z} \Psi_a (\mathbf{x}_a)$
The proof goes like this:
The result is valid for a tree with one vertex. In this case the product over $i$ is equal to 1 and the product over $a$ has only one term.
We proceed by induction on the number of nodes in the factor graph. If $x_i$ is a leaf (connected to $b$), $n_i = 1$, then:
\begin{align}P(\mathbf{x}\setminus\{x_i\}) &= \frac{1}{Z} \sum_{x_i}\prod_a \Psi_a (\mathbf{x}_a)\\ &= \frac{1}{Z} \left ( \sum_{x_i} \Psi_b (\mathbf{x}_b)\right)\prod_{a \neq b} \Psi_a (\mathbf{x}_a)\\ &= \prod_{a \neq b} P(\mathbf{x}_a) P(\mathbf{x}_b \setminus \{x_i \})\prod_{j \neq i} P(\mathbf{x}_j)^{1-n_j} \end{align}
Now, the first passage is clear: isolate the function which depends on $x_i$. The last passage is justified by "induction hypothesis" in my notes, but I don't understand it. The proof ends using some identities on conditional probabilities to get the statement of the theorem.
So to make the question clear: how do I go from
$$\frac{1}{Z} \left ( \sum_{x_i} \Psi_b (\mathbf{x}_b)\right)\prod_{a \neq b} \Psi_a (\mathbf{x}_a)$$
to
$$\prod_{a \neq b} P(\mathbf{x}_a) P(\mathbf{x}_b \setminus \{x_i \})\prod_{j \neq i} P(\mathbf{x}_j)^{1-n_j}$$ ?