I am referring to these lecture slides on EM estimation of GMMs. In particular, I am confused by the steps on slides 13 and 14.
If we have an $N$-component GMM (defined by parameters $\theta$), the likelihood of an observation $x_i$ is given as
$p(x_i \vert \theta) = \sum _{j=1}^N P(j \vert \theta) \cdot p(x_i \vert j, \theta)$
The first term in each product on the RHS is the mixture weight; the second term is the likelihood under the individual $j$-th Gaussian.
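To make sure I am reading this formula correctly, here is a small numeric sketch I wrote (all parameter values are made up; a 1-D GMM with $N = 2$ for simplicity):

```python
import math

# Hypothetical 1-D GMM with N = 2 components (all numbers are made up):
weights = [0.3, 0.7]   # mixture weights P(j | theta)
means = [0.0, 4.0]
stds = [1.0, 2.0]

def gauss_pdf(x, mu, sigma):
    """Density of a univariate Gaussian: p(x | j, theta)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_likelihood(x):
    """p(x | theta) = sum_j P(j | theta) * p(x | j, theta)."""
    return sum(w * gauss_pdf(x, mu, s) for w, mu, s in zip(weights, means, stds))

print(mixture_likelihood(1.0))
```

So, as I understand it, the likelihood of each point is just the weighted sum of the per-component Gaussian densities.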
Then they introduce a hidden variable $Q$ that describes which Gaussian generated the sample point.
There is also an indicator variable defined as
$z_{i,j} = 1$ : If $x_i$ came from Gaussian $j$
$z_{i,j} = 0$ : otherwise
I do not see how they get from here to the following:
$p(x_i,Q \vert \theta) = P(j \vert \theta)^{z_{i,j}} \cdot p(x_i \vert j, \theta)^{z_{i,j}}$
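For reference, my reading of the indicator-exponent notation is that, since $z_{i,j}$ is 1 for exactly one $j$, raising each factor to $z_{i,j}$ selects the factors of the generating component (anything to the power 0 equals 1). A small sketch of that reading (all numbers are made up):

```python
import math

weights = [0.3, 0.7]     # P(j | theta), made-up numbers
densities = [0.24, 0.06]  # p(x_i | j, theta) at some fixed x_i, made up

# Suppose x_i came from Gaussian j = 0, so the indicators are z_i = (1, 0).
z = [1, 0]

# Raising each factor to z_{i,j} keeps only the chosen component's terms,
# since a**0 == 1 and a**1 == a:
joint = math.prod((weights[j] * densities[j]) ** z[j] for j in range(2))

print(joint)  # equals weights[0] * densities[0]
```

That part of the notation I follow; my confusion is with the derivation itself, below.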
My reasoning is as follows -
$p(x_i,Q \vert \theta) = p(x_i \vert Q, \theta) \cdot p(Q \vert \theta)$
For the first term on the RHS, it can be seen (because if we know $Q$, we know which Gaussian the point came from) that
$p(x_i \vert Q , \theta) = p(x_i \vert j, \theta)^{z_{i,j}}$
How should I interpret the second term $p(Q \vert \theta)$? And how can one show that it equals the mixture weight of the $j$-th Gaussian raised to the indicator, i.e. $P(j \vert \theta)^{z_{i,j}}$?