Probability Formula for Posterior With 3 Variables


First post on math.stackexchange; pardon me if this is naive/a repeat.

I'm following these lecture notes by Prof. David M. Blei: http://www.cs.princeton.edu/courses/archive/fall11/cos597C/lectures/variational-inference-i.pdf

In these notes, I don't quite understand how he arrives at the posterior distribution formula for p(u, z | x) in the "Motivation" section, bullet number 3.

I'm trying to think in terms of vectors. I know that the formula makes an independence assumption between the individual components of the vectors x and z (hence the products over the subscript i). I also know that the denominator is p(x), since u and z are being "integrated out": u is a continuous parameter and z is discrete, hence the integral and the summation in the denominator, respectively.
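To check my reading of the denominator, here is the marginal likelihood I think is meant, written out explicitly (my own sketch based on the standard Bayesian mixture set-up, not copied verbatim from the notes):

```latex
p(x_{1:n}) \;=\; \int_{\mu_{1:K}} \sum_{z_{1:n}}
  p(\mu_{1:K}) \prod_{i=1}^{n} p(z_i)\, p(x_i \mid z_i, \mu_{1:K})
  \;\mathrm{d}\mu_{1:K}
```

That is, the same joint as in the numerator, with the continuous parameter integrated and the discrete latents summed away.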

But let's focus only on the numerator right now. The first term is obviously p(u) (or is it?). The second term is what I'm not able to understand: is it some form of p(x, z | u)?
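To make the question concrete, here is a tiny numerical sketch of how I currently read the numerator, i.e. p(u) ∏_i p(z_i) p(x_i | z_i, u), for a toy two-component mixture. All the prior/likelihood choices below (zero-mean Gaussian prior on the means, uniform prior on assignments, unit likelihood variance) are my own illustrative assumptions, not taken from the notes:

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def joint_numerator(mu, z, x, prior_var=10.0, lik_var=1.0, K=2):
    """Unnormalized posterior p(mu, z | x) for a toy K-component
    Bayesian mixture of Gaussians (hypothetical set-up for illustration).

    mu : list of K component means (the continuous parameter u)
    z  : list of n component assignments in {0, ..., K-1} (discrete latents)
    x  : list of n observations
    """
    # p(u): independent N(0, prior_var) prior on each component mean
    val = 1.0
    for mu_k in mu:
        val *= normal_pdf(mu_k, 0.0, prior_var)
    # prod_i p(z_i) * p(x_i | z_i, u): uniform prior over assignments,
    # Gaussian likelihood centred at the assigned component's mean
    for z_i, x_i in zip(z, x):
        val *= (1.0 / K) * normal_pdf(x_i, mu[z_i], lik_var)
    return val

# Assigning each point to the nearby component gives a larger
# (unnormalized) posterior value than the swapped assignment:
good = joint_numerator([0.0, 5.0], [0, 1], [0.1, 5.2])
bad = joint_numerator([0.0, 5.0], [1, 0], [0.1, 5.2])
```

Dividing such a numerator by the sum/integral over all (mu, z) would give the normalized posterior; it is only the form of the second factor, p(x_i | z_i, u) vs. some p(x, z | u), that I'm unsure about.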

Another question: in the "Set up" section, bullet 2, the posterior is defined as p(z | x, α). Note how α is on the right-hand side of the "|". Now, in the "Motivation" section, bullet 3, the posterior is defined as p(µ_{1:K}, z_{1:n} | x_{1:n}). A fair assumption (IMO) is that α = µ_{1:K}, i.e. the parameters of this model. How did α jump to the left-hand side of the "|" in this definition of the posterior, then?
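Writing the two objects side by side, the notational puzzle I'm asking about is (the hyperparameter symbol λ below is my own, introduced just for illustration):

```latex
\text{Set up, bullet 2:}\quad p(z_{1:n} \mid x_{1:n}, \alpha)
\qquad\text{vs.}\qquad
\text{Motivation, bullet 3:}\quad p(\mu_{1:K}, z_{1:n} \mid x_{1:n})
```

My best guess is that in the second case µ_{1:K} is itself random with a prior, say p(µ_{1:K} | λ) for fixed hyperparameters λ, and it is those fixed λ (rather than µ_{1:K}) that would play the role of α on the right of the bar, but I'd like confirmation.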

Note: The notation used in the above question is as follows:

u : µ, the parameters of the assumed Gaussian distribution.

x : a vector of variables, say x_i, where i = 1:N

z : the latent variables associated with each x_i above; this is also a vector, of dimension M.

Thanks, and any/all help is appreciated!