How to minimize the KL divergence with respect to fixed parameters?


I have read the LDA paper multiple times, but I'm having trouble with the following. Let's say I define an LDA model as:

  • For each doc $m$:
    • Sample topic probabilities $\theta_m \sim Dirichlet(\alpha)$
    • For each word $n$:
      • Sample a topic $z_{mn} \sim Multinomial(\theta_m)$
      • Sample a word $w_{mn} \sim Multinomial(\beta_{z_{mn}})$, i.e. conditional on the sampled topic

where $\alpha, \beta$ are fixed hyperparameters.
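The generative process above can be sketched as follows. The dimensions `K` and `V` and the specific hyperparameter values are made up for illustration; in the model, $\beta$ is a $K \times V$ matrix of per-topic word distributions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and hyperparameters, for illustration only.
K, V = 4, 50                               # number of topics, vocabulary size
alpha = np.full(K, 0.1)                    # Dirichlet prior over topic proportions
beta = rng.dirichlet(np.ones(V), size=K)   # K x V per-topic word distributions

def generate_doc(n_words):
    """Generate one document under the LDA generative process."""
    theta = rng.dirichlet(alpha)           # theta_m ~ Dirichlet(alpha)
    words = []
    for _ in range(n_words):
        z = rng.choice(K, p=theta)         # z_mn ~ Multinomial(theta_m)
        w = rng.choice(V, p=beta[z])       # w_mn ~ Multinomial(beta_{z_mn})
        words.append(w)
    return words

doc = generate_doc(20)
```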

I'd like to find a probability distribution that approximates the true posterior of the model by minimizing the KL divergence $KL\left(q_{\gamma_m, \phi_m}(\theta_m, z_m) \,\|\, p(\theta_m, z_m \mid w_m)\right)$ with respect to $\gamma_m, \phi_m$. I define the mean-field family $q_{\gamma, \phi}(\theta, z) = q_{\gamma}(\theta)\prod_n q_{\phi_n}(z_n)$, where $q_\gamma(\theta)$ is Dirichlet and each $q_{\phi_n}(z_n)$ is multinomial. I'm not even sure where to start deriving a mean-field variational inference algorithm for this model.

Best answer:

The detailed derivation can be found on page 1019 of the LDA paper (Blei, Ng, and Jordan, 2003). In short, maximizing the evidence lower bound (equivalently, minimizing the KL divergence) and setting derivatives to zero yields the coordinate-ascent updates $\phi_{ni} \propto \beta_{i w_n} \exp(\Psi(\gamma_i))$ and $\gamma_i = \alpha_i + \sum_n \phi_{ni}$, which are iterated until convergence.
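As a rough sketch (not the paper's exact code), the coordinate-ascent updates for a single document might look like the following in NumPy. The initialization follows Figure 6 of the paper; the `digamma` helper and all names are my own:

```python
import numpy as np

def digamma(x):
    """Digamma via recurrence psi(x) = psi(x+1) - 1/x plus an asymptotic
    series for large x (hypothetical helper to avoid a SciPy dependency)."""
    x = np.asarray(x, dtype=float).copy()
    r = np.zeros_like(x)
    while np.any(x < 6):
        mask = x < 6
        r[mask] -= 1.0 / x[mask]
        x[mask] += 1.0
    f = 1.0 / (x * x)
    return r + np.log(x) - 0.5 / x - f * (1/12 - f * (1/120 - f/252))

def variational_inference(words, alpha, beta, n_iter=50):
    """Mean-field coordinate ascent for one document:
       phi_{ni} proportional to beta_{i, w_n} * exp(digamma(gamma_i))
       gamma_i  = alpha_i + sum_n phi_{ni}
    words: list of word ids; alpha: (K,); beta: (K, V)."""
    K, N = len(alpha), len(words)
    gamma = alpha + N / K                       # initial gamma (phi = 1/K)
    for _ in range(n_iter):
        # Update phi: each word's responsibility over topics, in log space.
        log_phi = np.log(beta[:, words].T) + digamma(gamma)
        log_phi -= log_phi.max(axis=1, keepdims=True)
        phi = np.exp(log_phi)
        phi /= phi.sum(axis=1, keepdims=True)
        # Update gamma: variational Dirichlet parameters.
        gamma = alpha + phi.sum(axis=0)
    return gamma, phi
```

Each `phi` row is a distribution over topics for one word, and `gamma` parametrizes the approximate Dirichlet posterior over the document's topic proportions.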

It should be noted that there are alternative inference methods for topic models that do not use variational inference. For example, some people use collapsed Gibbs sampling, which is slower on large corpora but mathematically simpler to derive.
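For comparison, a minimal collapsed Gibbs sampler might look like this. This is a sketch under assumed symmetric priors (`alpha`, `eta`), not a production implementation:

```python
import numpy as np

def gibbs_lda(docs, K, V, alpha=0.1, eta=0.01, n_iter=100, seed=0):
    """Collapsed Gibbs sampler sketch for LDA.
    docs: list of lists of word ids in [0, V).
    Returns topic assignments z plus doc-topic and topic-word counts."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), K))          # doc-topic counts
    nkw = np.zeros((K, V))                  # topic-word counts
    nk = np.zeros(K)                        # total words per topic
    z = [[0] * len(d) for d in docs]
    # Random initialization of topic assignments.
    for m, d in enumerate(docs):
        for n, w in enumerate(d):
            k = int(rng.integers(K))
            z[m][n] = k
            ndk[m, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(n_iter):
        for m, d in enumerate(docs):
            for n, w in enumerate(d):
                # Remove the current assignment from the counts.
                k = z[m][n]
                ndk[m, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # p(z = k | rest) proportional to
                # (ndk + alpha) * (nkw + eta) / (nk + V * eta)
                p = (ndk[m] + alpha) * (nkw[:, w] + eta) / (nk + V * eta)
                k = int(rng.choice(K, p=p / p.sum()))
                z[m][n] = k
                ndk[m, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return z, ndk, nkw
```

The trade-off mentioned above is visible here: each sweep touches every word in the corpus, but the conditional distribution is a one-line ratio of counts rather than a variational derivation.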