I have read the LDA paper multiple times, but I'm having trouble with the following. Let's say I define an LDA model as:
- For each doc $m$:
- Sample topic probabilities $\theta_m \sim Dirichlet(\alpha)$
- For each word $n$:
- Sample a topic $z_{mn} \sim Multinomial(\theta_m)$
- Sample a word $w_{mn} \sim Multinomial(\beta_{z_{mn}})$, i.e. from the word distribution of the sampled topic
where $\alpha$ is a fixed Dirichlet hyperparameter and $\beta$ is a fixed $K \times V$ matrix whose rows are per-topic word distributions.
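For concreteness, the generative process above can be sketched as follows (a minimal illustration; the toy dimensions `K`, `V`, `N` and the use of NumPy are my assumptions, not part of the model definition):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: topics, vocabulary size, words in one document
K, V, N = 3, 8, 20

alpha = np.full(K, 0.1)                        # Dirichlet hyperparameter
beta = rng.dirichlet(np.full(V, 0.5), size=K)  # per-topic word distributions (K x V)

# Generative process for one document m
theta = rng.dirichlet(alpha)                   # theta_m ~ Dirichlet(alpha)
z = rng.choice(K, size=N, p=theta)             # z_mn ~ Multinomial(theta_m)
w = np.array([rng.choice(V, p=beta[k]) for k in z])  # w_mn ~ Multinomial(beta_{z_mn})
```

Note that each word is drawn from the row of $\beta$ indexed by its topic assignment $z_{mn}$, which is what couples the words to the topics.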
I'd like to find a distribution that approximates the true posterior of the model by minimizing the KL-divergence $KL(q_{\gamma_m, \phi_m}(\theta_m, z_m) \,\|\, p(\theta_m, z_m \mid w_m))$ with respect to $\gamma_m, \phi_m$. I'm using the mean-field factorization $q_{\gamma, \phi}(\theta, z) = q_{\gamma}(\theta)\prod_n q_{\phi_n}(z_n)$, where $q_\gamma(\theta)$ is Dirichlet and each $q_{\phi_n}(z_n)$ is multinomial. I'm not even sure where to start with deriving a mean-field variational inference algorithm for this model.
The detailed derivation can be found on page 1019 of the paper.
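The coordinate-ascent updates that fall out of that derivation are $\phi_{ni} \propto \beta_{i, w_n} \exp(\Psi(\gamma_i))$ and $\gamma_i = \alpha_i + \sum_n \phi_{ni}$, iterated until convergence. A minimal sketch of this loop for a single document (my own illustration; function and variable names are assumptions):

```python
import numpy as np
from scipy.special import digamma

def variational_inference(w, alpha, beta, n_iters=100, tol=1e-6):
    """Mean-field coordinate ascent for one document.

    w: array of word indices; alpha: (K,) Dirichlet hyperparameter;
    beta: (K, V) per-topic word probabilities. Returns (gamma, phi).
    """
    K, N = beta.shape[0], len(w)
    phi = np.full((N, K), 1.0 / K)   # q(z_n): start uniform over topics
    gamma = alpha + N / K            # q(theta): Dirichlet parameter
    for _ in range(n_iters):
        # phi_{ni} ∝ beta_{i, w_n} * exp(digamma(gamma_i))
        phi = beta[:, w].T * np.exp(digamma(gamma))
        phi /= phi.sum(axis=1, keepdims=True)
        # gamma_i = alpha_i + sum_n phi_{ni}
        new_gamma = alpha + phi.sum(axis=0)
        if np.max(np.abs(new_gamma - gamma)) < tol:
            gamma = new_gamma
            break
        gamma = new_gamma
    return gamma, phi
```

The $\exp(\Psi(\gamma_i))$ factor is the exponentiated expectation of $\log \theta_i$ under the current Dirichlet $q_\gamma$, which is why the digamma function appears.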
It should be noted that there are alternative inference methods for topic models that do not use variational inference. For example, some people use Gibbs sampling, which can be slow for large corpora but is mathematically simpler.
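For contrast, here is a minimal Gibbs sampling sketch for the same single-document setup, with $\theta$ integrated out and $\beta$ held fixed as in the model above (my own illustration; the conditional $p(z_n = k \mid z_{-n}, w) \propto (n_k^{-n} + \alpha_k)\,\beta_{k, w_n}$ follows from the Dirichlet-multinomial conjugacy, and all names are assumptions):

```python
import numpy as np

def gibbs_sample_doc(w, alpha, beta, n_sweeps=200, seed=0):
    """Gibbs sampling of topic assignments for one document,
    with theta collapsed out and beta fixed."""
    rng = np.random.default_rng(seed)
    K, N = beta.shape[0], len(w)
    z = rng.integers(K, size=N)                      # random initial assignments
    counts = np.bincount(z, minlength=K).astype(float)
    for _ in range(n_sweeps):
        for n in range(N):
            counts[z[n]] -= 1                        # remove word n from counts
            p = (counts + alpha) * beta[:, w[n]]     # unnormalized conditional
            p /= p.sum()
            z[n] = rng.choice(K, p=p)                # resample topic for word n
            counts[z[n]] += 1
    return z
```

The per-word resampling step is simpler to derive than the variational updates, but it requires many sweeps to mix, which is the usual trade-off against variational inference.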