Latent Dirichlet Allocation: Newton-Raphson method


I am trying to understand one step of the Newton-Raphson method used in the LDA paper by Blei, Ng and Jordan. Namely, how does taking the derivative of $L$ with respect to $\alpha_j$ result in the equation below? Also, what exactly is $\delta(i,j)$?

The terms which contain $\alpha$ are:

$$ L_{[\alpha]} = \sum_{d=1}^{M} \left( \log\Gamma\Big(\sum_{j=1}^k \alpha_j \Big) - \sum_{i=1}^{k} \log \Gamma(\alpha_i) + \sum_{i=1}^{k}(\alpha_i-1)\Big( \Psi(\gamma_{d_i}) - \Psi\Big(\sum_{j=1}^{k} \gamma_{d_j}\Big)\Big) \right)$$

Taking the derivative with respect to $\alpha_i$ gives:

$$\frac{\partial L}{\partial \alpha_i} = M \Big( \Psi\Big( \sum_{j=1}^{k} \alpha_j\Big) - \Psi (\alpha_i)\Big) + \sum_{d=1}^{M}\Big( \Psi(\gamma_{d_i}) - \Psi\Big(\sum_{j=1}^{k} \gamma_{d_j}\Big)\Big)$$

This derivative depends on $\alpha_j$ for $j \neq i$, so we must use an iterative method to find the maximizing $\alpha$. In particular, the Hessian has the form given in Eq. (10):

$$ \frac{\partial^2 L}{\partial \alpha_i\, \partial \alpha_j} = \delta(i,j)\,M \Psi'(\alpha_i) - \Psi'\Big(\sum_{l=1}^{k} \alpha_l\Big)$$
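As a sanity check, the gradient formula can be verified numerically against a central finite difference of $L_{[\alpha]}$. The sizes `M`, `k` and the values of `gamma` below are arbitrary stand-ins, not taken from the paper:

```python
import numpy as np
from scipy.special import gammaln, psi  # psi is the digamma function

rng = np.random.default_rng(0)
M, k = 5, 3                             # hypothetical: documents, topics
alpha = rng.uniform(0.5, 2.0, k)
gamma = rng.uniform(1.0, 5.0, (M, k))   # hypothetical variational parameters

def L_alpha(alpha):
    """The alpha-dependent part of the bound, summed over documents."""
    per_doc = (gammaln(alpha.sum()) - gammaln(alpha).sum()
               + ((alpha - 1) * (psi(gamma) - psi(gamma.sum(axis=1, keepdims=True)))).sum(axis=1))
    return per_doc.sum()

# Analytic gradient from the derivation above
grad = (M * (psi(alpha.sum()) - psi(alpha))
        + (psi(gamma) - psi(gamma.sum(axis=1, keepdims=True))).sum(axis=0))

# Central finite-difference approximation of the same gradient
eps = 1e-6
fd = np.array([(L_alpha(alpha + eps * np.eye(k)[i])
                - L_alpha(alpha - eps * np.eye(k)[i])) / (2 * eps)
               for i in range(k)])
print(np.allclose(grad, fd, atol=1e-4))
```

The two agree to finite-difference precision, which confirms the derivative formula.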

Best answer:

It seems there is a typo in the paper. The Hessian should actually be of the form:

$$ \frac{\partial^2 L}{\partial \alpha_i\, \partial \alpha_j} = M\Big(\Psi'\Big(\sum_{l=1}^{k} \alpha_l\Big)-\delta(i,j)\, \Psi'(\alpha_i)\Big)$$

Note the difference.

  • Here, $M$ multiplies both terms rather than just the first.
  • The signs of the two terms are reversed.
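This corrected Hessian is a diagonal matrix plus a constant matrix, $H = \mathrm{diag}(h) + z\,\mathbf{1}\mathbf{1}^\top$ with $h_i = -M\Psi'(\alpha_i)$ and $z = M\Psi'(\sum_l \alpha_l)$, so the Newton step $H^{-1}g$ has an $O(k)$ closed form (the linear-time Newton iteration described in the paper's appendix). A minimal sketch with made-up sufficient statistics; the step-halving safeguard that keeps $\alpha$ positive is my addition, not part of the paper:

```python
import numpy as np
from scipy.special import psi, polygamma  # digamma and trigamma

# Hypothetical sufficient statistics from the variational E-step:
# ss[i] = sum_d ( Psi(gamma_{di}) - Psi(sum_j gamma_{dj}) )
rng = np.random.default_rng(1)
M, k = 5, 3
gamma = rng.uniform(1.0, 5.0, (M, k))
ss = (psi(gamma) - psi(gamma.sum(axis=1, keepdims=True))).sum(axis=0)

alpha = np.ones(k)
for _ in range(100):
    g = M * (psi(alpha.sum()) - psi(alpha)) + ss   # gradient dL/dalpha_i
    h = -M * polygamma(1, alpha)                   # diagonal part: -M Psi'(alpha_i)
    z = M * polygamma(1, alpha.sum())              # constant part: M Psi'(sum_l alpha_l)
    # H = diag(h) + z 11^T, so H^{-1} g follows from the Sherman-Morrison formula:
    c = (g / h).sum() / (1.0 / z + (1.0 / h).sum())
    step = (g - c) / h                             # Newton direction H^{-1} g
    while np.any(alpha - step <= 0):               # crude safeguard: keep alpha > 0
        step *= 0.5
    alpha -= step
    if np.abs(step).max() < 1e-12:
        break

print(alpha)  # gradient at this alpha is ~0
```

At convergence the gradient vanishes, so the reversed signs in the corrected Hessian matter: with the paper's version as printed, the "Newton" step would move in the wrong direction.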