Use of logarithmic identities in Dirichlet log-likelihood derivation


I don't understand the step from line 2 to line 3 in the derivation of the log-likelihood of the Dirichlet distribution, as shown in this paper (page 5).


Formulation in the paper*

Given a set of multinomial data $\mathcal{D}=\{\mathbf{p}_1,\mathbf{p}_2,\cdots,\mathbf{p}_N\}$, the author derives the log-likelihood of the Dirichlet distribution with parameters $\alpha=\{\alpha_1,\alpha_2,\cdots,\alpha_q\}$ as follows:

$$ \begin{align} F(\alpha)=\log\: p(\mathcal{D}|\alpha) &= \log\prod_{i=1}^{N}p(\mathbf{p}_i|\alpha)\\ &=\log\prod_{i=1}^{N}\frac{\Gamma(\sum_{k=1}^{q}\alpha_k)}{\prod_{k=1}^{q}\Gamma(\alpha_k)} \prod_{k=1}^{q}p_{ik}^{\alpha_k -1}\\ &=N\left(\log \Gamma (\sum_{k=1}^{q}\alpha_k) - \sum_{k=1}^{q} \log \Gamma(\alpha_k) +\sum_{k=1}^{q}(\alpha_k - 1) \log \hat{p}_k \right) \end{align} $$ with $\log\hat{p}_k=\frac{1}{N}\sum_{i=1}^{N}\log p_{ik}$.

*: I added a definition for the $\alpha$ vector and added sub- and superscripts for the sums/products


I understand that the fraction and the exponentiation in line 2 are converted using the logarithmic identities for fractions and powers respectively.
What I don't understand, however, is how the $\log\prod_{i=1}^{N}$ at the beginning of line 2 turns into a plain factor $N\cdot(\ldots)$ in line 3. I would have expected the product over $i=1,\dots,N$ to be converted into a sum over $i$.
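To check that the two forms at least agree numerically, here is a minimal sketch using only the Python standard library. The function names, the random test data, and the chosen $\alpha$ values are my own assumptions for illustration and do not come from the paper. It compares the log-likelihood written as a sum over the $N$ observations (what I expected) against the $N\cdot(\ldots)$ form from line 3.

```python
import math
import random


def loglik_sum_form(data, alpha):
    """Log-likelihood written as an explicit sum over the N observations."""
    # Normalizing term: log Gamma(sum_k alpha_k) - sum_k log Gamma(alpha_k)
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    total = 0.0
    for p in data:
        total += log_norm + sum((a - 1.0) * math.log(pk) for a, pk in zip(alpha, p))
    return total


def loglik_n_form(data, alpha):
    """Log-likelihood in the N * (...) form, using log p_hat_k = mean_i log p_ik."""
    n = len(data)
    q = len(alpha)
    log_p_hat = [sum(math.log(p[k]) for p in data) / n for k in range(q)]
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return n * (log_norm + sum((a - 1.0) * lp for a, lp in zip(alpha, log_p_hat)))


def random_simplex_point(q, rng):
    """Hypothetical helper: a random probability vector with strictly positive entries."""
    w = [rng.random() + 1e-3 for _ in range(q)]
    s = sum(w)
    return [x / s for x in w]


rng = random.Random(0)
alpha = [0.7, 2.0, 3.5]  # arbitrary illustrative parameters
data = [random_simplex_point(len(alpha), rng) for _ in range(50)]

print(loglik_sum_form(data, alpha), loglik_n_form(data, alpha))
```

Both printed values should match up to floating-point rounding, so the two expressions appear to be the same quantity written differently.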

Can someone explain to me what exactly was done to arrive at the equation in line 3?
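
For reference, one way the missing intermediate steps could be filled in is sketched below. This is a reconstruction from the definitions above, not something taken from the paper. The product over $i$ does become a sum over $i$; the first two terms do not depend on $i$, so summing them $N$ times gives a factor $N$, and in the last term the sum over $i$ is absorbed into the definition of $\log\hat{p}_k$.

$$ \begin{align} \log\prod_{i=1}^{N}\frac{\Gamma(\sum_{k=1}^{q}\alpha_k)}{\prod_{k=1}^{q}\Gamma(\alpha_k)} \prod_{k=1}^{q}p_{ik}^{\alpha_k -1} &= \sum_{i=1}^{N}\left(\log \Gamma (\sum_{k=1}^{q}\alpha_k) - \sum_{k=1}^{q} \log \Gamma(\alpha_k) +\sum_{k=1}^{q}(\alpha_k - 1) \log p_{ik} \right)\\ &= N\log \Gamma (\sum_{k=1}^{q}\alpha_k) - N\sum_{k=1}^{q} \log \Gamma(\alpha_k) +\sum_{k=1}^{q}(\alpha_k - 1) \sum_{i=1}^{N}\log p_{ik}\\ &= N\left(\log \Gamma (\sum_{k=1}^{q}\alpha_k) - \sum_{k=1}^{q} \log \Gamma(\alpha_k) +\sum_{k=1}^{q}(\alpha_k - 1) \frac{1}{N}\sum_{i=1}^{N}\log p_{ik} \right) \end{align} $$

Substituting $\log\hat{p}_k=\frac{1}{N}\sum_{i=1}^{N}\log p_{ik}$ into the last line gives exactly line 3 of the derivation.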