On the gradient of the marginal log-likelihood function


In the article, estimating the parameters of a probability distribution so that it fits the observed data (i.e., so that it models the desired distribution) is presented as the optimization problem:

$$ \theta^{\text{ML}} = \arg\max_{\theta} \sum_{i=1}^N \log p_{\theta} (x_i) $$

The gradient of the marginal log-likelihood is then calculated using simple calculus and Bayes' rule:

$$ \nabla_{\theta} \log p_{\theta} (x) = \int p_{\theta} (z \mid x) \, \nabla_{\theta} \log p_{\theta} (x, z) \, {\rm d} z$$
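For what it's worth, the identity appears to follow from the log-derivative trick, a sketch (assuming the latent variable $z$ is continuous and that differentiation under the integral sign is justified):

$$
\begin{aligned}
\nabla_{\theta} \log p_{\theta}(x)
&= \frac{\nabla_{\theta} p_{\theta}(x)}{p_{\theta}(x)}
&& \text{(chain rule)} \\
&= \frac{1}{p_{\theta}(x)} \nabla_{\theta} \int p_{\theta}(x, z) \, {\rm d} z
&& \text{(marginalization over } z\text{)} \\
&= \frac{1}{p_{\theta}(x)} \int \nabla_{\theta} p_{\theta}(x, z) \, {\rm d} z
&& \text{(swap } \nabla_{\theta} \text{ and } \textstyle\int \text{)} \\
&= \frac{1}{p_{\theta}(x)} \int p_{\theta}(x, z) \, \nabla_{\theta} \log p_{\theta}(x, z) \, {\rm d} z
&& \text{(since } \nabla_{\theta} p = p \, \nabla_{\theta} \log p\text{)} \\
&= \int p_{\theta}(z \mid x) \, \nabla_{\theta} \log p_{\theta}(x, z) \, {\rm d} z
&& \text{(since } p_{\theta}(x,z)/p_{\theta}(x) = p_{\theta}(z \mid x)\text{)}.
\end{aligned}
$$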

Where can one find a proof of, or the mathematics behind, this gradient calculation?