EM Algorithm: How do you resolve parameters getting "caught" inside expectations of unobserved components?


I am implementing an EM algorithm for a new model, and I am running into trouble: I want to maximize the expected log likelihood with respect to some parameters, but those parameters get "caught" inside the expectation of the random effects (unobserved variables). Let me elaborate:

In the following example, I would like to estimate $\alpha$. To simplify the expected log-likelihood, I have gathered the terms not involving $\alpha$ into the "constant". The expected log-likelihood contribution and score for a given observation $i$ are below. Here, $d_i$ is observed, $w_i$ is unobserved, and $\lambda$ is another parameter of interest.

$$ E \ell_i(\alpha) = \alpha E(\ln w_i) - d_i \cdot \lambda \cdot E(w_i^\alpha) + \text{constant} \\ S_i(\alpha)= E(\ln w_i) - d_i \cdot \lambda \cdot E(\ln(w_i) \cdot w_i^{\alpha}) \\ 0 = \sum_i E(\ln w_i) - \sum_i d_i \cdot \lambda \cdot E(\ln(w_i) \cdot w_i^{\alpha}) $$
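To make these quantities concrete, here is a small Monte Carlo sketch for a single observation. The gamma draws are just a stand-in for whatever the posterior of $w_i$ actually is in my model, and the values of $d_i$ and $\lambda$ are arbitrary; the point is only that the score above is the $\alpha$-derivative of the expected log-likelihood, which I verify with a finite difference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in posterior draws of the unobserved w_i; a gamma is only a
# placeholder, the actual posterior depends on the model.
w = rng.gamma(shape=2.0, scale=1.0, size=100_000)

d_i, lam = 1.0, 0.5  # arbitrary illustrative values for d_i and lambda

def expected_loglik(alpha):
    # E ell_i(alpha) = alpha * E(ln w_i) - d_i * lambda * E(w_i^alpha) + const
    return alpha * np.mean(np.log(w)) - d_i * lam * np.mean(w ** alpha)

def score(alpha):
    # S_i(alpha) = E(ln w_i) - d_i * lambda * E(ln(w_i) * w_i^alpha)
    return np.mean(np.log(w)) - d_i * lam * np.mean(np.log(w) * w ** alpha)

# Sanity check: the score equals the derivative of the expected
# log-likelihood in alpha (central finite difference).
eps = 1e-5
fd = (expected_loglik(1.0 + eps) - expected_loglik(1.0 - eps)) / (2 * eps)
print(fd, score(1.0))
```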

The problem is that the only remaining $\alpha$ in this equation sits inside an expectation over the random effects: I cannot isolate it without first evaluating $E(\ln(w_i) \cdot w_i^{\alpha})$.

Typically, in the EM algorithm, we think about calculating these expectations with respect to the parameter estimates from the previous iteration of the algorithm. Unfortunately, that gives me nothing to maximize over in THIS iteration.
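To show what I mean numerically, here is a sketch of the M-step as I understand it: the posterior draws of $w_i$ are fixed at the previous iteration's parameters, but the current $\alpha$ still appears inside the sample averages, so I end up solving the score equation in $\alpha$ by numerical root finding (Monte Carlo EM style). Again, the gamma/Poisson draws and the value of $\lambda$ are placeholders, not my actual model.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)

# Placeholder posterior draws w_i^(s) ~ p(w_i | d_i; theta_old) for n
# observations; in the real model these come from the E-step at the
# previous iterate, not from a gamma.
n, S = 50, 5_000
w = rng.gamma(shape=2.0, scale=1.0, size=(n, S))
d = rng.poisson(lam=3.0, size=n) + 1.0   # illustrative observed d_i > 0
lam = 0.1                                 # illustrative value of lambda

def total_score(alpha):
    # sum_i [ E(ln w_i) - d_i * lambda * E(ln(w_i) * w_i^alpha) ],
    # with each expectation replaced by a sample average over the draws.
    e_lnw = np.mean(np.log(w), axis=1)
    e_lnw_wa = np.mean(np.log(w) * w ** alpha, axis=1)
    return np.sum(e_lnw - d * lam * e_lnw_wa)

# Solving total_score(alpha) = 0 in alpha, with the expectations held at
# the previous iterate's draws.
alpha_new = brentq(total_score, 0.1, 5.0)
print(alpha_new)
```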

Just for fun, I've tried to bring $\alpha$ outside the expectation with the following approximation. This version of the algorithm does not converge.

$$\sum_i E(\ln w_i) = \sum_i d_i \cdot \lambda \cdot E(\ln w_i) \cdot E(w_i)^{\alpha}$$
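A quick numerical check (again with gamma draws standing in for the posterior of $w_i$) suggests why this approximation fails badly: it drops the dependence between $\ln w_i$ and $w_i^\alpha$, and $E(w_i)^\alpha$ can be far from $E(w_i^\alpha)$ by Jensen's inequality.

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in posterior draws of w_i (placeholder distribution).
w = rng.gamma(shape=2.0, scale=1.0, size=200_000)

alpha = 2.0
exact = np.mean(np.log(w) * w ** alpha)            # E(ln(w) * w^alpha)
approx = np.mean(np.log(w)) * np.mean(w) ** alpha  # E(ln w) * E(w)^alpha
print(exact, approx)
```

With these draws the two sides differ by several-fold, so the bias is large enough to derail the iteration, not a small perturbation.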

Has anyone run into a similar problem before? How have you handled it? Is there some kind of trick I am missing? Maybe some kind of Taylor expansion?