Gaussian mixture model with penalty


The article states the following:

Given the current estimate $(\hat{\pi}_{1}^{(0)},\ldots,\hat{\pi}_{M}^{(0)})$ for $\pi$, to solve \begin{equation*} \frac{\partial}{\partial \pi_m} \left[ \sum_{i=1}^n \sum_{m=1}^M h_{im} \log \pi_m - n\lambda D_{f} \sum_{m=1}^M \log(\epsilon + p_{\lambda}({\pi}_m)) - \beta\left(\sum_{m=1}^M \pi_m - 1\right) \right] = 0, \end{equation*}

where $p_{\lambda}(\cdot)$ is the SCAD penalty,

we replace $\log(\epsilon + p_{\lambda}(\hat{\pi}_m))$ by its linear approximation around $\hat{\pi}^{(0)}_m$: $$\log(\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)) + \left(\frac{p'_{\lambda}(\hat{\pi}_m^{(0)})}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)}\right)\left(\hat{\pi}_{m}-\hat{\pi}^{(0)}_{m} \right)$$

Using $\sum_{i=1}^{n}\sum_{m=1}^M h_{im} = n$, we first update the value of $\beta$ by \begin{equation*} \beta = n - n\lambda D_{f} \sum_{m=1}^M \frac{p'_{\lambda}(\hat{\pi}^{(0)}_m) \hat{\pi}^{(0)}_{m}}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)}. \end{equation*}

Now, if I follow what is above, then I have:

\begin{equation*} \frac{\partial}{\partial \pi_m} \left[ \sum_{i=1}^n \sum_{m=1}^M h_{im} \log \pi_m - n\lambda D_{f} \sum_{m=1}^M \left[\log(\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)) + \left(\frac{p'_{\lambda}(\hat{\pi}_m^{(0)})}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)}\right)\left(\hat{\pi}_{m}-\hat{\pi}^{(0)}_{m} \right) \right] - \beta\left(\sum_{m=1}^M \pi_m - 1\right) \right] = 0, \end{equation*}

which gives

$$ \frac{\partial \ell_P}{\partial \pi_j} = \sum_{i=1}^{n} h_{ij} \frac{1}{\pi_j} - n \lambda D_{f} \left(\frac{p'_{\lambda}(\hat{\pi}_j^{(0)})}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_j)}\right)-\beta. $$

However, after summing over $j=1, \ldots,M$ and using the constraint $\sum_{m=1}^{M} \pi_{m}=1$, I do not see where the $\hat{\pi}^{(0)}_{m}$ in $\beta$ comes from.

On BEST ANSWER

If $\log \pi_m$ is likewise replaced by its linear approximation around $\hat{\pi}^{(0)}_m$, then from

$$ \frac{\partial}{\partial \pi_m} \left[ \sum_{i=1}^n \sum_{m=1}^M h_{im} \left[\log (\hat{\pi}^{(0)}_m)+\frac{1}{\hat{\pi}^{(0)}_m}\left(\pi_m-\hat{\pi}^{(0)}_m\right)\right] - n\lambda D_{f} \sum_{m=1}^M \left[\log(\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)) + \left(\frac{p'_{\lambda}(\hat{\pi}_m^{(0)})}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)}\right)\left(\hat{\pi}_{m}-\hat{\pi}^{(0)}_{m} \right) \right] - \beta\left(\sum_{m=1}^M \pi_m - 1\right) \right] = 0, $$

we have

$$ \sum_{i=1}^n h_{im}\frac{1}{\hat{\pi}^{(0)}_m}-n\lambda D_f\left(\frac{p'_{\lambda}(\hat{\pi}_m^{(0)})}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)}\right)-\beta=0 $$

Multiplying by $\hat{\pi}^{(0)}_m$ and summing over $m$ gives

$$ \sum_{m=1}^M\sum_{i=1}^n h_{im}-n\lambda D_f\sum_{m=1}^M\left(\frac{p'_{\lambda}(\hat{\pi}_m^{(0)})}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)}\right)\hat{\pi}^{(0)}_m-\beta\sum_{m=1}^M\hat{\pi}^{(0)}_m = 0 $$

or, since $\sum_{m=1}^M\sum_{i=1}^n h_{im} = n$ and $\sum_{m=1}^M \hat{\pi}^{(0)}_m = 1$,

$$ n - n\lambda D_f\sum_{m=1}^M\frac{p'_{\lambda}(\hat{\pi}_m^{(0)})\hat{\pi}^{(0)}_m}{\epsilon + p_{\lambda}(\hat{\pi}^{(0)}_m)}-\beta=0 $$
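The identity above can be checked numerically. The following is only a sketch under stated assumptions: it uses the standard Fan and Li form of the SCAD penalty with $a = 3.7$ (the article may parameterize it differently), and the names `scad`, `scad_deriv` and `beta_update`, as well as the example values of $n$, $\lambda$, $D_f$ and $\epsilon$, are hypothetical and not taken from the article.

```python
import numpy as np

def scad(theta, lam, a=3.7):
    """SCAD penalty p_lambda(theta) in the usual Fan-Li form (assumed here)."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                            # |theta| <= lam
    quad = -(t**2 - 2 * a * lam * t + lam**2) / (2 * (a - 1))   # lam < |theta| <= a*lam
    const = (a + 1) * lam**2 / 2                                # |theta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))

def scad_deriv(theta, lam, a=3.7):
    """Derivative p'_lambda(theta) for theta >= 0."""
    t = np.abs(np.asarray(theta, dtype=float))
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1))

def beta_update(pi0, n, lam, d_f, eps=1e-6, a=3.7):
    """beta = n - n*lam*D_f * sum_m p'(pi0_m) * pi0_m / (eps + p(pi0_m))."""
    pi0 = np.asarray(pi0, dtype=float)
    w = scad_deriv(pi0, lam, a) / (eps + scad(pi0, lam, a))
    return n - n * lam * d_f * np.sum(w * pi0)

# Illustrative values only (not from the article).
rng = np.random.default_rng(0)
n, lam, d_f, eps = 100, 0.1, 5, 1e-6
pi0 = np.array([0.5, 0.3, 0.2])
h = rng.dirichlet(np.ones(3), size=n)   # each row sums to 1, so sum(h) = n

beta = beta_update(pi0, n, lam, d_f, eps)
w = scad_deriv(pi0, lam) / (eps + scad(pi0, lam))
# Left-hand side of the stationarity equation for each m, evaluated at pi0.
residual = h.sum(axis=0) / pi0 - n * lam * d_f * w - beta
# Weighting by pi0 and summing over m should give (numerically) zero.
print(beta, np.sum(pi0 * residual))
```

The individual residuals need not vanish at $\hat{\pi}^{(0)}$; only their $\hat{\pi}^{(0)}$-weighted sum does, which is exactly the condition that determines $\beta$.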