Logistic regression maximum likelihood derivation


The following equations are given:

$\sum_{j=1}^c\hat{P}_j = 1$

$\sigma_i(\mathbf{z}; \theta) = \frac{\exp(\theta_i^T\mathbf{z})}{\sum_{j=1}^c \exp(\theta_j^T\mathbf{z})}$

$L = \sum_{j=1}^c \hat{P}_j \log(\sigma_j(\mathbf{z};\theta))$

How can I prove the following?

$\nabla_{\theta_i}L = (\hat{P}_i - \sigma_i(\mathbf{z};\theta))\,\mathbf{z}$

This is equation (19) of the following paper: http://icml.cc/2012/papers/389.pdf

Thanks in advance.
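Before proving the identity, it can be sanity-checked numerically by comparing it against a finite-difference gradient of $L$. The sketch below (all variable names are illustrative, not from the paper) draws a random $\theta$, $\mathbf{z}$, and a valid $\hat{P}$, and compares the two gradients:

```python
import numpy as np

rng = np.random.default_rng(0)
c, d = 4, 3                       # number of classes, feature dimension
theta = rng.normal(size=(c, d))   # rows are theta_1, ..., theta_c
z = rng.normal(size=d)
P_hat = rng.random(c)
P_hat /= P_hat.sum()              # enforce sum_j P_hat_j = 1

def softmax(theta, z):
    logits = theta @ z
    logits -= logits.max()        # shift for numerical stability
    e = np.exp(logits)
    return e / e.sum()

def L(theta):
    return np.sum(P_hat * np.log(softmax(theta, z)))

# analytic gradient from the claimed identity: row i is (P_hat_i - sigma_i) z
sigma = softmax(theta, z)
grad_analytic = np.outer(P_hat - sigma, z)

# central finite-difference gradient, entry by entry
eps = 1e-6
grad_numeric = np.zeros_like(theta)
for i in range(c):
    for k in range(d):
        tp = theta.copy(); tp[i, k] += eps
        tm = theta.copy(); tm[i, k] -= eps
        grad_numeric[i, k] = (L(tp) - L(tm)) / (2 * eps)

print(np.max(np.abs(grad_analytic - grad_numeric)))  # prints a value near zero
```

The maximum entrywise difference comes out near machine precision for the step size used, which is consistent with the claimed gradient.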

There is 1 answer below.

$\textbf{Hint:}$

$$ \log(\sigma_j(\mathbf{z};\theta)) = \theta_{j}^{T}\mathbf{z} - \log\left(C(\mathbf{z},\theta)\right) $$ where $$ C(\mathbf{z},\theta) = \sum_{j=1}^{c}\mathrm{e}^{\theta_{j}^{T}\mathbf{z}} $$

and $$ \nabla_{\theta_i}\log\left(C(\mathbf{z},\theta)\right) = \dfrac{\nabla_{\theta_i} C(\mathbf{z},\theta)}{C(\mathbf{z},\theta)} $$

Can you take it from here?
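For readers who want to check their work, the remaining steps go as follows. Since $\nabla_{\theta_i} C(\mathbf{z},\theta) = \exp(\theta_i^T\mathbf{z})\,\mathbf{z}$, the hint gives

$$ \nabla_{\theta_i}\log\left(C(\mathbf{z},\theta)\right) = \frac{\exp(\theta_i^T\mathbf{z})}{C(\mathbf{z},\theta)}\,\mathbf{z} = \sigma_i(\mathbf{z};\theta)\,\mathbf{z}. $$

Then, using $\nabla_{\theta_i}\theta_j^T\mathbf{z} = \mathbf{z}$ if $j = i$ and $0$ otherwise, together with $\sum_{j=1}^c \hat{P}_j = 1$,

$$ \nabla_{\theta_i} L = \sum_{j=1}^c \hat{P}_j \left( \nabla_{\theta_i}\theta_j^T\mathbf{z} - \nabla_{\theta_i}\log C(\mathbf{z},\theta) \right) = \hat{P}_i\,\mathbf{z} - \left(\sum_{j=1}^c \hat{P}_j\right)\sigma_i(\mathbf{z};\theta)\,\mathbf{z} = (\hat{P}_i - \sigma_i(\mathbf{z};\theta))\,\mathbf{z}. $$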