Jacobian of sigmoid

I have a $2 \times 1$ vector $\theta = [\theta_1, \theta_2]^T$ and I want to compute the derivative of $1/ (1 + \exp(-\theta^T x))$ with respect to $\theta$. I know the result should be a $2 \times 1$ vector whose first element is the partial derivative with respect to $\theta_1$ and whose second element is the partial derivative with respect to $\theta_2$, but I'm having trouble calculating it. Please help.

Best answer

Let $\sigma (\theta) = 1 / (1 + \exp(- \theta^T x))$, with $x = [x_1, x_2]^T$ held fixed. For $i \in \{1, 2\}$, the chain rule gives \begin{align*} \frac{\partial \sigma}{ \partial \theta_i}(\theta) &= \frac{\partial}{ \partial \theta_i} \frac{1}{1 + \exp(- \theta_1 x_1 - \theta_2 x_2)} \\ &= -\frac{1}{(1 + \exp(- \theta_1 x_1 - \theta_2 x_2))^2} \cdot (-x_i \exp(- \theta_1 x_1 - \theta_2 x_2) ) \\ &= \frac{1}{1 + \exp(- \theta^T x)} \cdot \frac{x_i \exp(- \theta^T x) }{1 + \exp(- \theta^T x)} \\ &= \sigma(\theta) \cdot x_i \cdot (1- \sigma(\theta)), \end{align*} where the last step uses $1 - \sigma(\theta) = \exp(- \theta^T x) / (1 + \exp(- \theta^T x))$.

So $$ \nabla_\theta \sigma(\theta) = \sigma(\theta)(1 - \sigma(\theta)) \, x, $$ where $\sigma(\theta)(1 - \sigma(\theta))$ is a scalar that multiplies the vector $x$.
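As a sanity check, the closed-form gradient $\sigma(\theta)(1 - \sigma(\theta))\,x$ can be compared against central finite differences. The following is only a minimal sketch in plain Python: the function names (`sigmoid`, `sigmoid_grad`, `numeric_grad`), the step size `h`, and the sample values of `theta` and `x` are illustrative assumptions, not part of the question.

```python
import math

def sigmoid(theta, x):
    """sigma(theta) = 1 / (1 + exp(-theta^T x)) for equal-length sequences."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(theta, x):
    """Closed-form gradient from the derivation: sigma * (1 - sigma) * x."""
    s = sigmoid(theta, x)
    return [s * (1.0 - s) * xi for xi in x]

def numeric_grad(theta, x, h=1e-6):
    """Central finite differences, perturbing one coordinate of theta at a time."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += h
        down = list(theta)
        down[i] -= h
        grad.append((sigmoid(up, x) - sigmoid(down, x)) / (2 * h))
    return grad

# Illustrative values (assumed, not from the question).
theta = [0.5, -1.0]
x = [2.0, 3.0]

analytic = sigmoid_grad(theta, x)
numeric = numeric_grad(theta, x)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed vectors should agree to several decimal places. At $\theta = 0$ the scalar factor is $\sigma(1 - \sigma) = 1/4$, so the gradient there is simply $x/4$.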