I am attempting to calculate the partial derivative of the sigmoid function with respect to theta:
$ y = \frac{1}{1+ e^{-\theta x}}$
Let:
$v = -\theta x $
$u = (1 + e^{-\theta x}) = (1 + e^v)$
Then:
$ \frac{\partial y}{\partial u} = -u^{-2}$
$ \frac{\partial u}{\partial v} = e^v $
$ \frac{\partial v}{\partial \theta_i} = -x_i $
So, applying the chain rule:
$ \frac{\partial y}{\partial \theta_i} $
$= \frac{\partial y}{\partial u} \frac{\partial u}{\partial v} \frac{\partial v}{\partial \theta_i}$
$= -u^{-2} e^v (-x_i)$
$= -(1 + e^v)^{-2} e^v (-x_i)$
$= -(1+e^{-\theta x})^{-2} e^{-\theta x} (-x_i)$
$= \frac{x_i e^{-\theta x}}{(1+e^{-\theta x})^2}$
At this point, I'm trying to figure out how to get it into this form:
$ \frac{\partial y}{\partial \theta_i} = y(1 - y)x_i$
How do I accomplish this?
Also, it is alleged that:
$1 - \frac{1}{1+ e^{-\theta x}} = \frac{e^{-\theta x}}{1 + e^{-\theta x}}$
How is this possible?
Let
$$ f(\theta)=\frac{1}{g(\theta)}=\frac{1}{1+e^{h(\theta)}}=\frac{1}{1+e^{-\theta x}}. $$
First, note that by the chain rule
\begin{align*} f^{\prime}(\theta) & =-g^{\prime}(\theta)/(g(\theta))^{2},\\ g^{\prime}(\theta) & =e^{h(\theta)}h^{\prime}(\theta),\\ \text{and }h^{\prime}(\theta) & =-x. \end{align*}
Putting this all together,
$$ f^{\prime}(\theta)=-\frac{g^{\prime}(\theta)}{(g(\theta))^{2}}=-\frac{e^{h(\theta)}h^{\prime}(\theta)}{(g(\theta))^{2}}=\frac{e^{-\theta x}x}{(1+e^{-\theta x})^{2}}. $$
Moreover, note that
$$ 1-f(\theta)=1-\frac{1}{1+e^{-\theta x}}=\frac{1+e^{-\theta x}}{1+e^{-\theta x}}-\frac{1}{1+e^{-\theta x}}=\frac{1+e^{-\theta x}-1}{1+e^{-\theta x}}=\frac{e^{-\theta x}}{1+e^{-\theta x}}. $$
Therefore,
$$ f^{\prime}(\theta)=\frac{1}{1+e^{-\theta x}}\frac{e^{-\theta x}}{1+e^{-\theta x}}x=f(\theta)\left(1-f(\theta)\right)x. $$
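If you want to convince yourself of the result numerically, here is a minimal Python sketch that compares $f(\theta)(1-f(\theta))x$ against a central finite difference. It assumes scalar $\theta$ and $x$; the function names `sigmoid`, `sigmoid_grad`, and `central_difference`, the step size `h`, and the test values are my own choices for illustration, not anything from the question.

```python
import math

def sigmoid(theta, x):
    # f(theta) = 1 / (1 + e^{-theta x}), scalar theta and x assumed
    return 1.0 / (1.0 + math.exp(-theta * x))

def sigmoid_grad(theta, x):
    # Closed form derived above: f'(theta) = f(theta) (1 - f(theta)) x
    f = sigmoid(theta, x)
    return f * (1.0 - f) * x

def central_difference(theta, x, h=1e-6):
    # Numerical approximation of df/dtheta; h is an arbitrary small step
    return (sigmoid(theta + h, x) - sigmoid(theta - h, x)) / (2.0 * h)

# Arbitrary illustrative values
theta, x = 0.7, -1.3
print(sigmoid_grad(theta, x), central_difference(theta, x))
```

The two printed numbers should agree to many decimal places. For the vector case in the question, the same check can be repeated one component $\theta_i$ at a time, with $x_i$ in place of $x$ in the closed form.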