I have some confusion regarding computing the partial derivative for this function.
Let $\sigma(z) = \frac{1}{1 + e^{-z}}$.
Let $z = \textbf{w}^\mathsf{T}\textbf{x} + b$ (this is a linear function where $\textbf{w}$ represents a vector of weights).
Let $y = \sigma(z)$.
I'm a bit rusty on my calculus at the moment, and I am trying to compute:
$$\frac{\partial y}{\partial z} = \frac{e^{-z}}{(1 + e^{-z})^2} \leftarrow \text{From solutions}$$
My understanding is that it is computed as follows:
$$\frac{\partial y}{\partial z} = \sigma'(z) \cdot \frac{\partial}{\partial z}(\textbf{w}^\mathsf{T}\textbf{x} + b)$$
What is $\frac{\partial}{\partial z}(\textbf{w}^\mathsf{T}\textbf{x} + b)$? From what I see in the solutions, it should be 1, but shouldn't it be 0? If you hold every other variable constant (and no $z$ appears in the expression), then the derivative with respect to $z$ should be 0.
Moreover, the solutions rewrite this partial derivative as $y(1-y)$. How does that follow? I am aware of the identity $\sigma'(z) = \sigma(z)(1-\sigma(z))$, but I do not understand how it is derived.
Thanks!
(note that $b$ is not a vector)
Since $z=\textbf{w}^\mathsf{T}\textbf{x} + b$, you have $$ \frac{\partial}{\partial z}(\textbf{w}^\mathsf{T}\textbf{x} + b) =\frac{\partial}{\partial z}(z)=1. $$ The expression does depend on $z$, because it *is* $z$, so the derivative is 1 and not 0.

Now, the function $\sigma$ is a one-variable function, so there is no need to write its derivative as a partial derivative. You can use the one-variable chain rule, applied to $1/u$ with $u = 1+e^{-z}$, to calculate the derivative: $$ \sigma'(z)=-\frac{1}{(1+e^{-z})^2}\,(1+e^{-z})'=-\frac{1}{(1+e^{-z})^2}\,(-e^{-z})=\frac{e^{-z}}{(1+e^{-z})^2}. $$ Or you can use the rule for the derivative of a quotient instead: $$ \sigma'(z)=\frac{0\times(1+e^{-z})-1\times (-e^{-z})}{(1+e^{-z})^2}=\frac{e^{-z}}{(1+e^{-z})^2}. $$
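
As for the identity $y(1-y)$: since $1-\sigma(z)=\frac{e^{-z}}{1+e^{-z}}$, you get $$ \sigma(z)\,(1-\sigma(z))=\frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}}=\frac{e^{-z}}{(1+e^{-z})^2}=\sigma'(z). $$ If it helps, here is a rough numerical sanity check of both forms against a central finite difference. It is only a sketch: it assumes NumPy is available, and the function names, the step size `h`, and the sample grid are my own choices for illustration, not anything from your solutions.

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime_closed_form(z):
    # e^{-z} / (1 + e^{-z})^2, the form given in the solutions
    return np.exp(-z) / (1.0 + np.exp(-z)) ** 2

def sigmoid_prime_identity(z):
    # y (1 - y) with y = sigma(z)
    y = sigmoid(z)
    return y * (1.0 - y)

def central_difference(f, z, h=1e-5):
    # Numerical approximation of f'(z); h is an illustrative step size
    return (f(z + h) - f(z - h)) / (2.0 * h)

# Illustrative sample points
z = np.linspace(-5.0, 5.0, 11)
closed = sigmoid_prime_closed_form(z)
ident = sigmoid_prime_identity(z)
numeric = central_difference(sigmoid, z)

# Largest disagreement between the closed form and the other two
print(np.max(np.abs(closed - ident)))
print(np.max(np.abs(closed - numeric)))
```

Both printed differences should be tiny: the first is pure floating-point rounding, while the second also includes the finite-difference approximation error.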