I'm trying to understand the math of back propagation in softmax.
$f (\mathbf {z} )_{j}={\frac {e^{z_{j}}}{\sum _{k=1}^{K}e^{z_{k}}}} $
I can't understand why $\frac{\partial f(z)_i}{\partial z_j}=0$ when $i \neq j$.
Since all the $z_k$ appear in the denominator of $f(z)_i$, shouldn't changing $z_j$ affect the value of $f(z)_i$?
Thanks.
You are right, it's not 0. If
\begin{equation} f_j = \frac{e^{z_j}}{\sum_k e^{z_k}} \end{equation}
then
\begin{equation} \frac{\partial f_j}{\partial z_i} = -f_i \cdot f_j,\quad i \neq j \end{equation}
\begin{equation} \frac{\partial f_j}{\partial z_j} = f_j \cdot (1 - f_j),\quad i = j \end{equation}
Both cases follow from the quotient rule applied to $f_j$, and can be written compactly with the Kronecker delta as
\begin{equation} \frac{\partial f_j}{\partial z_i} = f_j \, (\delta_{ij} - f_i). \end{equation}
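A quick way to convince yourself of these formulas is to compare the analytic Jacobian against a finite-difference approximation. A minimal NumPy sketch (the function names and the test vector `z` are my own choices):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; this does not change the result.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def softmax_jacobian(z):
    # Analytic Jacobian: J[i, j] = f_i * (delta_ij - f_j),
    # i.e. diag(f) - outer(f, f).
    f = softmax(z)
    return np.diag(f) - np.outer(f, f)

# Numerical check via central differences (z values are arbitrary).
z = np.array([0.3, -1.2, 2.0, 0.5])
eps = 1e-6
num = np.zeros((len(z), len(z)))
for j in range(len(z)):
    zp, zm = z.copy(), z.copy()
    zp[j] += eps
    zm[j] -= eps
    num[:, j] = (softmax(zp) - softmax(zm)) / (2 * eps)

print(np.allclose(softmax_jacobian(z), num, atol=1e-8))
```

Note the off-diagonal entries of the Jacobian are negative (the $-f_i f_j$ terms), so increasing one logit does indeed decrease every other output, which is exactly why the derivative is not $0$ for $i \neq j$.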