Why is the derivative of the i-th element of softmax with respect to the j-th element 0 when i != j?

I'm trying to understand the math of backpropagation through softmax.

$f (\mathbf {z} )_{j}={\frac {e^{z_{j}}}{\sum _{k=1}^{K}e^{z_{k}}}} $

I can't understand why $\frac{\partial f(z)_i}{\partial f(z)_j}=0$ when $i \neq j$?

Since every $z_k$ appears in the denominator of $f(z)$, shouldn't changing $f(z)_j$ affect the value of $f(z)_i$?

Thanks.

There is 1 best solution below

You are right, it's not 0. If

\begin{equation} f_j = \frac{e^{z_j}}{\sum_k e^{z_k}} \end{equation}

then

\begin{equation} \frac{\partial f_j}{\partial z_i} = -f_i \cdot f_j,\quad i \neq j \end{equation}

\begin{equation} \frac{\partial f_j}{\partial z_i} = f_i \cdot (1 - f_i),\quad i = j \end{equation}
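Both cases follow from applying the quotient rule to $f_j$, and they can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `softmax`, `softmax_jacobian`, and `numerical_jacobian` and the test vector `z` are illustrative choices, not part of the question. It builds the Jacobian from the two formulas above and compares it with a central finite-difference estimate.

```python
import numpy as np

def softmax(z):
    # Subtracting the max does not change the result but avoids overflow.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def softmax_jacobian(z):
    # J[j, i] = d f_j / d z_i
    #   = f_i * (1 - f_i)  when i == j  (diagonal)
    #   = -f_i * f_j       when i != j  (off-diagonal)
    f = softmax(z)
    return np.diag(f) - np.outer(f, f)

def numerical_jacobian(z, eps=1e-6):
    # Central finite differences: perturb one z_i at a time.
    K = z.size
    J = np.zeros((K, K))
    for i in range(K):
        dz = np.zeros(K)
        dz[i] = eps
        J[:, i] = (softmax(z + dz) - softmax(z - dz)) / (2 * eps)
    return J

# Arbitrary example input (an assumption for illustration).
z = np.array([1.0, 2.0, 0.5])
J = softmax_jacobian(z)
J_num = numerical_jacobian(z)

print(np.allclose(J, J_num, atol=1e-7))  # analytic and numeric Jacobians agree
print(J)
```

The off-diagonal entries of `J` are strictly negative rather than zero, which matches the intuition in the question: raising $z_i$ increases the shared denominator and so lowers every other $f_j$. Each row also sums to zero, because the outputs always sum to 1.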