Kronecker delta - substitution issues

Are there situations where index substitution using a Kronecker delta is not allowed? I'm currently working through the derivative of the softmax function, where I arrive at the following result $$ \frac{\partial a_i}{\partial z_k} = a_i(\delta_{ik} - a_k). $$ Expanding the terms gives $$ \frac{\partial a_i}{\partial z_k} = a_i\delta_{ik} - a_ia_k. $$ I was then tempted to simplify this to $$ \frac{\partial a_i}{\partial z_k} = a_k - a_ia_k, $$ but that is obviously wrong: the unsimplified version drops the first term when $i \ne k$, whereas the simplified version keeps $a_k$ for every $i$. Can someone explain what went wrong? Am I missing some constraints on when the substitution can be performed?

There is 1 best solution below

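Perhaps you already know this, but the Einstein summation convention does not apply in this situation, and the delta can only be "simplified away" when it does. The substitution rule is really the statement $\sum_i a_i\delta_{ik} = a_k$: the delta picks out one term of a sum over $i$. In your expression there is no sum over $i$, so $a_i\delta_{ik}$ has to stay as it is (it equals $a_i$ when $i = k$ and $0$ otherwise).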
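To make the two cases concrete, here is a short worked comparison. The vector $g_i$ is only an illustrative assumption (think of some quantity you contract the Jacobian with); it does not appear in the question. For a fixed, free index $i$ nothing is summed, so $$ a_i\delta_{ik} = \begin{cases} a_i, & i = k \\ 0, & i \ne k. \end{cases} $$ Once $i$ is actually summed over, the delta can be used to replace $i$ by $k$: $$ \sum_i g_i\frac{\partial a_i}{\partial z_k} = \sum_i g_i a_i(\delta_{ik} - a_k) = g_k a_k - a_k\sum_i g_i a_i = a_k\Big(g_k - \sum_i g_i a_i\Big). $$ Note that $i$ no longer appears on the left-hand side of this second expression, which is exactly what makes the substitution legitimate there.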

The index $i$ still appears on the left-hand side, which makes it a free index, so it cannot be summed over. (See the top rule on the 2nd page of these notes, which shows the difference between a summation "dummy" index and a "free" index; the phrase "dummy index" is not used there explicitly, but it is the index over which you sum.) You may also be interested in the answer to my question on when the summation convention can and cannot be used.
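If you want to convince yourself numerically, the following is a minimal sketch using NumPy. The test vector `z`, the step size `eps`, and all variable names are my own assumptions for illustration. It compares the unsimplified formula and the tempting "simplified" one against a finite-difference Jacobian, and also shows the case where the delta really can be contracted away because $i$ is summed.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([0.5, -1.0, 2.0])   # arbitrary test input (assumption)
a = softmax(z)
n = a.size
delta = np.eye(n)                # delta[i, k] is the Kronecker delta

# Unsimplified formula, i is a free index: J[i, k] = a_i*delta_ik - a_i*a_k
J_correct = a[:, None] * delta - np.outer(a, a)

# Tempting but invalid "simplification": J[i, k] = a_k - a_i*a_k
J_wrong = a[None, :] - np.outer(a, a)

# Central finite differences as an independent reference
eps = 1e-6
J_fd = np.empty((n, n))
for k in range(n):
    dz = np.zeros(n)
    dz[k] = eps
    J_fd[:, k] = (softmax(z + dz) - softmax(z - dz)) / (2 * eps)

# When i IS summed over, the delta can be contracted: sum_i a_i delta_ik = a_k
contracted = np.einsum("i,ik->k", a, delta)

print("unsimplified matches finite differences:", np.allclose(J_correct, J_fd, atol=1e-8))
print("'simplified' matches finite differences:", np.allclose(J_wrong, J_fd, atol=1e-8))
print("sum_i a_i delta_ik equals a_k:", np.allclose(contracted, a))
```

The two candidate Jacobians agree on the diagonal and differ only off the diagonal, where the invalid version keeps an extra $a_k$ that the delta should have set to zero.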