There is a great explanation of the calculation of backpropagation gradient in the CS231n class. Please find the question here.
twolffpiggott's answer improved my general understanding. However, I've gotten stuck on one of the derivations. My first question is about this formula:
$$ \frac{\partial\mathcal{L}_i}{\partial \boldsymbol{w_j}} = \sum_{k=1}^{K} \frac{\partial\mathcal{L}_i}{\partial f_k} \times \frac{\partial f_k}{\partial \boldsymbol{w_j}} .$$
How do you get from this to the second line, which is: $$ \frac{\partial\mathcal{L}_i}{\partial f_j} \times \frac{\partial f_j}{\partial \boldsymbol{w_j}}?$$ In other words, how did the $\sum$ disappear in the second line?
The second question is about $k$. May I kindly ask what $k$ is in those lines? Is it different from $j$?
Thanks in advance for your time.
As written in the answer you mention, the sum disappears because, among all the $f_k$, only $f_j$ depends on $\boldsymbol{w_j}$. Therefore, for every $k \neq j$, $\dfrac{\partial f_k}{\partial \boldsymbol{w_j}}=0$, and the only surviving term in the sum is the one with $k = j$.
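You can check this numerically. Below is a minimal NumPy sketch (the weight matrix `W`, input `x`, and the perturbation are arbitrary example values, not from the original post): the scores are $f_k = \boldsymbol{w_k} \cdot \boldsymbol{x}$, so nudging only the row $\boldsymbol{w_j}$ changes only the score $f_j$.

```python
import numpy as np

# Arbitrary example values: 4 classes (rows of W), 3 input features.
W = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.],
              [0., 1., 0.]])
x = np.array([1., -1., 2.])

f = W @ x            # scores: f_k = w_k . x

j = 1                # perturb only the row w_j
eps = 1e-6
W_pert = W.copy()
W_pert[j] += eps     # small nudge to every entry of w_j
f_pert = W_pert @ x

# Only f_j moves; all other scores are untouched,
# i.e. df_k/dw_j = 0 for k != j.
changed = np.abs(f_pert - f) > 1e-9
print(changed)       # [False  True False False]
```

This is exactly why the chain-rule sum over $k$ collapses to the single $k = j$ term.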
$k$ and $j$ are just index names. $k$ is a "dummy" (summation) index: it only labels the terms being summed, so you could rename it freely, even swapping $k$ and $j$ throughout, and the equations would say the same thing. By convention we tend to use $k$ for the summation index and $j$ for the particular fixed index, but that is only a choice of notation.