I am studying neural networks.
The book at http://neuralnetworksanddeeplearning.com/chap2.html says:
$\delta^L_j=\frac{\partial C}{\partial z_j^L}=\sum_k \frac{\partial C}{\partial a_k^L} \frac{\partial a_k^L}{\partial z_j^L}$
Similarly,
$\delta^l_j=\frac{\partial C}{\partial z_j^l}=\sum_k \frac{\partial C}{\partial z_k^{l+1}} \frac{\partial z_k^{l+1}}{\partial z_j^l}$
$l$ is a layer, $\delta^l$ is an error vector, $a^l$ is an activation, and $z^l$ is a weighted input (from the previous layer).
I do not understand why a partial derivative is equal to a sum of chains of other partial derivatives.
What I think is:
$\sum_k \frac{\partial C}{\partial a_k^L} \frac{\partial a_k^L}{\partial z_j^L} = \frac{\partial C}{\partial a_1^L} \frac{\partial a_1^L}{\partial z_j^L} + \frac{\partial C}{\partial a_2^L} \frac{\partial a_2^L}{\partial z_j^L} + ... + \frac{\partial C}{\partial a_k^L} \frac{\partial a_k^L}{\partial z_j^L} =k\frac{\partial C}{\partial z_j^L} $
How can it be:
$\delta^L_j=\sum_k \frac{\partial C}{\partial a_k^L} \frac{\partial a_k^L}{\partial z_j^L}=\frac{\partial C}{\partial z_j^L}$?
Thank you!
This is a consequence of the chain rule for functions of several variables. Your cost function $C$ depends on several output activations $a_1,a_2,\ldots,a_m$: $$C=C(a_1,\ldots,a_m)$$ and each activation $a_k$ in turn depends on the weighted inputs $z_1,\ldots,z_n$: $$ a_{k}=a_{k}(z_1,\ldots,z_n) $$ where $n$ is the number of inputs of the layer and $m$ is the number of outputs of the same layer. So the cost function depends on $z_1,\ldots,z_n$ through its composition with the activations $a_1,\ldots,a_m$:
$$C=C(z_1,\ldots,z_n) =C(a_{1}(z_1,\ldots,z_n),\ldots,a_{m}(z_1,\ldots,z_n))$$ By the chain rule, this implies $$\delta_j:=\frac{\partial C}{\partial z_j}=\sum_{k=1}^{m}\frac{\partial C}{\partial a_k}\frac{\partial a_k}{\partial z_j}$$ The individual terms of this sum are not each equal to $\frac{\partial C}{\partial z_j}$, because partial derivatives do not cancel like fractions: the $k$-th term only measures the part of the change in $C$ that passes through $a_k$, and summing over all $k$ gives the total change. This holds for each layer $L$, so you can write: $$\delta_{j}^{L}:=\frac{\partial C}{\partial z_{j}^{L}}=\sum_{k=1}^{m}\frac{\partial C}{\partial a_{k}^{L}}\frac{\partial a_{k}^{L}}{\partial z_{j}^{L}}$$
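If it helps to see the sum at work, here is a minimal numerical sketch. The softmax activation, the quadratic cost, and the sample values of `z` and `y` are my own assumptions for illustration and are not taken from the book; softmax is chosen only because every $a_k$ then depends on every $z_j$, so none of the terms in the sum vanish. The sketch compares the chain-rule sum with a direct finite-difference estimate of $\frac{\partial C}{\partial z_j}$.

```python
import math

def softmax(z):
    # Assumed activation: each output a_k depends on every input z_j.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cost(a, y):
    # Assumed quadratic cost C = 1/2 * sum_k (a_k - y_k)^2.
    return 0.5 * sum((ak - yk) ** 2 for ak, yk in zip(a, y))

def delta_chain_rule(z, y):
    # delta_j = sum_k (dC/da_k) * (da_k/dz_j)
    a = softmax(z)
    n = len(z)
    dC_da = [a[k] - y[k] for k in range(n)]
    delta = []
    for j in range(n):
        total = 0.0
        for k in range(n):
            # Softmax derivative: da_k/dz_j = a_k * ([k == j] - a_j)
            da_k_dz_j = a[k] * ((1.0 if k == j else 0.0) - a[j])
            total += dC_da[k] * da_k_dz_j
        delta.append(total)
    return delta

def delta_finite_difference(z, y, h=1e-6):
    # Central-difference estimate of dC/dz_j, with no chain rule involved.
    delta = []
    for j in range(len(z)):
        z_plus = list(z)
        z_plus[j] += h
        z_minus = list(z)
        z_minus[j] -= h
        diff = cost(softmax(z_plus), y) - cost(softmax(z_minus), y)
        delta.append(diff / (2 * h))
    return delta

# Hypothetical sample values, chosen only for the demonstration.
z = [0.5, -1.2, 2.0]
y = [0.0, 0.0, 1.0]

analytic = delta_chain_rule(z, y)
numeric = delta_finite_difference(z, y)
for j, (d_sum, d_fd) in enumerate(zip(analytic, numeric)):
    print(f"j={j}: chain-rule sum = {d_sum:.8f}, finite difference = {d_fd:.8f}")
```

The two columns should agree to several decimal places, while any single term of the sum on its own generally does not match the finite-difference value. In the book's setting the activation is applied element-wise, so $\frac{\partial a_k^L}{\partial z_j^L}=0$ whenever $k\neq j$ and the sum collapses to the single term with $k=j$.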