I'm working through the math used in artificial neural networks, and I've gotten stuck on calculating the error-function derivatives for hidden layers during backpropagation.
On page 244 of Bishop's "Pattern Recognition and Machine Learning", formula 5.55 gives the derivative of the error function with respect to a hidden unit's pre-activation as a sum over all units $k$ to which unit $j$ sends connections:
$$ \frac{\partial E_n}{\partial a_j} = \sum_k \frac{\partial E_n}{\partial a_k} \frac{\partial a_k}{\partial a_j}$$
I know the chain rule: if $a_j$ fed into only one other node, I could apply it directly to separate the factors. But what is the intuition behind summing these terms over all nodes when the output of unit $j$ feeds into multiple nodes?
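To make the sum concrete, here is a small numeric check I put together (the two-output toy network, weights, and targets below are my own invention, not from the book): one hidden unit $j$ feeds two output units, and the backprop formula's sum matches a finite-difference derivative of $E$ with respect to $a_j$.

```python
import math

# Toy setup (my own, not from Bishop): one hidden unit j whose
# pre-activation a_j feeds TWO output units k = 1, 2.
#   z_j = tanh(a_j)       hidden activation
#   a_k = w_k * z_j       output pre-activations (linear outputs, y_k = a_k)
#   E   = 0.5 * sum_k (y_k - t_k)^2
w = [0.7, -1.3]   # weights from hidden unit j to the two output units
t = [0.5, 0.2]    # targets
a_j = 0.4         # pre-activation of the hidden unit

def error(a_j):
    z_j = math.tanh(a_j)
    return 0.5 * sum((w_k * z_j - t_k) ** 2 for w_k, t_k in zip(w, t))

# Formula (5.55): dE/da_j = sum_k (dE/da_k) * (da_k/da_j)
z_j = math.tanh(a_j)
dE_da_k = [w_k * z_j - t_k for w_k, t_k in zip(w, t)]   # dE/da_k (linear outputs)
da_k_da_j = [w_k * (1 - z_j ** 2) for w_k in w]         # da_k/da_j = w_k * tanh'(a_j)
backprop = sum(d * g for d, g in zip(dE_da_k, da_k_da_j))

# Central finite difference for comparison
eps = 1e-6
numeric = (error(a_j + eps) - error(a_j - eps)) / (2 * eps)

print(backprop, numeric)  # these agree to numerical precision
```

The reason both terms of the sum are needed: perturbing $a_j$ changes $z_j$, which changes *both* $a_1$ and $a_2$, and each of those changes contributes to the change in $E$. Dropping either term makes the result disagree with the numerical derivative.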
Thanks