Backpropagation: why partial derivative, not full derivative?

402 Views Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail) at 2026-03-28 12:55:52

After studying backpropagation for neural networks, I have a question: why can't we use full (total) derivatives for backpropagation? I understand why partial derivatives work in backpropagation, but I wonder why we cannot, or should not, use full derivatives instead.

There is 1 answer below.
It is because you ultimately want to find in which direction you should change each of the network's parameters (weights and biases), one at a time, in order to decrease the loss. So you look at a single parameter, tweak it slightly from its current value while holding all the others fixed, and check whether the loss increases or decreases: this is equivalent to computing
$$\frac{\partial \mathcal{L}}{\partial b^{(l)}_j} \quad\text{or}\quad \frac{\partial \mathcal{L}}{\partial w^{(l)}_{ij}},$$
where $(l)$ is the layer the bias or weight refers to.
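(For context, and not part of the original answer: these partials are exactly what the standard gradient-descent update consumes. With a learning rate $\eta$, each parameter is moved against its own partial derivative,
$$ w^{(l)}_{ij} \leftarrow w^{(l)}_{ij} - \eta\,\frac{\partial \mathcal{L}}{\partial w^{(l)}_{ij}}, \qquad b^{(l)}_{j} \leftarrow b^{(l)}_{j} - \eta\,\frac{\partial \mathcal{L}}{\partial b^{(l)}_{j}}, $$
which is why the per-parameter partial derivative, rather than any single "full" derivative, is the quantity you need.)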
To compute these quantities efficiently you use backpropagation, which is just the chain rule applied layer by layer to the network, and the chain rule here is naturally expressed in terms of partial derivatives. A "full" (total) derivative of $\mathcal{L}$ with respect to all parameters at once is nothing other than the gradient, i.e. the vector collecting all of these partial derivatives, so for a single scalar weight or bias the partial derivative is exactly the right object.
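To make this concrete, here is a minimal sketch (my own illustration, not part of the original answer) for a single sigmoid neuron with squared loss: the analytic partial derivatives come from the chain rule, and a finite-difference check reproduces the "tweak one parameter while holding the others fixed" picture described above.

```python
# Minimal sketch: one-neuron "network" y_hat = sigmoid(w*x + b), loss L = (y_hat - y)^2.
# Backpropagation (chain rule) gives dL/dw and dL/db; a central finite difference
# on each parameter separately reproduces the same partial derivatives.
import math

def loss(w, b, x, y):
    y_hat = 1.0 / (1.0 + math.exp(-(w * x + b)))   # sigmoid activation
    return (y_hat - y) ** 2

def backprop_grads(w, b, x, y):
    z = w * x + b
    y_hat = 1.0 / (1.0 + math.exp(-z))
    dL_dyhat = 2.0 * (y_hat - y)        # dL/dy_hat
    dyhat_dz = y_hat * (1.0 - y_hat)    # sigmoid'(z)
    dL_dz = dL_dyhat * dyhat_dz         # chain rule through the activation
    return dL_dz * x, dL_dz * 1.0       # dL/dw, dL/db

w, b, x, y = 0.5, -0.2, 1.5, 1.0
dw, db = backprop_grads(w, b, x, y)

# "Tweak one parameter, hold the others fixed": central finite differences.
eps = 1e-6
dw_num = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
db_num = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
print(dw, dw_num)   # should agree to several decimal places
print(db, db_num)
```

The two estimates agree to several decimal places; backpropagation simply organizes this chain-rule computation efficiently for every weight and bias of a deep network at once.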