I'm following Andrew Ng's Coursera course on machine learning. In week 5 he introduces backpropagation and the $\delta$ term which is meant to represent the "error" a neuron has.
He gives us this formula:
$$\delta^{(3)} = (\Theta^{(3)})^T \delta^{(4)} \,.\!*\, g'(z^{(3)})$$
where $.*$ represents elementwise multiplication (since the two arguments are both vectors).
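For anyone translating the notation to code: in NumPy, for example, `*` on equal-shape arrays is exactly this elementwise product (a small illustrative sketch with made-up numbers, not values from the course):

```python
import numpy as np

# hypothetical small vectors, just to illustrate the .* (elementwise) product
u = np.array([0.2, 0.5, 0.9])
v = np.array([1.0, 2.0, 3.0])

print(u * v)  # elementwise product: [0.2, 1.0, 2.7]
```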
He then gives a formula for the $g'(z^{(3)})$ term, which is $a^{(3)}.* (1-a^{(3)})$. He says that it's possible to prove this mathematically, but I'm unable to.
Note that $g(x)$ is the sigmoid function, $1/(1+e^{-x})$, and as such $g'(x) = \frac {e^{-x}}{(1+e^{-x})^2}$. Also note that functions in this course, when applied to vectors or matrices, are assumed to be applied elementwise.
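As a numerical sanity check (not a proof), the derivative form above and the claimed form $g(x)(1-g(x))$ do seem to agree elementwise, e.g. in this small NumPy sketch:

```python
import numpy as np

def g(x):
    # sigmoid, applied elementwise
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 101)
lhs = np.exp(-x) / (1.0 + np.exp(-x)) ** 2  # g'(x) as derived above
rhs = g(x) * (1.0 - g(x))                   # the claimed closed form

print(np.allclose(lhs, rhs))  # True
```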
Also note that $z^{(3)} = \Theta^{(2)} a^{(2)}$, where $\Theta^{(2)}$ is the matrix of weights mapping layer 2 to layer 3, and $a^{(2)}$ is the vector of activations of the units in layer 2 (so $a^{(3)} = g(z^{(3)})$).
I hope this terminology is common and not specific to this course; otherwise answering this question may be difficult, since you wouldn't know what the letters mean. I can't succinctly explain the terms any further...
Nonetheless, by expanding $g'(z^{(3)})$ slightly I get: $$\frac {e^{ -\Theta^{(2)} a^{(2)} }} {1 + 2e^{ -\Theta^{(2)} a^{(2)} } + e^{ -2\Theta^{(2)} a^{(2)} } } $$
Is it feasible to get to $a^{(3)}.* (1-a^{(3)})$ from here?
