If $C = \frac{1}{2}\sum_{j}(y_j - a_{j}^{L}) ^2$ then why is: $\frac{\partial C}{\partial a_{j}^{L}} = (a_{j}^{L} - y_j)$?

20 Views Asked by Bumbble Comm At 17 May 2026 - 11:45

My question is about the passage in the book at http://neuralnetworksanddeeplearning.com/chap2.html that contains the phrase "which obviously is easily computable."

The cost is defined as $C = \frac{1}{2}\sum_{j}(y_j - a_{j}^{L}) ^2$

Then why is: $\frac{\partial C}{\partial a_{j}^{L}} = (a_{j}^{L} - y_j)$?

I thought it would be: $\frac{\partial C}{\partial a_{j}^{L}} = (y_j - a_{j}^{L} )$

The answer in the book would flip the sign, wouldn't it?

Is the flip between $a_{j}^{L}$ and $y_j$ a typo or intentional?


Bumbble Comm On 05 Feb 2019 - 4:11 BEST ANSWER

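The book is correct. Only the $j$-th term of the sum depends on $a_{j}^{L}$, so you are differentiating $\frac{1}{2}(y_j - a_{j}^{L})^2$. The power rule gives $(y_j - a_{j}^{L})$, and by the chain rule you then multiply by the derivative of $y_j - a_{j}^{L}$ with respect to $a_{j}^{L}$, which is $-1$. So the derivative is $-(y_j - a_{j}^{L}) = a_{j}^{L} - y_j$.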
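If you want to convince yourself of the sign without redoing the calculus, a finite-difference check is one option. Below is a minimal sketch in plain Python. The function names and the values of `y` and `a` are made up for illustration and do not come from the book. It compares $a_{j}^{L} - y_j$ against a central-difference estimate of $\frac{\partial C}{\partial a_{j}^{L}}$.

```python
def cost(y, a):
    """Quadratic cost C = 1/2 * sum_j (y_j - a_j)^2."""
    return 0.5 * sum((yj - aj) ** 2 for yj, aj in zip(y, a))


def analytic_grad(y, a):
    """Gradient claimed by the book: dC/da_j = a_j - y_j."""
    return [aj - yj for yj, aj in zip(y, a)]


def numeric_grad(y, a, eps=1e-6):
    """Central-difference estimate of dC/da_j, one component at a time."""
    grad = []
    for j in range(len(a)):
        up = list(a)
        down = list(a)
        up[j] += eps
        down[j] -= eps
        grad.append((cost(y, up) - cost(y, down)) / (2 * eps))
    return grad


# Hypothetical targets and output activations, chosen only for the check.
y = [1.0, 0.0, 0.5]
a = [0.8, 0.3, 0.9]

print("analytic:", analytic_grad(y, a))
print("numeric: ", numeric_grad(y, a))
```

The two lists should agree up to rounding error. The first component is negative because $a_1 < y_1$ in this example: raising the activation toward the target lowers the cost. That matches $a_{j}^{L} - y_j$, not $y_j - a_{j}^{L}$.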
