Near the phrase "which obviously is easily computable" in http://neuralnetworksanddeeplearning.com/chap2.html, the book defines the quadratic cost $C = \frac{1}{2}\sum_{j}(y_j - a_{j}^{L})^2$.
Then why is: $\frac{\partial C}{\partial a_{j}^{L}} = (a_{j}^{L} - y_j)$?
I thought it would be: $\frac{\partial C}{\partial a_{j}^{L}} = (y_j - a_{j}^{L} )$
Doesn't the book's answer flip the sign? Is the swap between $a_{j}^{L}$ and $y_j$ a typo, or is it intentional?
The book is correct, and the swap is intentional. When you differentiate $C = \frac{1}{2}\sum_{j}(y_j - a_{j}^{L})^2$ with respect to $a_{j}^{L}$, the chain rule requires multiplying by the derivative of the inner term $(y_j - a_{j}^{L})$ with respect to $a_{j}^{L}$, which is $-1$:
$$\frac{\partial C}{\partial a_{j}^{L}} = (y_j - a_{j}^{L}) \cdot (-1) = a_{j}^{L} - y_j.$$
So the derivative is $-(y_j - a_{j}^{L}) = a_{j}^{L} - y_j$, exactly as the book states.
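If you want to convince yourself numerically, you can compare the book's formula against a central finite-difference approximation of the cost. The activation and target values below are arbitrary illustrative numbers, not anything from the book:

```python
# Finite-difference check that dC/da_j = a_j - y_j (not y_j - a_j)
# for the quadratic cost C = 1/2 * sum_j (y_j - a_j)^2.

def cost(a, y):
    return 0.5 * sum((yj - aj) ** 2 for aj, yj in zip(a, y))

a = [0.2, 0.7, 0.9]   # output activations a_j^L (made-up example values)
y = [0.0, 1.0, 0.5]   # targets y_j (made-up example values)
eps = 1e-6

analytic = [aj - yj for aj, yj in zip(a, y)]   # the book's formula a - y

numeric = []
for j in range(len(a)):
    a_hi = a[:]; a_hi[j] += eps
    a_lo = a[:]; a_lo[j] -= eps
    numeric.append((cost(a_hi, y) - cost(a_lo, y)) / (2 * eps))

print(analytic)   # approx [0.2, -0.3, 0.4]
print(numeric)    # matches analytic up to floating-point noise
```

The numerical gradient agrees with $a_j^L - y_j$, not $y_j - a_j^L$; with the sign-flipped formula every entry would come out negated.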