Derivative of Mean Square Error Function with respect to output


I'm trying to understand the gradient derivation for the back-propagation algorithm.

I'm having trouble computing the explicit derivative of the mean-squared-error loss with respect to the output value in a regression setting. I have only one output neuron.

Let,

  • $n$ be the number of training examples
  • $ y_i $ be the predicted target for training example $x_i$
  • $ t_i $ be the actual target value (from train data) for training example $x_i$
  • $ L_i $ be the loss for sample $i$

I'm using the following definition of the loss function,

$$ E = \frac{1}{n} \sum_{i=1}^{n} L_i = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2} ( y_i - t_i)^2 = \frac{1}{2n} \sum_{i=1}^{n} ( y_i - t_i)^2 $$

How do I compute $\frac{\partial E}{\partial y}$?
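To make the question concrete, here is a finite-difference sanity check I wrote (all names are my own, and I'm assuming the per-component answer is $\frac{\partial E}{\partial y_i} = (y_i - t_i)/n$ for the averaged loss, which is exactly what I'd like to understand how to derive):

```python
import numpy as np

# Averaged loss: E(y) = 1/(2n) * sum_i (y_i - t_i)^2.
# If dE/dy_i = (y_i - t_i)/n, a central finite difference should agree.
def loss(y, t):
    n = len(y)
    return np.sum((y - t) ** 2) / (2 * n)

rng = np.random.default_rng(0)
n = 5
y = rng.normal(size=n)  # pretend network outputs
t = rng.normal(size=n)  # pretend targets

eps = 1e-6
numeric = np.array([
    (loss(y + eps * e, t) - loss(y - eps * e, t)) / (2 * eps)
    for e in np.eye(n)  # perturb one output component at a time
])
analytic = (y - t) / n
print(np.max(np.abs(numeric - analytic)))  # tiny, so the formula checks out numerically
```

So numerically the derivative behaves as expected, but I don't see how to justify it symbolically.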

This is in a neural-network setting, so $E$ is a function of the weights $w$. In Bishop's book, equation (5.11) is, as far as I can see, the same expression except that it is not divided by $n$:

$$ E(w) = \frac{1}{2} \sum_{i=1}^n (y(x_i, w) - t_i)^2 $$

So here $y$ is a function that depends on $x_i$ and $w$. Writing

$$ \frac{\partial E}{\partial y} $$

then means differentiating with respect to a function?

And yet Bishop does exactly this in equation (5.19):

$$ \frac{\partial E}{\partial y_k} = y_k - t_k $$

where $y_k$ is the output of the $k$-th output neuron and $t_k$ the corresponding target value. But where have the training instances gone? They've disappeared from the equation! $y_k$ is the prediction for a single input $x$!
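I also checked Bishop's unaveraged form numerically (again, the variable names are mine, just for this check):

```python
import numpy as np

# Bishop's (unaveraged) loss: E = 1/2 * sum_k (y_k - t_k)^2.
# Equation (5.19) claims dE/dy_k = y_k - t_k.
def loss_bishop(y, t):
    return 0.5 * np.sum((y - t) ** 2)

y = np.array([0.2, -1.3, 0.7])  # outputs for one input x
t = np.array([0.0, -1.0, 1.0])  # targets for that input

eps = 1e-6
numeric = np.array([
    (loss_bishop(y + eps * e, t) - loss_bishop(y - eps * e, t)) / (2 * eps)
    for e in np.eye(len(y))
])
print(numeric)  # ≈ y - t, matching equation (5.19)
```

So equation (5.19) holds numerically, which only deepens my confusion about what object $y_k$ is.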

I don't understand the nature of $y$ here, or why it is legal to differentiate $E$ with respect to it.

Thanks for any help.