Derivation of the Backpropagation Algorithm


I'm studying the backpropagation algorithm and I want to derive it myself. To that end, I've constructed a very simple network with one input layer, one hidden layer, and one output layer. You can find the details in the graphic.


Here $t$ is the true output, $i$ is the input, and $z_{hi}$ and $w_{ij}$ are the weights. Moreover, I have the activation function $\phi(x)$.

I think that I have understood the algorithm more or less, so I've started with the following error function:

$$ E = \frac{1}{2}\sum_{j=1}^2{(t_j-o_j)^2} $$
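As a sanity check, this error function is easy to evaluate numerically. Here is a minimal sketch for the two-output case, with made-up target and output values:

```python
# Hypothetical values for a two-output network:
# t = true outputs (targets), o = network outputs.
t = [1.0, 0.0]
o = [0.8, 0.3]

# E = 1/2 * sum_j (t_j - o_j)^2
E = 0.5 * sum((tj - oj) ** 2 for tj, oj in zip(t, o))
print(E)  # 0.5 * (0.04 + 0.09) = 0.065
```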

That function should be correct, shouldn't it? The next step is to calculate the partial derivative: $$ \frac{\partial E }{\partial w_{ik}} = \frac{\partial E }{\partial o_k}\cdot \frac{\partial o_k }{\partial w_{ik}} = -(t_k-o_k) \cdot \frac{\partial o_k }{\partial w_{ik}} = -(t_k-o_k) \cdot \phi'\!\left(\sum_{j=1}^2 w_{jk}\, a_j \right) \cdot a_i $$ (Note the last factor is $a_i$, the activation feeding into weight $w_{ik}$, and the summation index inside $\phi'$ must differ from the free index $i$.)

To me that seems correct, but is it right up to here (especially the formal math aspects)? I'm asking because I have some problems with the next steps (differentiating with respect to the $z$'s)...

Thanks for helping!