Derivative of Binary Cross Entropy for Neural Network Classifier


I am currently following an introductory course in machine learning, and I would like to use binary cross entropy as my loss function. I have a simple neural network (2 inputs, 1 hidden layer with 3 nodes, and a single output).

Based on my notes, the output error is $\delta = (\text{derivative of the activation function}) \times (\text{derivative of the error function})$. With a sigmoid output, the derivative of the activation is $\sigma(u)(1-\sigma(u))$, so the output error becomes $\sigma(u)(1-\sigma(u))$ times the derivative of the binary cross entropy.
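Written out (assuming the usual definition of the binary cross entropy for a single example $n$), the loss and the derivative with respect to the output that my notes use are:

$$E = -\big(t^n \ln o(x^n) + (1 - t^n)\ln(1 - o(x^n))\big), \qquad \frac{\partial E}{\partial o(x^n)} = \frac{-t^n + o(x^n)}{o(x^n)\,(1 - o(x^n))}.$$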

The result is $$\sigma(o(x^n))\,\big(1-\sigma(o(x^n))\big)\cdot\frac{-t^n + o(x^n)}{o(x^n)\,(1-o(x^n))},$$ where $t^n$ is the target output and $o(x^n)$ is the output the neural network produced at this epoch.
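As a quick numerical sanity check (plain NumPy, values picked arbitrarily), I compared the chain-rule product $\sigma'(u)\cdot\partial E/\partial o$, with $\sigma'(u) = \sigma(u)(1-\sigma(u))$ evaluated at the pre-activation $u$, against a finite-difference derivative of the loss with respect to $u$:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def bce(o, t):
    # binary cross entropy for a single example
    return -(t * np.log(o) + (1 - t) * np.log(1 - o))

u = 0.7        # pre-activation of the output node (arbitrary value)
t = 1.0        # target label (arbitrary value)
o = sigmoid(u)

# analytic delta: sigma'(u) * dE/do, with sigma'(u) = o * (1 - o)
analytic = o * (1 - o) * (o - t) / (o * (1 - o))

# central finite-difference derivative of E with respect to u
eps = 1e-6
numeric = (bce(sigmoid(u + eps), t) - bce(sigmoid(u - eps), t)) / (2 * eps)

print(analytic, numeric)  # the two values should agree
```

The two values agree closely, so at least the chain-rule product itself seems to be computed correctly; my question is whether the symbolic expression above can be reduced further.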

This, however, does not seem right. Can someone explain whether it can be simplified further, and how?

Is the binary cross entropy used anywhere other than at the output node during backpropagation?