I'm struggling to understand what the loss function of a neural network is. For a binary classification problem, is it mean squared error, as described in this video: https://www.youtube.com/watch?v=5u0jaA3qAGk&t=59s, or is it cross entropy, as defined here: http://work.caltech.edu/slides/slides09.pdf, and why?
Moreover, in the multi-class case, I think there is something like softmax, but I don't really know how it works. Could someone explain it to me properly?
Thanks!
Softmax is an activation function, not a loss. As for the "why": for classification, cross entropy is usually preferred over mean squared error because it corresponds to maximum-likelihood estimation for a Bernoulli (or categorical) output, and combined with sigmoid/softmax units it avoids the vanishing gradients that MSE produces when the output saturates. Softmax is defined as (source):
$$\varphi(\mathbf{x})_j = \frac{e^{\mathbf{x}_j}}{\sum_{k=1}^K e^{\mathbf{x}_k}}$$
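For concreteness, here is a minimal NumPy sketch of that formula (the function name and example scores are mine, not from the slides); subtracting the max before exponentiating is a standard trick to avoid overflow and does not change the result:

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability; the output is unchanged
    # because the shift cancels in the ratio.
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # raw class scores (logits)
probs = softmax(scores)
print(probs)        # one probability per class, largest for the largest score
print(probs.sum())  # the probabilities sum to 1
```

Note that softmax preserves the ordering of the scores: the class with the highest logit gets the highest probability.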
The cross entropy is also defined for multiple classes (and multiple labels):
$$E_x = - \sum_{k}\left[t_k^x \log(o_k^x) + (1-t_k^x) \log (1- o_k^x)\right]$$
where $k$ indexes the classes, $o$ is the output of the classifier, and $t$ is the true label. The superscript $x$ indicates that the quantity belongs to the training example $x$.
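As a sketch of that cross-entropy formula in NumPy (the function name and example values are mine; the clipping is a standard safeguard, not part of the definition):

```python
import numpy as np

def cross_entropy(t, o, eps=1e-12):
    """Cross entropy for one example x.

    t: true labels per class (0 or 1), i.e. t_k^x
    o: predicted probabilities per class, i.e. o_k^x
    """
    # Clip predictions away from 0 and 1 so log() never sees 0.
    o = np.clip(o, eps, 1.0 - eps)
    return -np.sum(t * np.log(o) + (1.0 - t) * np.log(1.0 - o))

t = np.array([1.0, 0.0, 0.0])   # true class is the first one
o = np.array([0.8, 0.1, 0.1])   # classifier's predicted probabilities
print(cross_entropy(t, o))      # small positive loss; grows as o moves away from t
```

The loss is 0 only when the predictions match the labels exactly, and it increases sharply as the classifier assigns probability near 0 to the true class.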