I have a loss function which I got from here. It's called the $P4$ metric:
$$ P4 = \dfrac{4*TP*TN}{4*TP*TN + (TP+TN)*(FP+FN)} $$
where:
$TP =$ True Positive
$TN =$ True Negative
$FP =$ False Positive
$FN =$ False Negative
From what I have been told, this metric is not differentiable.
- How does one come to that conclusion?
- If so, is there a way to create a differentiable approximation of this function that could be used as a loss function to train a neural network?
Would anyone be able to help me with this matter, please?
Thanks & Best Regards AMJS
The function you gave is continuous (and differentiable) as a function of the true/false positive/negative counts, except where the denominator vanishes (e.g. when $TP = TN = 0$).
However, in order to categorize your neural network's output as a true or false positive or negative, you are probably discretizing your network's output to 0 or 1, using something like argmax or integer rounding. This discretization is, naturally, not continuous or differentiable.
If you consider the loss of the network on a fixed training set as a function of the network parameters, then the resulting function is almost certainly not everywhere continuous or differentiable. This is probably what you've been told. And even where it is continuous and differentiable, the gradient will be zero, making it useless for training.
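You can see the zero-gradient problem concretely with a finite-difference check. Below is a minimal numpy sketch (the 0.5 threshold and the toy data are my own choices, not part of your question): nudging a raw score slightly does not change the rounded prediction, so the counts, and therefore $P4$, do not move at all.

```python
import numpy as np

def p4_hard(y_true, scores, threshold=0.5):
    """P4 computed from hard (rounded) predictions."""
    # Discretize scores to 0/1 before counting -- this is the
    # non-differentiable step.
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 4 * tp * tn + (tp + tn) * (fp + fn)
    return 4 * tp * tn / denom if denom else 0.0

y = np.array([1, 1, 0, 0])
s = np.array([0.8, 0.6, 0.3, 0.2])

# Finite-difference "gradient" w.r.t. the first score: a small
# perturbation never flips the rounded prediction, so the difference
# quotient is exactly zero almost everywhere.
eps = 1e-4
bump = np.array([eps, 0.0, 0.0, 0.0])
g = (p4_hard(y, s + bump) - p4_hard(y, s)) / eps
print(g)  # 0.0
```

The metric is piecewise constant in the scores: the only places it changes are where a score crosses the threshold, and there it jumps discontinuously.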
I believe the typical approach is to apply a suitable loss function (e.g. cross-entropy) to the network's continuous output directly, before discretizing that output into classification labels.
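That said, if you specifically want to optimize $P4$, a common trick is to replace the hard counts with "soft" expected counts computed from the predicted probabilities: $TP \approx \sum y_i p_i$, $TN \approx \sum (1-y_i)(1-p_i)$, and so on. This is smooth in $p$, so the same formula works in any autodiff framework. A minimal numpy sketch (the `1e-8` epsilon and the toy data are my own assumptions):

```python
import numpy as np

def soft_p4_loss(y_true, p):
    """Differentiable surrogate for 1 - P4, using soft counts.

    y_true: array of 0/1 labels; p: predicted probabilities in [0, 1].
    """
    # Expected counts under the predicted probabilities (all smooth in p)
    tp = np.sum(y_true * p)
    tn = np.sum((1 - y_true) * (1 - p))
    fp = np.sum((1 - y_true) * p)
    fn = np.sum(y_true * (1 - p))
    # Same P4 formula, plus a small epsilon to guard the TP = TN = 0 case
    p4 = 4 * tp * tn / (4 * tp * tn + (tp + tn) * (fp + fn) + 1e-8)
    return 1.0 - p4  # minimizing the loss maximizes soft P4

y = np.array([1.0, 1.0, 0.0, 0.0])
good = soft_p4_loss(y, np.array([0.99, 0.99, 0.01, 0.01]))  # near 0
bad = soft_p4_loss(y, np.array([0.01, 0.01, 0.99, 0.99]))   # near 1
print(good, bad)
```

In practice you would write the same four soft counts with your framework's tensor ops (e.g. `torch.sum`) so backpropagation flows through them; as the probabilities saturate toward 0/1, the soft loss approaches $1 - P4$ computed on the hard predictions.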