How are logistic loss and cross-entropy related?


I found that Kullback-Leibler loss, log-loss and cross-entropy are described as the same loss function. Is the logistic loss used in logistic regression equivalent to the cross-entropy function? If yes, can anybody explain how they are related?

Thanks


There are 2 best solutions below

6
On BEST ANSWER

The relationship between cross-entropy, logistic loss and K-L divergence is quite natural and follows directly from the definitions.

Cross-entropy is defined as \begin{equation} H(p, q) = \operatorname{E}_p[-\log q] = -\sum_x p(x)\log q(x) = H(p) + D_{\mathrm{KL}}(p \| q), \end{equation} where $p$ and $q$ are two distributions, $H(p)$ is the entropy of $p$, and the last equality uses the definition of the K-L divergence.

Now if $p$ is the two-point distribution with probabilities $(y, 1-y)$ and $q$ is the two-point distribution with probabilities $(\hat{y}, 1-\hat{y})$, we can rewrite the cross-entropy as \begin{equation} H(p, q) = -\sum_x p_x \log q_x = -y\log \hat{y}-(1-y)\log (1-\hat{y}), \end{equation} which is exactly the logistic loss. Further, log loss is also related to logistic loss and cross-entropy as follows:

The expected log loss is defined as \begin{equation} \operatorname{E}_p[-\log q]. \end{equation} Note that this is the loss function used in logistic regression, where $q$ is given by a sigmoid function. The excess risk for this loss function is \begin{equation} \operatorname{E}_p[\log p - \log q] = \operatorname{E}_p\left[\log\frac{p}{q}\right] = D_{\mathrm{KL}}(p \| q). \end{equation}

Notice that the K-L divergence is nothing but the excess risk of the log loss, and that it differs from the cross-entropy by the additive constant $H(p)$, which does not depend on $q$ (see the first definition). One important thing to remember is that in logistic regression we usually minimize the log loss averaged over the training sample rather than the cross-entropy under the true distribution; the two are not exactly the same, but this works in practice.
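
To check these identities numerically, here is a minimal Python sketch. The function names and the values `y = 0.3`, `y_hat = 0.8` are illustrative assumptions, not part of the answer itself; a soft label is used so that $H(p)$ is non-zero.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) log q(x) for discrete distributions given as lists."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

def entropy(p):
    """H(p) is the cross-entropy of p with itself."""
    return cross_entropy(p, p)

def kl_divergence(p, q):
    """D_KL(p || q) = sum_x p(x) log(p(x) / q(x))."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

def logistic_loss(y, y_hat):
    """Binary logistic loss for a label y and a predicted probability y_hat in (0, 1)."""
    return -y * math.log(y_hat) - (1 - y) * math.log(1 - y_hat)

# Illustrative values (assumed for this example only).
y, y_hat = 0.3, 0.8
p = [y, 1 - y]          # "true" two-point distribution
q = [y_hat, 1 - y_hat]  # predicted two-point distribution

h_pq = cross_entropy(p, q)

# Cross-entropy of the two-point distributions equals the logistic loss.
assert math.isclose(h_pq, logistic_loss(y, y_hat))

# Cross-entropy decomposes as entropy plus K-L divergence.
assert math.isclose(h_pq, entropy(p) + kl_divergence(p, q))
```

Since `entropy(p)` does not involve `q`, minimizing the cross-entropy over `q` and minimizing the K-L divergence over `q` pick out the same `q`.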

0
On

Yes, they are related.
The cross-entropy used in logistic regression is derived from the maximum likelihood principle (or, equivalently, from minimising the negative log-likelihood). See section 28.2.1, Kullback-Leibler divergence:

Suppose ν and µ are the distributions of two probability models, and ν << µ. Then the cross-entropy is the expected negative log-likelihood of the model corresponding to ν, when the actual distribution is µ
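
As a small illustration of the maximum-likelihood view, the sketch below computes the Bernoulli likelihood of a toy data set under a sigmoid model and checks that its negative log equals the summed cross-entropy over the examples. The data `xs`, `ys` and the parameters `w`, `b` are made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data and model parameters (assumptions for illustration).
xs = [-2.0, -0.5, 0.3, 1.5]
ys = [0, 0, 1, 1]
w, b = 1.2, -0.1

# Predicted probabilities P(y = 1 | x) under the logistic model.
y_hats = [sigmoid(w * x + b) for x in xs]

# Bernoulli likelihood of the observed labels: prod_i yh_i^y_i * (1 - yh_i)^(1 - y_i).
likelihood = math.prod(yh ** y * (1 - yh) ** (1 - y) for y, yh in zip(ys, y_hats))
nll = -math.log(likelihood)

# Summed binary cross-entropy (logistic loss) over the same examples.
total_ce = sum(
    -y * math.log(yh) - (1 - y) * math.log(1 - yh) for y, yh in zip(ys, y_hats)
)

# Negative log-likelihood and total cross-entropy coincide.
assert math.isclose(nll, total_ce)
```

Maximizing the likelihood over `w` and `b` is therefore the same problem as minimizing the total cross-entropy.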