Cost function of logistic regression: $0 \cdot \log(0)$


Based on Andrew Ng's Coursera machine learning course, logistic regression has the following cost function (probably among others): $$ \operatorname{cost}(h(x),y) = \begin{cases} -\log(h(x)), & \text{if $y$ = 1} \\ -\log(1-h(x)), & \text{if $y$ = 0} \end{cases} $$ where $y$ is either $0$ or $1$ and $h(x)$ is a sigmoid function whose output lies in $[0, 1]$.

According to the class, this can be simplified to the following form: $$\operatorname{cost}(h(x),y)=-y \cdot \log(h(x))-(1-y) \cdot \log(1-h(x))$$
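For reference, here is a minimal Octave sketch of the combined formula (the names `sigmoid`, `logistic_cost` and `theta` are illustrative choices, not taken from the course code):

```octave
% Sigmoid hypothesis: maps any real input to a value strictly between 0 and 1.
function g = sigmoid(z)
  g = 1 ./ (1 + exp(-z));
end

% Cost for a single training example (x, y) with y in {0, 1}:
%   cost(h(x), y) = -y*log(h(x)) - (1 - y)*log(1 - h(x))
function J = logistic_cost(theta, x, y)
  h = sigmoid(theta' * x);                  % hypothesis h(x)
  J = -y * log(h) - (1 - y) * log(1 - h);
end
```

Substituting $y = 1$ makes the second term vanish, leaving $-\log(h(x))$, and $y = 0$ makes the first term vanish, leaving $-\log(1-h(x))$, so the single expression reproduces the piecewise definition.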

When both $y$ and $h(x)$ are $1$, the second part of the formula becomes $(1-y) \cdot \log(1-h(x)) = 0 \cdot \log(0)$, i.e. the indeterminate form $0 \cdot (-\infty)$.

My question is: provided what I wrote above is correct, how can this still work? I see numerous working Matlab/Octave implementations that don't treat this edge case any differently.

1 Answer


The only way $h(x)$ could equal $0$ is if $x = -\infty$. But $-\infty$ is not a value, it is a limit: whatever value of $x$ you take, there is always a smaller one, so the input to the sigmoid is always finite. Consequently $h(x)$ never equals exactly $0$: as $x$ tends to $-\infty$, $h(x)$ gets ever closer to $0$ but never reaches it. By the same argument, $h(x)$ only approaches $1$ as $x \to +\infty$ without reaching it, so $\log(h(x))$ and $\log(1-h(x))$ are never evaluated at exactly $0$.
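As a rough numerical illustration (an Octave sketch, assuming the usual sigmoid $1/(1+e^{-z})$):

```octave
z = [-30 -10 -1 0 1 10 30];
h = 1 ./ (1 + exp(-z));   % sigmoid values for a few finite inputs
disp(h)
% roughly: 9.4e-14  4.5e-05  0.27  0.5  0.73  0.99995  1 - 9.4e-14
```

The outputs get ever closer to $0$ (and to $1$) as $|z|$ grows, but for any finite input they remain strictly inside $(0, 1)$.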