Why is minimizing the Heaviside step function a combinatorial problem?

825 Views Asked by At

I was going through this lecture on ML Youtube @ 59:28 and the Heaviside step function as a loss function was introduced and two things were mentioned:

  • The function is not convex, so stay away if possible.

  • The minimizing operation by taking the derivative and equal it to $0$ becomes a combinatorial problem.

My questions are:

  1. I have the notion of convex function but a good explanation of convexity with respect to optimization shall be helpful.

  2. Why is the problem of minimization a combinatorial problem?

1

There are 1 best solutions below

2
On BEST ANSWER

Function is $f$ is called convex if for any $x_1,x_2$ and $0<t<1$: $$tf(x_1)+(1-t)f(x_2)\ge f(tx_1+(1-t)x_2)$$ See wikipedia.

For Heaviside function we can select $x_1=-1$, $x_2=2$ and $t=1/2$, then $$ \frac12\theta(-1)+\frac12\theta(2)=\frac12 <\theta\left(-\frac12+\frac22\right)=\theta\left(\frac12\right)=1. $$ So Heaviside function is not convex.

But the most serious problem lies in fact that $\theta'(x)=0$ almost everywhere. So if you are sitting at some point, there is no way for you to determine in which way to go. So basically you need to take all your Heaviside functions and check what happens to your loss functions if each of them is zero or one. Thus if you have $n$ Heaviside functions, you need to check $2^n$ scenarios, which is a nasty combinatorial problem.