Differentiable "Signum" Function or "Step" Function for Gradient Descent


We are working on an optimization problem that involves using thresholds in a real-world decision algorithm. My colleague and I are stumped on finding a differentiable mathematical function that maps any number less than $0$ to $0$ (or $-1$) and any number greater than or equal to $0$ to $1$. The sign operation (https://en.wikipedia.org/wiki/Sign_(mathematics)) is non-differentiable and, in our case, not usable. Other attempts return numbers extremely close to $1$ or $0$, but they become non-differentiable again once a floor or ceiling operation is applied. If any such function or combination of functions exists, a pointer in the right direction would be immensely appreciated.


There are 2 best solutions below

Answer 1 (best answer)

No differentiable function $f$ can satisfy $f(x) = 0$ for $x<0$ and $f(x) = 1$ for $x\geq 0$: such a function has a jump at $x = 0$, and a differentiable function must be continuous. What you may be interested in are sigmoid functions, which approximate a function with these properties and are often used in, say, machine learning for exactly your purpose of a differentiable approximation to the function you want.

One example is the logistic function $$f(x) = \frac{1}{1+e^{-x}},$$ often used in machine learning. Note that we can also define the family of functions $$f_a(x) = \frac{1}{1+e^{-ax}}$$ for $a > 0$. Then as $a\to\infty$, $f_a$ converges pointwise to a function that is $0$ for $x< 0$ and $1$ for $x\geq 0$.
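To make the limiting behavior concrete, here is a minimal Python sketch of $f_a$. The function name `smooth_step` and the sample values of $a$ are my own choices for illustration, not part of any library. It branches on the sign of $ax$ so that `math.exp` is only ever called on a non-positive argument, which avoids overflow when $a$ is large.

```python
import math

def smooth_step(x: float, a: float = 1.0) -> float:
    """Logistic approximation f_a(x) = 1 / (1 + exp(-a*x)) to the 0/1 step."""
    z = a * x
    if z >= 0:
        # exp(-z) <= 1 here, so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-z))
    # For z < 0, use the algebraically equivalent form exp(z) / (1 + exp(z)).
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Larger a pushes values on either side of zero toward 0 and 1.
for a in (1.0, 10.0, 100.0):
    print(a, smooth_step(-0.1, a), smooth_step(0.1, a))
```

The printed values on the left of zero shrink toward $0$ and those on the right grow toward $1$ as $a$ increases, while the function stays smooth for every finite $a$.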

If instead you want a function that is $-1$ for $x<0$ and $1$ for $x\geq 0$, then you can try the hyperbolic tangent $$ g_a(x) = \tanh(ax),$$ again for $a>0$. Then again, as $a\to\infty$, $\tanh (ax)$ converges pointwise to what you want.
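The same idea in code, as a rough sketch: `smooth_sign` is a hypothetical helper name that simply wraps the standard library's `math.tanh`. The final check uses the identity $\tanh(z) = 2\sigma(2z) - 1$, where $\sigma$ is the logistic function, so $g_a = 2f_{2a} - 1$ and the two families are rescalings of each other.

```python
import math

def smooth_sign(x: float, a: float = 1.0) -> float:
    """tanh approximation g_a(x) = tanh(a*x) to the -1/+1 sign function."""
    return math.tanh(a * x)

# Larger a pushes values on either side of zero toward -1 and +1.
for a in (1.0, 10.0, 100.0):
    print(a, smooth_sign(-0.1, a), smooth_sign(0.1, a))

# Check g_a(x) = 2 * f_{2a}(x) - 1 at one sample point (values chosen arbitrarily).
x, a = 0.3, 2.0
f_2a = 1.0 / (1.0 + math.exp(-2.0 * a * x))
print(abs(smooth_sign(x, a) - (2.0 * f_2a - 1.0)) < 1e-12)
```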

Note that there is a slight subtlety that I have deliberately left out above: for all $a$, we actually have $f_a(0) = 1/2$ and $g_a(0) = 0$, which isn't exactly what we want. But depending on your application, it is numerically unlikely that the argument of $f_a$ or $g_a$ will ever be exactly zero, so this usually is not an issue. If this is an issue, then you'll have to explicitly define $f_a(0) = 1$ or $0$ (respectively, $g_a(0) = -1$ or $1$) depending on your application, at the cost of continuity at that single point.

Below, I've plotted $f_a(x)$ in red and $g_a(x)$ in green, for $a = 3$, so you can get a sense of the behavior of these functions.

Answer 2

The logistic function $\frac{1}{1+e^{-kx}}$ is a commonly used approximation. You can make $k$ large to approximate your desired function increasingly well.
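One caveat when choosing $k$ for gradient descent, sketched below with my own choice of names and sample values: the derivative of the logistic function is $k\,f(x)\,(1 - f(x))$, which peaks at $k/4$ at $x = 0$ and decays exponentially away from it. A very large $k$ therefore gives a sharp step but almost no gradient signal unless $x$ is already close to the threshold.

```python
import math

def logistic(x: float, k: float) -> float:
    """f(x) = 1 / (1 + exp(-k*x)), written to avoid overflow in exp."""
    z = k * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logistic_grad(x: float, k: float) -> float:
    """Derivative of f with respect to x: k * f(x) * (1 - f(x)).

    Uses 1 - f(x) = f(-x) so that tiny gradients are not lost to
    floating-point cancellation when f(x) rounds to exactly 1.
    """
    return k * logistic(x, k) * logistic(-x, k)

# Gradient at the threshold (x = 0) versus one unit away (x = 1).
for k in (1.0, 10.0, 100.0):
    print(k, logistic_grad(0.0, k), logistic_grad(1.0, k))
```

In practice this suggests picking a moderate $k$, or starting small and increasing it gradually, though the right choice depends on the scale of your inputs.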