In a certain sense, can "Piecewise Linear" be Interpreted as "Non-Linear"?


I was looking at the following function (called "ReLU"):

[Image: graph of the ReLU function]

I am trying to understand why this function ("ReLU") is considered non-linear when it looks "piecewise linear" (and even has the term "linear" in its name):

  • Can someone please explain why the "ReLU" function is described as non-linear, when it seems to be linear in appearance? Is it possible that the individual "pieces" of the ReLU function are linear, but the entire function itself is somehow non-linear?

  • When a function is defined in "pieces", can we still determine whether the entire function is "convex" or "non-convex", or can we only label the individual pieces as convex or non-convex?

Thanks!


There is 1 best solution below


There are two competing definitions of a linear function $f:\Bbb R\to\Bbb R$, and ReLU satisfies neither of them.

The first definition most are exposed to is that a linear function is a polynomial of degree at most 1, i.e. a function of the form

$$ f(x) = ax + b $$

(As Brian Borchers points out in the comments, beyond elementary algebra this type of function is usually called affine, to avoid conflict with the second definition below.)

The second definition comes from linear algebra, and is a restriction on this class of functions. We say a function is linear iff

$$ f(x+\alpha y) = f(x) + \alpha f(y) $$

for $x,y$ from some vector space and $\alpha$ from the underlying field. In this case, the vector space and the underlying field are both $\Bbb R$, and this implies the form

$$ f(x) = ax $$

for all such functions.

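Any attempt to put ReLU into either form results in a function that agrees with ReLU on only one half-line. To agree on the left half-line, we need $f(x) \equiv 0$ (that is, $a = 0$ and $b = 0$); on the right half-line, we need $f(x) = x$ (that is, $a = 1$ and $b = 0$). No single choice of $a$ satisfies both requirements.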
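A quick numerical check makes the failure concrete. The sketch below assumes the standard definition $\operatorname{ReLU}(x) = \max(0, x)$ (the original plot is not reproduced here); the helper name `satisfies_linearity_at` is made up for illustration.

```python
def relu(x: float) -> float:
    """Standard ReLU, assumed here to be max(0, x)."""
    return max(0.0, x)


def satisfies_linearity_at(f, x: float, y: float, alpha: float = 1.0) -> bool:
    """Test f(x + alpha*y) == f(x) + alpha*f(y) at a single (x, y, alpha)."""
    return f(x + alpha * y) == f(x) + alpha * f(y)


# Both points on the right half-line: ReLU behaves like f(x) = x there.
same_side = satisfies_linearity_at(relu, 1.0, 2.0)

# Points on opposite sides of 0: relu(1 + (-1)) = 0, but relu(1) + relu(-1) = 1.
opposite_sides = satisfies_linearity_at(relu, 1.0, -1.0)

# Affine sense: a single f(x) = ax + b would need the same slope on both sides of 0.
slope_left = relu(0.0) - relu(-1.0)
slope_right = relu(1.0) - relu(0.0)

print(same_side, opposite_sides)  # True False
print(slope_left, slope_right)    # 0.0 1.0
```

One failing pair is enough to rule out linearity in the second sense, and the mismatched slopes rule out the affine form.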

Because there is a partition of the line into intervals such that ReLU is linear (in both senses) on the interior of each interval, we say that ReLU is piecewise linear. Any linear function is trivially piecewise linear, but the converse fails, as we see here with ReLU.

So, to directly answer your question, "piecewise linear" should not be interpreted as "nonlinear" in general, but there are piecewise linear functions which are nonlinear (and those which are linear).


As with linearity, we can discuss convexity in both global and piecewise contexts.

A function is called convex on some convex domain (for example, an interval) when every pair of points $x,y$ in that domain satisfies

$$ f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y) $$

for all $\theta\in [0,1]$.

Linear functions (in both senses) are convex, and so piecewise linear implies piecewise convex. Thus, ReLU is (at least) piecewise convex. However, we can directly verify that ReLU is globally convex as well by substituting the definition into the above inequality.
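
The substitution can be written out as follows, as a sketch assuming the standard definition $\operatorname{ReLU}(x) = \max(0, x)$; the shorthand $r$ is introduced here only for brevity. For every $x$ we have $x \leq r(x)$ and $0 \leq r(x)$. Since $\theta$ and $1-\theta$ are both nonnegative for $\theta\in[0,1]$, it follows that

$$ \theta x + (1-\theta)y \leq \theta r(x) + (1-\theta)r(y) \quad\text{and}\quad 0 \leq \theta r(x) + (1-\theta)r(y). $$

The right-hand side therefore bounds both arguments of the maximum, so

$$ r(\theta x + (1-\theta)y) = \max\bigl(0,\ \theta x + (1-\theta)y\bigr) \leq \theta r(x) + (1-\theta)r(y), $$

which is exactly the convexity inequality on all of $\Bbb R$.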