So I’m studying neural networks, and I came across these activation functions. Of course I could just ignore the mathematical specifics, but they seemed interesting, especially since I like math. So does anyone know what the formulas for these functions are? I did find an article saying ReLU is $\max(0, x)$, but I thought that’s only a thing in programming? Is it a thing in math too?
What are the formulas for the ReLU, tanh, and sigmoid activation functions?
Asked by Bumbble Comm
Assume you have an input $x$ at a node of a network.
The ReLU activation function is just a piecewise function that returns $0$ if $x < 0$ and returns $x$ if $x \geq 0$. Equivalently, it is the maximum over the set with elements $0$ and $x$, which we write as $\max\{0, x\}$. For example, if $x = -1.733928$, then $\max\{0, -1.733928\} = 0$, since $0 > -1.733928$. And the $\max$ function is used in mathematics as well.
The $\tanh$ activation function is defined as $$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$
The $\mathrm{sigmoid}$ function, denoted $\sigma$, is defined as $$\sigma(x) = \frac{1}{1 + e^{-x}}$$
The domains of $\tanh$ and $\sigma$ are both $\mathbb{R}$, while their ranges are $(-1, 1)$ and $(0, 1)$ respectively.
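To make the three definitions concrete, here is a minimal sketch of all of them in plain Python (the function names are just illustrative choices, not anything standard from a neural-network library):

```python
import math

def relu(x):
    # max{0, x}: returns 0 for negative inputs, x otherwise
    return max(0.0, x)

def tanh(x):
    # (e^x - e^{-x}) / (e^x + e^{-x})
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def sigmoid(x):
    # 1 / (1 + e^{-x})
    return 1.0 / (1.0 + math.exp(-x))

print(relu(-1.733928))  # 0.0, matching the worked example above
print(relu(2.5))        # 2.5
print(sigmoid(0.0))     # 0.5, the midpoint of sigmoid's range (0, 1)
print(tanh(0.0))        # 0.0, the midpoint of tanh's range (-1, 1)
```

For moderate inputs you can check the stated ranges numerically: `tanh` stays strictly inside $(-1, 1)$ and `sigmoid` strictly inside $(0, 1)$.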
ReLU means "rectified linear unit"; it's just as you wrote, $\text{ReLU}(x)=\max(0,x)$, and is used to discard negative values. In math you can define, for any real function, its "positive part" and its "negative part" by $f^+(x)=\max(0,f(x))$ and $f^-(x)=\max(0,-f(x))$, so that finally $f=f^+-f^-$ and $|f|=f^++f^-$. The name "ReLU" is more specifically used in AI.
Tanh is the "hyperbolic tangent" and is given by $$\tanh(x)=\dfrac{e^x-e^{-x}}{e^x+e^{-x}}$$ This is an increasing function whose limits at $-\infty$ and $+\infty$ are $-1$ and $1$.
The sigmoid is an increasing function whose limits at $-\infty$ and $+\infty$ are $0$ and $1$, often used in fuzzy logic. Its definition is $$\sigma(x)=\dfrac{1}{1+e^{-x}}$$
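The positive/negative-part decomposition mentioned above can be checked numerically. This is a small sketch in Python; the function $f(x) = x^3 - x$ is just a hypothetical example chosen because it takes both signs:

```python
def positive_part(f):
    # f^+(x) = max(0, f(x))
    return lambda x: max(0.0, f(x))

def negative_part(f):
    # f^-(x) = max(0, -f(x))
    return lambda x: max(0.0, -f(x))

f = lambda x: x**3 - x  # hypothetical example function taking both signs
fp = positive_part(f)
fn = negative_part(f)

for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    # f = f^+ - f^-  and  |f| = f^+ + f^-
    assert abs(fp(x) - fn(x) - f(x)) < 1e-12
    assert abs(fp(x) + fn(x) - abs(f(x))) < 1e-12
```

Note that $f^+$ is exactly $\text{ReLU} \circ f$, which is one way to see that ReLU is an ordinary mathematical object and not just a programming construct.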