Abstract
I'm trying to squeeze the first derivative of the sigmoid function, which is defined over an infinite range, into a finite range. The sigmoid function is:
$$f(x)=\frac{1}{1+e^{-x}}$$
This function is heavily used as an activation function for neural networks. Its integral (the softplus function) is closely related to ReLU, another activation function that is also heavily used in neural networks:
$$\int{f(x)}{dx}=\ln(1+e^{-x})-\ln(e^{-x})=\ln(1+e^{x})$$
Its derivative is useful in modeling and resembles the pdf of the normal distribution:
$$\frac{d}{dx}f(x)=\frac{e^{-x}}{(1+e^{-x})^{2}}=df(x)$$
It is also cheap to evaluate on a computer, because the derivative can be expressed in terms of the function itself: $df(x)=f(x)(1-f(x))$. But $df(x)$ is defined over an infinite range, which makes it unsuitable for some modeling tasks where the range should be finite. Fortunately, almost all of its mass lies inside $[-10, 10]$, and the values outside that range are very close to zero, as the following equation shows:
$$\int_{-\infty}^{\infty}{df(x)}{dx}-\int_{-10}^{10}{df(x)}{dx}=1-\frac{e^{10}-e^{-10}}{(1+e^{10})(1+e^{-10})}=0.00009079573740486882...$$
This means that the values outside $[-10, 10]$ have only a minor effect on the shape of the function when its range is squeezed from infinite to finite.
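As a quick sanity check of the number above, here is a minimal Python sketch. The helper names `sigmoid` and `tail_mass` are my own, and the closed form $2/(1+e^{a})$ is just the expression above simplified using $f(a)-f(-a)=1-2f(-a)$.

```python
import math

def sigmoid(x):
    """f(x) = 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def tail_mass(a):
    """Mass of df outside [-a, a]: 1 - (f(a) - f(-a)) = 2 / (1 + e^a)."""
    return 2.0 / (1.0 + math.exp(a))

# Direct evaluation: the integral of df over [-10, 10] is f(10) - f(-10).
direct = 1.0 - (sigmoid(10.0) - sigmoid(-10.0))
# Simplified closed form of the same quantity.
closed = tail_mass(10.0)

print(direct, closed)
```

Both values should agree with the figure quoted above to within floating-point rounding.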
So how do we squeeze the range? It can be done by using $arctanh(x)$ as the argument of the $df$ function. As $x$ approaches $\pm 1$, $arctanh(x)$ asymptotically approaches $\pm\infty$, so the infinite argument range of $df$ is mapped onto the finite interval $(-1, 1)$:
$$df(arctanh(x))=\frac{\sqrt{1-x^2}}{(1+\frac{\sqrt{1-x^2}}{x+1})^2(x+1)}$$
While this function doesn't look much like what was originally intended, it is easy to bring the shape closer by putting a factor in front of $arctanh(x)$, for example:
$$df(arctanh(x)*6)$$
Replacing the factor 6 with a general $n$ gives the final equation:
$$df(n*arctanh(x))=\frac{(1-x^2)^{\frac n2}}{(1+\frac{(1-x^{2})^{\frac n2}}{(x+1)^n})^2(x+1)^n}=F(x,n).$$
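To convince myself that the closed form is right, here is a small Python sketch that compares $F(x,n)$ with a direct evaluation of $df(n\cdot arctanh(x))$ at a few sample points. The sample points and the names `df`, `F` and `max_rel_gap` are my own choices for illustration.

```python
import math

def df(x):
    """Derivative of the sigmoid: e^{-x} / (1 + e^{-x})^2."""
    e = math.exp(-x)
    return e / (1.0 + e) ** 2

def F(x, n):
    """Closed form of df(n * arctanh(x)), valid for -1 < x < 1."""
    r = (1.0 - x * x) ** (n / 2.0) / (x + 1.0) ** n
    return r / (1.0 + r) ** 2

# Largest relative difference between the two expressions on the samples.
max_rel_gap = 0.0
for x in (-0.9, -0.5, 0.0, 0.3, 0.8):
    for n in (1.0, math.pi, 6.0):
        direct = df(n * math.atanh(x))
        max_rel_gap = max(max_rel_gap, abs(F(x, n) - direct) / direct)

print(max_rel_gap)
```

The printed gap should be at the level of floating-point rounding. Note that this form of `F` divides by $(x+1)^n$, so it should not be evaluated at $x=-1$ exactly.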
It's interesting to see how similar $4\cdot F(x,\pi)$ is to $\frac{\sin(x\pi+\frac{\pi}{2})}{2}+\frac 12$ on $[-1, 1]$, which could essentially serve as yet another way to approximate sine.
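The similarity can be measured with a short Python sketch that finds the largest pointwise gap between the two curves on a grid over $[-1, 1]$. The grid size and the names `four_F`, `target` and `max_gap` are my own. To stay finite at $x=\pm 1$, I evaluate $F$ through $t=\left(\frac{1-|x|}{1+|x|}\right)^{n/2}$, which is allowed because $df$ is an even function.

```python
import math

def four_F(x, n):
    """4 * F(x, n), written so that it stays finite at x = -1 and x = 1."""
    # df is even, so F(x, n) = F(|x|, n); then t lies in [0, 1].
    t = ((1.0 - abs(x)) / (1.0 + abs(x))) ** (n / 2.0)
    return 4.0 * t / (1.0 + t) ** 2

def target(x):
    """sin(pi*x + pi/2) / 2 + 1/2."""
    return 0.5 * math.sin(math.pi * x + 0.5 * math.pi) + 0.5

# Uniform grid on [-1, 1] with step 0.001.
xs = [i / 1000.0 for i in range(-1000, 1001)]
max_gap = max(abs(four_F(x, math.pi) - target(x)) for x in xs)

print(max_gap)
```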
Problem
While playing with $F(x,n)$ I noticed that it's hard to find values of $n$ that satisfy a given condition. For instance, I want to find the $n$ for which $4\cdot F(x,n)$ best approximates $\frac{\sin(x\pi+\frac{\pi}{2})}{2}+\frac 12$. Let's define the error function of the approximation:
$$err(x, n) = (4·df(n·arctanh(x))-\frac 12·sin(x\pi+\frac{\pi}{2})-\frac 12)^2$$
or
$$err(x, n) = \left(\frac{4(1-x^2)^{\frac n2}}{(1+\frac{(1-x^{2})^{\frac n2}}{(x+1)^n})^2(x+1)^n}-\left(\frac{\sin(x\pi+\frac{\pi}{2})}{2}+\frac 12\right)\right)^2.$$
Then we would need to find $n$ such that:
$$\frac{d}{dn}\int_{-1}^{1}{err(x, n)}{dx}=0.$$
Please note that $n$ is a real number here.
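For reference, the condition can at least be attacked numerically. The sketch below does not differentiate anything analytically. It approximates the integral with composite Simpson's rule and then minimizes over $n$ with a golden-section search. The search bracket $[2, 5]$, the number of Simpson subintervals and all function names are assumptions of mine, and the search only returns a local minimum inside the bracket.

```python
import math

def four_F(x, n):
    """4 * F(x, n), evaluated via |x| so it stays finite at x = -1 and x = 1."""
    t = ((1.0 - abs(x)) / (1.0 + abs(x))) ** (n / 2.0)
    return 4.0 * t / (1.0 + t) ** 2

def err(x, n):
    """Squared pointwise error against sin(pi*x + pi/2) / 2 + 1/2."""
    target = 0.5 * math.sin(math.pi * x + 0.5 * math.pi) + 0.5
    return (four_F(x, n) - target) ** 2

def simpson(func, a, b, m=2000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    total = func(a) + func(b)
    for i in range(1, m):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0

def integrated_error(n):
    """Numerical approximation of the integral of err(x, n) over [-1, 1]."""
    return simpson(lambda x: err(x, n), -1.0, 1.0)

def golden_section_min(func, a, b, tol=1e-6):
    """Golden-section search for a local minimum of func on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)

# Assumed bracket: the best n is expected to lie somewhere near pi.
n_best = golden_section_min(integrated_error, 2.0, 5.0)
print(n_best, integrated_error(n_best), integrated_error(math.pi))
```

This gives a number, but it does not explain how to handle the derivative with respect to $n$ analytically, which is what I am really after.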
This is where things start getting messy. The integral of $F(x, n)$ over $x$ doesn't seem to have an analytical (closed) form, and it is not clear to me how to differentiate such an integral, since the integration variable ($x$) and the differentiation variable ($n$) are different. I don't understand how to differentiate an integral like this, so I thought it would be better to ask here.
Thank you in advance!







