Fitting an activation function with tanh


I'm trying to fit an activation function with tanh via: F = aa3 + aa2 * np.tanh(aa0 * x + aa1)

However, the original data (blue) is peculiar in that it needs an asymmetric curvature, sharp at the base and loose at the top, which the fit (red) is unable to capture. [Figure: tanh fit]

Is there a better function you might recommend? Any general advice to think on going forward would be incredibly appreciated!


There are 2 best solutions below

5
On BEST ANSWER

To obtain an asymmetric curvature, one can use a function of this kind (a generalized logistic equation) : $$y(x)=y_{min}+\frac{y_{max}-y_{min}}{1+a\:e^{-b\,x}+\alpha\:e^{-\beta\, x}}$$

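As a minimal sketch of how such a fit might be set up, the snippet below uses `scipy.optimize.curve_fit` on synthetic data. The data, the noise level, the starting values `p0` and the bounds are all assumptions made for illustration (the original data set is not available here); with real data, the starting values usually need some care.

```python
import numpy as np
from scipy.optimize import curve_fit

def gen_logistic(x, y_min, y_max, a, b, alpha, beta):
    """Generalized logistic function with two exponential terms."""
    return y_min + (y_max - y_min) / (1.0 + a * np.exp(-b * x) + alpha * np.exp(-beta * x))

# Synthetic stand-in for the real data (assumption: not the asker's data set).
rng = np.random.default_rng(0)
x = np.linspace(-4.0, 8.0, 300)
y = gen_logistic(x, 5.99, 11.99, 0.006, 3.39, 1.66, 0.67) + rng.normal(0.0, 0.02, x.size)

# Rough starting values chosen by eye (assumption); bounds keep the
# exponential coefficients and rates non-negative.
p0 = (6.0, 12.0, 0.01, 3.0, 1.5, 0.7)
lower = (-np.inf, -np.inf, 0.0, 0.0, 0.0, 0.0)
upper = (np.inf, np.inf, np.inf, 10.0, np.inf, 10.0)

popt, pcov = curve_fit(gen_logistic, x, y, p0=p0, bounds=(lower, upper), max_nfev=5000)

# Root-mean-square residual of the fitted curve.
rms = np.sqrt(np.mean((gen_logistic(x, *popt) - y) ** 2))
print("fitted parameters:", popt)
print("RMS residual:", rms)
```

The individual parameters can be poorly determined (several combinations give almost the same curve), so it is safer to judge the fit by its residuals than by the parameter values themselves.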

2
On

Finding the slope of the linear portion of the curve is another problem. That is why I post a second answer.

FIRST METHOD (arduous)

One uses non-linear regression software to fit the equation $\quad y(x)=y_{min}+\frac{y_{max}-y_{min}}{1+a\:e^{-b\,x}+\alpha\:e^{-\beta\, x}}\quad$ to the data. I didn't do it, in order to save time. Instead I used your fitting result (where $a_1, b_1, a_2, b_2$ correspond to $a, b, \alpha, \beta$ in the equation) : $y_{min} = 5.99\:,\: y_{max} = 11.99\:,\: a_1 = 0.006\:,\: b_1 = 3.39\:,\: a_2 = 1.66\:,\: b_2 = 0.67$

Then one computes the derivatives $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$.

Solving the equation $\frac{d^2y}{dx^2}=0$ leads to the abscissa of the inflexion point. Then one computes the ordinate, and the slope from $\frac{dy}{dx}$.

Using math software, the result is : $$\begin{cases} x_i=0.733087 \\ y_i=8.965789 \\\frac{dy}{dx}=y'_i=1.006941\end{cases}$$ The equation of the tangent at the inflexion point is : $y(x)=y_i+(x-x_i)y_i'$ as drawn on the next figure :

[Figure: the curve with its tangent at the inflexion point]
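The first method can be sketched numerically as follows. Writing $D(x)=1+a\,e^{-b x}+\alpha\,e^{-\beta x}$, one has $\frac{dy}{dx}=-(y_{max}-y_{min})\frac{D'}{D^2}$ and $\frac{d^2y}{dx^2}=(y_{max}-y_{min})\frac{2D'^2-D\,D''}{D^3}$. The snippet uses the parameter values quoted above; the root bracket $[0, 2]$ is an assumption, chosen by inspection to isolate the sign change of the second derivative in the steep part of the curve.

```python
import numpy as np
from scipy.optimize import brentq

# Parameter values quoted in the answer.
y_min, y_max = 5.99, 11.99
a, b, alpha, beta = 0.006, 3.39, 1.66, 0.67

def D(x):
    return 1.0 + a * np.exp(-b * x) + alpha * np.exp(-beta * x)

def dD(x):
    return -(a * b * np.exp(-b * x) + alpha * beta * np.exp(-beta * x))

def d2D(x):
    return a * b**2 * np.exp(-b * x) + alpha * beta**2 * np.exp(-beta * x)

def y(x):
    return y_min + (y_max - y_min) / D(x)

def dy(x):
    return -(y_max - y_min) * dD(x) / D(x) ** 2

def d2y(x):
    return (y_max - y_min) * (2.0 * dD(x) ** 2 - D(x) * d2D(x)) / D(x) ** 3

# Bracket [0, 2] is an assumption: d2y changes sign once inside it.
x_i = brentq(d2y, 0.0, 2.0)
y_i = y(x_i)
slope_i = dy(x_i)

def tangent(x):
    """Tangent line at the inflexion point."""
    return y_i + (x - x_i) * slope_i

print(x_i, y_i, slope_i)
```

The printed values can be compared with $x_i$, $y_i$ and $y'_i$ given above.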

Note : One can see that the tangent is a very good fit only for a few points close to the inflexion point. It is not a good fit for the linear portion of the curve. So the above value of the slope can be considered a rough approximation.

SECOND METHOD :

Let $\begin{cases} y_m=y_{min}+C(y_{max}-y_{min}) \\ y_M=y_{max}-C(y_{max}-y_{min}) \end{cases}\quad$ with, for example, $\quad C=0.2\quad \begin{cases}y_m=7.19 \\y_M=10.79 \end{cases}$

One determines the corresponding range of the index $k$ of the data points, i.e. the points whose ordinate lies between $y_m$ and $y_M$ :

Result : $\quad k_{min}=490\quad,\quad k_{max}=530$ .

Then one proceeds to a linear regression on the range $k_{min}\leq k\leq k_{max}$ . $$y=Ax+B$$

Result : $\quad A=0.892462 \quad,\quad B= 8.292967$ .

The fitted straight line is drawn on the next figure.

[Figure: the fitted straight line drawn over the data]

One can see that the fit is better than above over a larger range. The drawback is that the method is a bit subjective, because the result depends slightly on the factor $C$, which has to be chosen reasonably ( about $0.1<C<0.3$ ).
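The second method can be sketched as below. Since the original data are not available here, the points are sampled from the fitted curve itself (an assumption), so the resulting $A$ and $B$ are not expected to reproduce the values above exactly. The straight line is fitted with `numpy.polyfit`.

```python
import numpy as np

def gen_logistic(x, y_min, y_max, a, b, alpha, beta):
    return y_min + (y_max - y_min) / (1.0 + a * np.exp(-b * x) + alpha * np.exp(-beta * x))

# Stand-in data sampled from the fitted curve (assumption: not the real data).
y_min, y_max = 5.99, 11.99
x = np.linspace(-4.0, 8.0, 600)
y = gen_logistic(x, y_min, y_max, 0.006, 3.39, 1.66, 0.67)

# Thresholds that cut off a fraction C of the range at each end.
C = 0.2
y_m = y_min + C * (y_max - y_min)
y_M = y_max - C * (y_max - y_min)

# Keep only the points in the central band, then fit y = A x + B.
mask = (y >= y_m) & (y <= y_M)
A, B = np.polyfit(x[mask], y[mask], 1)

print("y_m, y_M:", y_m, y_M)
print("slope A:", A, "intercept B:", B)
```

Re-running with a few values of `C` between 0.1 and 0.3 gives a quick impression of how sensitive the slope is to that choice.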

Note : An idea to make it less subjective would be to compute $x_m$ and $x_M$ instead of choosing the factor $C$. This might be done with the method given on pages 17-19 of https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf . Since I didn't test it with your data, I cannot recommend it.