Approximating the error function $\text{erf}(x)$ with a hyperbolic tangent function $\tanh\left(\dfrac{4x}{4-x^2}\right)$

Question

232 Views Asked by Bumbble Comm At 27 Mar 2026 - 6:00

I was plotting some functions and found that the function $$f(x) = \begin{cases} -1, & x\leq -2\\ 1, & x\geq 2 \\ \tanh\left(\dfrac{4x}{4-x^2}\right), & -2<x<2\end{cases}$$ looks very similar to the graph of the error function as plotted in Wolfram-Alpha.

However, the Wikipedia page for the error function does not list this approximation, so I guess that, despite the similarity of the plots, $f(x)$ is considered a "bad approximation".

Why is it considered a poor approximation?

Also, a simpler hyperbolic tangent could fit even better as an approximation: $$g(x) = \tanh\left(\dfrac{11}{9}x\right)$$

But Wikipedia lists no relation to the hyperbolic tangent at all, so:

Why are hyperbolic tangents considered bad approximations for the error function?
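
To put numbers on "looks very similar", the two candidates can be compared with `math.erf` on a grid. This is only a minimal sketch: the interval $[-4,4]$, the grid size, and the helper name `max_abs_error` are my own choices, not part of the question.

```python
import math

def f(x):
    # Piecewise candidate from the question, clamped to -1 / +1 outside (-2, 2).
    if x <= -2.0:
        return -1.0
    if x >= 2.0:
        return 1.0
    return math.tanh(4.0 * x / (4.0 - x * x))

def g(x):
    # Simpler candidate tanh(11x/9).
    return math.tanh(11.0 * x / 9.0)

def max_abs_error(approx, lo=-4.0, hi=4.0, n=8001):
    # Largest |erf(x) - approx(x)| on a uniform grid (the grid is an assumption).
    step = (hi - lo) / (n - 1)
    return max(abs(math.erf(lo + i * step) - approx(lo + i * step))
               for i in range(n))

err_f = max_abs_error(f)
err_g = max_abs_error(g)
print(f"max |erf - f| = {err_f:.4f}")
print(f"max |erf - g| = {err_g:.4f}")
```

Both errors come out at the level of a few hundredths, which is visually small on a plot but large compared with the approximations usually tabulated.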

The plots are available in Desmos.

Added later (after some answers)

After two interesting answers, I got the idea of testing the series expansion of $\tanh^{-1}(\text{erf}(x))$ given by Wolfram-Alpha. Just its first two terms give a simple approximation that works quite well: $$f(x)=\tanh\left(\frac{2}{\sqrt{\pi}}\left(x+\frac{(4-\pi)x^3}{3\pi}\right)\right)$$

In Desmos you can see that the maximum absolute difference is below $0.0007$. Note also that it does not need to be defined as a piecewise function.

Is this approximation good enough for approximating probabilities?

Since the Taylor expansion converges more slowly after these first two terms, one can sacrifice some accuracy near $x=0$ to reduce the maximum absolute difference (the functions are odd, so only $x\geq 0$ needs checking). To keep the number of terms small, I chose the following coefficient by trial and error:

$$g(x)=\tanh\left(\frac{2}{\sqrt{\pi}}\left(x+\frac{\pi}{35}x^3\right)\right)$$

which keeps the absolute differences below $0.0005$.

I do not know how to measure whether using $g(x)$ in place of the standard Gaussian CDF would introduce too much error when computing probabilities. What do you think?

My last attempt

By trial and error I found that: $$z(x)=\tanh\left(\frac{2}{\sqrt{\pi}}\left(x+\frac{11}{123}x^3\right)\right)$$

keeps the difference $|\text{erf}(x)-z(x)|<0.00036$. Maybe someone could find an optimal $\hat{a}$ that gives the best possible fit to $\text{erf}(x)$ of the form $\tanh\left(\frac{2}{\sqrt{\pi}}\left(x+\hat{a}x^3\right)\right)$.
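
The three cubic coefficients tried so far can be checked side by side. The following is a rough sketch using only the standard library. The grid on $[0,5]$ and the names `cubic_tanh` and `max_abs_error` are illustrative assumptions, and a grid maximum only approximates the true supremum.

```python
import math

K = 2.0 / math.sqrt(math.pi)

def cubic_tanh(x, a):
    # tanh( (2/sqrt(pi)) * (x + a*x^3) ), the family discussed above.
    return math.tanh(K * (x + a * x**3))

def max_abs_error(a, lo=0.0, hi=5.0, n=5001):
    # Both functions are odd, so scanning x >= 0 is enough.
    step = (hi - lo) / (n - 1)
    return max(abs(math.erf(lo + i * step) - cubic_tanh(lo + i * step, a))
               for i in range(n))

candidates = {
    "(4 - pi)/(3 pi)": (4.0 - math.pi) / (3.0 * math.pi),  # two-term series
    "pi/35": math.pi / 35.0,                               # g(x)
    "11/123": 11.0 / 123.0,                                # z(x)
}
errors = {name: max_abs_error(a) for name, a in candidates.items()}
for name, err in errors.items():
    print(f"a = {name:<16} max |erf - approx| = {err:.6f}")
```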

I also compared, in Wolfram-Alpha, probabilities of the standard Gaussian distribution computed with $z(x)$, and the maximum error appears to be below $0.018\%$, which is quite accurate.
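
One way to measure this directly: the standard Gaussian CDF equals $\tfrac12\left(1+\text{erf}\left(x/\sqrt2\right)\right)$, so replacing $\text{erf}$ by $z$ changes each probability by exactly half of $|\text{erf}-z|$ evaluated at $x/\sqrt2$. A small sketch of that check follows. The function names and the grid on $[-6,6]$ are my own assumptions.

```python
import math

K = 2.0 / math.sqrt(math.pi)
SQRT2 = math.sqrt(2.0)

def z(x):
    # Trial-and-error approximation of erf with coefficient 11/123.
    return math.tanh(K * (x + 11.0 / 123.0 * x**3))

def normal_cdf(x):
    # Exact standard Gaussian CDF via erf.
    return 0.5 * (1.0 + math.erf(x / SQRT2))

def normal_cdf_approx(x):
    # Same formula with erf replaced by z.
    return 0.5 * (1.0 + z(x / SQRT2))

xs = [-6.0 + 0.001 * i for i in range(12001)]
max_cdf_error = max(abs(normal_cdf(x) - normal_cdf_approx(x)) for x in xs)
print(f"max CDF error = {max_cdf_error:.6f} ({100 * max_cdf_error:.4f}%)")
```

Because of the factor $\tfrac12$, a bound of $0.00036$ on $|\text{erf}-z|$ translates into at most $0.00018$ on any single CDF value, in line with the $0.018\%$ figure.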

There are 3 best solutions below

Bumbble Comm On 26 Oct 2023 - 4:14

Your idea is good but I think that we can do a bit better using it.

Starting from scratch, if you want to write $$\text{erf}(x)\sim\tanh \left(\frac{a x}{b+c x+d x^2}\right)$$ use a series expansion around $x=0$.

Since $\text{erf}$ is odd, $c=0$ is required. Then $$\text{erf}(x)-\tanh \left(\frac{a x}{b+d x^2}\right)=x \left(\frac{2}{\sqrt{\pi }}-\frac{a}{b}\right)+x^3 \left(\frac{a^3+3 a b d}{3 b^3}-\frac{2}{3 \sqrt{\pi }}\right)+O\left(x^5\right)$$ and cancelling the first two terms gives $$b=\frac{\sqrt{\pi } }{2}a \qquad \text{and} \qquad d=-\frac{(4-\pi ) }{6 \sqrt{\pi }}a$$

So, from Taylor, $$\text{erf}(x)\sim \tanh\left(\frac {6 \sqrt{\pi } x } {3 \pi -(4-\pi ) x^2 } \right)$$ for which $$\int_{-\pi}^{+\pi} \Big( \text{erf}(x)-\tanh\left(\frac {6 \sqrt{\pi } x } {3 \pi -(4-\pi ) x^2 } \right) \Big)^2\,dx=5.52\times 10^{-5}$$

The leading error term of your formula is $$\left(\frac{2}{\sqrt{\pi }}-1\right) x$$ to be compared with $$\frac{\left(128-40 \pi -\pi ^2\right) }{45 \pi^{5/2}}x^5$$ for the formula above.

The one I wrote gives a maximum absolute error of $0.0060$
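
These two figures can be reproduced approximately with a few lines of standard-library Python. This is a sketch only: the composite Simpson rule and the grid sizes are my own choices, and the printed values depend slightly on them.

```python
import math

SQRT_PI = math.sqrt(math.pi)

def taylor_tanh(x):
    # tanh( 6*sqrt(pi)*x / (3*pi - (4 - pi)*x^2) ), valid for |x| below the pole
    # of the inner rational function at sqrt(3*pi/(4 - pi)) ~ 3.31.
    return math.tanh(6.0 * SQRT_PI * x / (3.0 * math.pi - (4.0 - math.pi) * x * x))

def diff(x):
    return math.erf(x) - taylor_tanh(x)

def simpson(func, lo, hi, n=2000):
    # Composite Simpson rule with an even number n of sub-intervals.
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0

norm = simpson(lambda x: diff(x) ** 2, -math.pi, math.pi)
max_err = max(abs(diff(-math.pi + i * 2.0 * math.pi / 4000)) for i in range(4001))
print(f"squared-error norm on [-pi, pi] = {norm:.3e}")
print(f"max absolute error on [-pi, pi] = {max_err:.4f}")
```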

Doing the same with the simplest formula gives $$\text{erf}(x)\sim\tanh \left(\frac{2 x}{\sqrt{\pi }}\right),$$ whose coefficient $2/\sqrt{\pi}\approx 1.128$ is close to the $\frac {11}9\approx 1.222$ that you proposed.

The advantage of all this is that the inverse of the error function can then be approximated in a very simple manner.
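
As an illustration of that remark, both formulas can be inverted in closed form: set $u=\tanh^{-1}(y)$ and solve for $x$ (a linear equation for the simplest formula, a quadratic for the rational one). The sketch below is my own derivation under that assumption, not something stated in the answer, and the function names are hypothetical.

```python
import math

SQRT_PI = math.sqrt(math.pi)

def erfinv_simple(y):
    # Inverts y = tanh(2x/sqrt(pi)); requires |y| < 1.
    return 0.5 * SQRT_PI * math.atanh(y)

def erfinv_rational(y):
    # Inverts y = tanh(6*sqrt(pi)*x / (3*pi - (4 - pi)*x^2)); requires |y| < 1.
    # With u = atanh(y) this is the quadratic A*x^2 + B*x + C = 0 below.
    u = math.atanh(y)
    A = (4.0 - math.pi) * u
    B = 6.0 * SQRT_PI
    C = -3.0 * math.pi * u
    disc = B * B - 4.0 * A * C
    # Numerically stable form of the root that passes through x = 0 at y = 0.
    return -2.0 * C / (B + math.sqrt(disc))

for y in (0.1, 0.5, 0.9, 0.99):
    x1, x2 = erfinv_simple(y), erfinv_rational(y)
    print(f"y={y}: simple -> erf={math.erf(x1):.4f}, rational -> erf={math.erf(x2):.4f}")
```

Feeding the result back through `math.erf` shows how far each inverse is from the requested value of $y$.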

Edit (just for fun)

In the same spirit, the next level would be $$\tanh\left(\frac{20 \sqrt{\pi } x(a+ b x^2)}{c+d x^2+f x^4 } \right)$$ where

$$a=-2688 \pi +840 \pi ^2+21 \pi ^3$$ $$ b=5632-2688 \pi +252 \pi ^2+11 \pi ^3$$ $$c=-26880 \pi ^2+8400 \pi ^3+210 \pi ^4$$ $$ d=92160 \pi -47040 \pi ^2+5040 \pi ^3+180 \pi ^4$$ $$f=-36864+30720 \pi -9184 \pi ^2+880 \pi ^3+39 \pi ^4$$

For the same norm this gives $3.37\times 10^{-7}$ ($164$ times smaller than the previous one), a maximum error of $0.0004$ (around $x=2$), and a leading error term of about $\frac {x^9}{10000}$.

Bumbble Comm On 26 Oct 2023 - 4:56

The approximation is bad because the tails are totally different.

The derivative of the error function is a Gaussian density, proportional to $e^{-x^2}$. The derivative of the hyperbolic tangent is the square of the hyperbolic secant, which is proportional to $(e^x + e^{-x})^{-2}$. For "large" $x$, this means the latter will be about $e^{-2x}$, but the former will be $e^{-x^2} \ll e^{-2x}$. As a result, the tails are much heavier for a density based on the hyperbolic tangent.

Let's denote $F(x) = \operatorname{erf}(x)$ and $G(x) = \tanh \frac{2x}{\sqrt{\pi}}$. Then $$f(x) = F'(x) = \frac{2}{\sqrt{\pi}}e^{-x^2}, \quad g(x) = G'(x) = \frac{2}{\sqrt{\pi}} \operatorname{sech}^2 \frac{2x}{\sqrt{\pi}}.$$
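
For readers without the plots, the ratios $F/G$ and $f/g$ of the functions just defined can be tabulated directly. This is a minimal sketch using only the standard library, and the sample points and grid are arbitrary choices of mine.

```python
import math

K = 2.0 / math.sqrt(math.pi)

def F(x):
    return math.erf(x)

def G(x):
    return math.tanh(K * x)

def f(x):
    # Derivative of erf.
    return K * math.exp(-x * x)

def g(x):
    # Derivative of tanh(K*x), i.e. K * sech^2(K*x).
    return K / math.cosh(K * x) ** 2

xs = [0.001 * i for i in range(1, 4001)]      # grid on (0, 4]
max_cdf_ratio = max(F(x) / G(x) for x in xs)
print(f"max F/G on (0, 4] = {max_cdf_ratio:.6f}")

for x in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(f"x={x:.1f}  F/G={F(x) / G(x):.6f}  f/g={f(x) / g(x):.3e}")
```

The first ratio stays within a few percent of $1$, while the second collapses toward $0$ as $x$ grows.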

Here is a plot of $F/G$: the ratio lies between $1$ and $1.040703873959\ldots$. Note that it never drops below $1$; i.e., $F > G$ for all $x > 0$. You might look at this and think, "a maximum of 4% error is not that bad."

But here is a plot of $f/g$, and this is much, much worse. As mentioned earlier, the tails are not comparable, which is why $\lim_{x \to \infty} \frac{f}{g} = 0$. In order to be a "good" approximation, the ratio of densities should be close to $1$ across the support.

To see how much better we can do, consider the Bürmann series $$B(x) = \frac{2}{\sqrt{\pi}} \operatorname{sgn}(x) \sqrt{1-e^{-x^2}} \left(1 - \frac{1}{12}(1 - e^{-x^2}) - \frac{7}{480}(1 - e^{-x^2})^2 - \frac{5}{896} (1 - e^{-x^2})^3 - \cdots \right)$$ The first four terms yield a plot of $F/B$ that you can easily tell is far superior. The plot of the ratio of derivatives $f/b$ is not perfect by any means, but it is clearly superior to that of the hyperbolic tangent. With more terms, it will improve further.
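
The four-term truncation of $B(x)$ is easy to evaluate numerically. The sketch below assumes exactly the four coefficients written above; the name `burmann_erf` and the grid on $[0,4]$ are mine.

```python
import math

# Coefficients of (1 - exp(-x^2))^k inside the bracket, k = 0..3.
COEFFS = (1.0, -1.0 / 12.0, -7.0 / 480.0, -5.0 / 896.0)

def burmann_erf(x):
    u = 1.0 - math.exp(-x * x)
    series = sum(c * u**k for k, c in enumerate(COEFFS))
    # copysign gives +1 at x = 0, which is harmless because sqrt(u) = 0 there.
    return math.copysign(1.0, x) * 2.0 / math.sqrt(math.pi) * math.sqrt(u) * series

xs = [0.001 * i for i in range(4001)]
max_err = max(abs(math.erf(x) - burmann_erf(x)) for x in xs)
print(f"B(1) = {burmann_erf(1.0):.6f}, erf(1) = {math.erf(1.0):.6f}")
print(f"max |erf - B| on [0, 4] = {max_err:.4f}")
```

With only four terms the truncated bracket does not sum exactly to $\sqrt{\pi}/2$ at $u=1$, so $B(x)$ levels off slightly above $1$ for large $x$.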

**Bumbble Comm** · Accepted Answer

In my first answer, I tried to stay as close as possible to your initial attempt.

Restarting from scratch, what we have is $$\tanh ^{-1}(\text{erf}(x))=t\sum_{n=0}^\infty a_n\,t^{2n} \qquad \text{where}\qquad t=\frac{2 }{\sqrt{\pi }}x$$ where the first coefficients are $$\left( \begin{array}{cc} n & a_n \\ 0 & 1 \\ 1 & \frac{4-\pi }{12} \\ 2 & \frac{96-40 \pi +3 \pi ^2}{480} \\ 3 & \frac{5760-3360 \pi +532 \pi ^2-15 \pi ^3}{40320} \\ 4 & \frac{645120-483840 \pi +116928 \pi ^2-9328 \pi ^3+105 \pi ^4}{5806080} \\ \end{array} \right)$$

With the partial sums $S_n=t\sum_{k=0}^{n} a_k\,t^{2k}$, the errors $$R_n=| \text{erf}(x)-\tanh (S_n)|$$ behave for small $t$ as $$R_1\sim\frac{t^5}{8744}\qquad R_2\sim \frac{t^7}{3946}\qquad R_3\sim \frac{t^9}{45963}\qquad R_4\sim \frac{t^{11}}{1082270}$$ Using the same (arbitrary) integration bounds as before, consider the norms $$\Phi_n=\int_{-\pi}^{+\pi} \Big( \text{erf}(x)-\tanh\left(S_n \right) \Big)^2\,dx$$ $$\Phi_1=6.3\times 10^{-7}\quad \Phi_2=4.4\times 10^{-7}\quad \Phi_3=1.7\times 10^{-7}\quad \Phi_4=1.5\times 10^{-9}$$

To improve the model, adding more terms is certainly one solution. But recasting the series as the $[2n+1,2n]$ Padé approximant $P_n$ is better. For example, $$P_1=\tanh\left(t \,\frac{a_1+(a_1^2-a_2)\,t^2 } {a_1- a_2\,t^2 }\right)$$ leads to a maximum error of $0.00053$.
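
The coefficients in the table can be evaluated numerically to see where the first denominators come from. The sketch assumes that $S_n$ is the partial sum $t\sum_{k=0}^{n}a_k t^{2k}$, in which case the leading error of $\tanh(S_n)$ is $|a_{n+1}|\,t^{2n+3}$. The names `A` and `S` are mine.

```python
import math

PI = math.pi

# a_0 .. a_4 exactly as tabulated above.
A = [
    1.0,
    (4 - PI) / 12,
    (96 - 40 * PI + 3 * PI**2) / 480,
    (5760 - 3360 * PI + 532 * PI**2 - 15 * PI**3) / 40320,
    (645120 - 483840 * PI + 116928 * PI**2 - 9328 * PI**3 + 105 * PI**4) / 5806080,
]

def S(n, x):
    # Partial sum t * sum_{k<=n} a_k t^(2k) with t = 2x/sqrt(pi).
    t = 2.0 * x / math.sqrt(PI)
    return t * sum(A[k] * t ** (2 * k) for k in range(n + 1))

# Leading error of tanh(S_n) is |a_{n+1}| * t^(2n+3); a_5 is not tabulated,
# so only n = 1, 2, 3 can be checked here.
for n in range(1, 4):
    print(f"R_{n} ~ t^{2 * n + 3} / {1.0 / abs(A[n + 1]):.0f}")

max_err_S1 = max(abs(math.erf(0.01 * i) - math.tanh(S(1, 0.01 * i)))
                 for i in range(301))
print(f"max |erf - tanh(S_1)| on [0, 3] = {max_err_S1:.6f}")
```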
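
A quick numerical check of $P_1$, written as a sketch with the values of $a_1$ and $a_2$ taken from the table above. The grid on $[0,5]$ is an assumption of mine, so the printed maximum is only an estimate.

```python
import math

PI = math.pi
A1 = (4 - PI) / 12
A2 = (96 - 40 * PI + 3 * PI**2) / 480   # slightly negative, so the denominator never vanishes

def pade_p1(x):
    # tanh( t * (a1 + (a1^2 - a2) t^2) / (a1 - a2 t^2) ) with t = 2x/sqrt(pi).
    t = 2.0 * x / math.sqrt(PI)
    return math.tanh(t * (A1 + (A1 * A1 - A2) * t * t) / (A1 - A2 * t * t))

xs = [0.001 * i for i in range(5001)]
max_err = max(abs(math.erf(x) - pade_p1(x)) for x in xs)
print(f"max |erf - P_1| on [0, 5] = {max_err:.5f}")
```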

Edit

The norm $$\Phi(\hat a)=\int_{-\infty}^{+\infty} \Bigg(\text{erf}(x)-\tanh\left(\frac{2}{\sqrt{\pi}}\left(x+\hat{a}x^3\right)\right)\Bigg)^2\, dx$$ is minimized at $\hat{a}=0.0896929$, where its value is $2.73 \times 10^{-7}$.
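
A minimum of this kind can be located with a one-dimensional search. The sketch below is not necessarily how the value above was obtained: it truncates the integral to $[-6,6]$ (where both functions are already equal to $\pm 1$ to machine precision), integrates with Simpson's rule, and runs a golden-section search on the bracket $[0.08, 0.10]$, assuming the norm is unimodal there.

```python
import math

K = 2.0 / math.sqrt(math.pi)

def phi(a, half_width=6.0, n=1200):
    # Simpson approximation of the squared-error integral on [-half_width, half_width].
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * h
        d = math.erf(x) - math.tanh(K * (x + a * x**3))
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total += weight * d * d
    return total * h / 3.0

def golden_section_min(func, lo, hi, tol=1e-6):
    # Standard golden-section search for a unimodal function on [lo, hi].
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - ratio * (hi - lo)
    d = lo + ratio * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = func(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = func(d)
    return 0.5 * (lo + hi)

a_best = golden_section_min(phi, 0.08, 0.10)
phi_best = phi(a_best)
print(f"a_hat ~ {a_best:.6f}, norm ~ {phi_best:.3e}")
```

Note that this minimizes the squared-error integral, which is a different criterion from the maximum absolute error used in the question, so the two "best" coefficients need not coincide.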
