Recently, while experimenting with various sigmoid functions, I noticed something I found rather odd.
One well-known non-elementary sigmoid function is the error function,
$$ f(x) := \text{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x \exp{(-t^2)}\,dt $$
The following is another sigmoid function which, unlike the error function, is a relatively simple elementary function:
$$ g(x) := \text{sgn}(x) \sqrt{1 - \exp{(-ax^2)}} $$
Choose $a=4/\pi$, so that $f'(0) = g'(0) = 2/\sqrt{\pi}$. Then, as this Desmos graph shows, $g(x)$ is actually a surprisingly good approximation of $f(x)$. In fact, both the absolute and relative errors are less than 1 part in 100:
$$ \begin{gathered} \max_{x\in\mathbb{R}}{|g(x)-f(x)|} \approx 0.0063 \\ \max_{x\neq 0}{\left|\frac{g(x)}{f(x)}-1\right|} \approx 0.0070 \end{gathered} $$
Admittedly, there are much better elementary approximations of $\text{erf}(x)$ out there. (In particular, Wikipedia cites an approximation credited to Sergei Winitzki, which looks quite similar in form to $g(x)$.) However, many of these approximations are calibrated with several seemingly arbitrary parameters.
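For anyone who wants to reproduce the error figures quoted above without Desmos, here is a minimal sketch using only the Python standard library. The grid range, the grid size, and the helper names `g` and `max_errors` are my own choices for illustration. Since both functions are odd, it only scans $x>0$.

```python
import math

A = 4 / math.pi  # the choice a = 4/pi from the text


def g(x, a=A):
    """Elementary sigmoid sgn(x) * sqrt(1 - exp(-a x^2)).

    -expm1(-u) equals 1 - exp(-u) but is more accurate for small u.
    """
    return math.copysign(math.sqrt(-math.expm1(-a * x * x)), x)


def max_errors(x_max=6.0, n=60000):
    """Scan a uniform grid on (0, x_max] (an arbitrary choice).

    Returns (max absolute error, max relative error) of g against erf.
    """
    max_abs = 0.0
    max_rel = 0.0
    for i in range(1, n + 1):
        x = x_max * i / n
        f = math.erf(x)
        diff = abs(g(x) - f)
        max_abs = max(max_abs, diff)
        max_rel = max(max_rel, diff / f)
    return max_abs, max_rel


if __name__ == "__main__":
    abs_err, rel_err = max_errors()
    print(f"max |g - f|   ~ {abs_err:.4f}")
    print(f"max |g/f - 1| ~ {rel_err:.4f}")
```

A grid scan only gives a lower bound on the true maxima, but with a grid this fine the printed values should land close to the figures above.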
My question is this: is there a reason why, despite its simplicity, $g(x)$ is still a decent approximation of $\text{erf}(x)$? Perhaps some underlying relationship or aspect of similarity between the two functions? Some way of understanding one in terms of the other, that brings to light the similarity between them?
One thing I thought to investigate was the relationship between the derivatives $f'(x)$ and $g'(x)$ of these functions. From the definition of $\text{erf}(x)$, we have $$ f'(x) = \sqrt{\frac{4}{\pi}} \exp{(-x^2)} $$
Differentiating $g(x)$, meanwhile, gives $$ g'(x) = \text{sgn}(x) \frac{\tfrac{4}{\pi}x\exp{(-\tfrac{4}{\pi}x^2)}}{\sqrt{1 - \exp{(-\tfrac{4}{\pi}x^2)}}} $$
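As a sanity check on the two derivative formulas, here is a rough sketch that compares each closed form against a central finite difference and prints the ratio $g'(x)/f'(x)$ at a few sample points. The step size `h`, the sample points, and the helper names are assumptions made for illustration. The closed form for $g'$ is only evaluated at $x \neq 0$, since at $x=0$ the expression is of the form $0/0$ and has to be read as a limit.

```python
import math

A = 4 / math.pi  # a = 4/pi, as chosen in the text


def f_prime(x):
    """Derivative of erf: sqrt(4/pi) * exp(-x^2)."""
    return math.sqrt(4 / math.pi) * math.exp(-x * x)


def g(x):
    """g(x) = sgn(x) * sqrt(1 - exp(-a x^2)), using expm1 for accuracy."""
    return math.copysign(math.sqrt(-math.expm1(-A * x * x)), x)


def g_prime(x):
    """Closed form for g'(x), valid for x != 0.

    sgn(x) * x is written as abs(x).
    """
    u = A * x * x
    return A * abs(x) * math.exp(-u) / math.sqrt(-math.expm1(-u))


def central_diff(func, x, h=1e-5):
    """Second-order central finite difference (h is an arbitrary choice)."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (0.001, 0.5, 1.0, 1.5, 2.0, 3.0):
        print(
            f"x={x:5.3f}  f'={f_prime(x):.6f}  g'={g_prime(x):.6f}  "
            f"ratio={g_prime(x) / f_prime(x):.4f}  "
            f"finite diff of g={central_diff(g, x):.6f}"
        )
```

Evaluating `g_prime` at a very small positive `x` also gives a quick numerical confirmation that $g'(x) \to \sqrt{4/\pi}$ as $x \to 0$, which is the matching condition used to pick $a$.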
The expressions certainly seem to share several "puzzle pieces": the coefficient $4/\pi$, exponentials of negated squares, etc. (The presence of $\text{sgn}(x)$ is somewhat immaterial, as one can just as easily ignore it and consider both functions on the domain $x>0$ instead.) I had thought that maybe I could replace some piece of the expression for $g'(x)$ with something else that approximates it well, and then simplify the result (or take a limit) to arrive at $f'(x)$, thereby understanding the similarity between $g(x)$ and $f(x)$ in terms of a relationship between more basic pieces of their derivatives. However, I am unable to see any way to do this.
A final aside: I realize there is some inherent subjectivity to this question. In particular, what makes an approximation "close", or an expression "simple", certainly depends quite heavily on the context. This is my first question, and I do hope it is not too subjective for this forum--if it is, please accept my apologies.