Why does sigmoid function use e instead of another constant?

Question

1.8k Views Asked by Bumbble Comm At 01 Apr 2026 - 3:36

I know this has already been asked, but I'm just curious why the sigmoid function

P(y) = 1 / ( 1 + e ^ -y )

uses e and not, for instance, pi or 1. Does it have a better shape (or some special characteristics) with e than with another constant, or is it just that the first guy who found the formula arbitrarily decided to use e?

Update

Thanks for all your answers. It makes sense to choose e then. I also made some graphs for anyone interested in how the function looks with other numbers: http://lingtalfi.com/img/math/sigmoid.png

There we can clearly see that the greater the number, the more "binary" the curve looks.

Original Q&A

Bumbble Comm On 21 Apr 2019 - 10:52

One convenient property of the sigmoid function $P$ is that $$ \tag{1} P'(y) = P(y) (1 - P(y)), $$ as you can easily check. Thanks to this property, the formula for the gradient works out nicely in logistic regression. Formula (1) is quite elegant.
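Formula (1) can also be checked numerically. The following is a minimal sketch, assuming a central finite difference is an acceptable stand-in for the exact derivative; the helper names `sigmoid` and `numeric_derivative` and the step size `h` are chosen here for illustration and are not part of the answer.

```python
import math


def sigmoid(y: float) -> float:
    """Logistic sigmoid P(y) = 1 / (1 + e^(-y))."""
    return 1.0 / (1.0 + math.exp(-y))


def numeric_derivative(f, y: float, h: float = 1e-5) -> float:
    """Central finite-difference approximation of f'(y) (illustrative step size)."""
    return (f(y + h) - f(y - h)) / (2.0 * h)


# Compare the finite-difference slope with P(y) * (1 - P(y)) at a few points.
for y in (-3.0, -0.5, 0.0, 1.0, 4.0):
    p = sigmoid(y)
    slope = numeric_derivative(sigmoid, y)
    print(f"y={y:+.1f}  finite difference={slope:.8f}  P(1-P)={p * (1.0 - p):.8f}")
```

The two columns should agree to many decimal places. With a base $a$ other than $e$, the same computation would pick up an extra factor of $\ln(a)$ on the right-hand side of (1).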

**Bumbble Comm** · Accepted Answer

Choosing a different base $a > 1$ would just squash or stretch the graph of the function uniformly in the horizontal direction, since $$ a^x = e^{x\cdot \ln(a)}. $$
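As a rough illustration of the identity above, the sketch below compares a sigmoid built on an arbitrary base with the usual base-$e$ sigmoid evaluated at a rescaled input. The function `sigmoid_base` is a hypothetical generalisation introduced only for this example.

```python
import math


def sigmoid(y: float) -> float:
    """Standard sigmoid with base e."""
    return 1.0 / (1.0 + math.exp(-y))


def sigmoid_base(y: float, a: float) -> float:
    """Hypothetical variant 1 / (1 + a^(-y)) for a base a > 0."""
    return 1.0 / (1.0 + a ** (-y))


# Changing the base only rescales the input by ln(a).
for a in (2.0, math.pi, 10.0):
    for y in (-2.0, 0.5, 3.0):
        same = math.isclose(sigmoid_base(y, a), sigmoid(y * math.log(a)), rel_tol=1e-9)
        print(f"a={a:.4f}  y={y:+.1f}  matches rescaled base-e sigmoid: {same}")
```

A base larger than $e$ gives $\ln(a) > 1$, so the curve is compressed horizontally and looks steeper, which is consistent with the graphs linked in the question's update.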

The exponential function with base $e$ is widely considered to be the simplest exponential function. It has nice properties that no other base has, mainly:

- The function $e^x$ is its own derivative.
- It has a particularly simple power series expansion: $$ e^x = 1 + x + \frac12 x^2 + \frac16 x^3 + \cdots + \frac1{n!}x^n + \cdots $$ All of the coefficients are rational numbers. If the base had been something intuitively "nicer" than $e$, such as an integer, the coefficients would need to be irrational.
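The point about coefficients can be made concrete with a short sketch. It assumes the standard expansion $a^x = \sum_n \frac{(\ln a)^n}{n!} x^n$, which follows from $a^x = e^{x \ln(a)}$; the function names are illustrative only.

```python
import math
from fractions import Fraction


def exp_series_coefficients(n_terms: int) -> list:
    """Coefficients 1/n! of the series for e^x, kept as exact rationals."""
    return [Fraction(1, math.factorial(n)) for n in range(n_terms)]


def base_a_series_coefficients(a: float, n_terms: int) -> list:
    """Coefficients (ln a)^n / n! of the series for a^x (floats, since ln a is irrational for integer a >= 2)."""
    return [math.log(a) ** n / math.factorial(n) for n in range(n_terms)]


def eval_series(coeffs, x: float) -> float:
    """Evaluate a truncated power series at x."""
    return sum(float(c) * x ** n for n, c in enumerate(coeffs))


print(exp_series_coefficients(5))          # exact fractions 1, 1, 1/2, 1/6, 1/24
print(base_a_series_coefficients(2.0, 5))  # powers of ln(2) divided by n!
print(eval_series(exp_series_coefficients(20), 1.0), math.e)
```

For base $e$ the coefficients can be stored exactly as fractions, while for base 2 every coefficient past the first involves a power of $\ln 2$.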

For these reasons, most mathematicians will pick $e^x$ when they need an exponential function and have no particular reason to pick one base over another (with the exception of computer scientists and information theorists, who sometimes prefer $2^x$).

In the case of the logistic function in particular, choosing $e$ as the base means that for large negative $y$ we have $P(y)\approx e^y$ and so the derivative of $P(y)$ is very close to $P(y)$ itself. This makes it simple to contrast logistic growth with unbounded exponential growth $y\mapsto a\cdot e^y$.
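A small numerical sketch of this approximation follows. It uses the algebraic fact that $P(y)/e^y = 1/(1+e^{y})$ and, via formula (1), that $P'(y)/P(y) = 1 - P(y)$; the helper names are made up for this example.

```python
import math


def sigmoid(y: float) -> float:
    """Logistic sigmoid P(y) = 1 / (1 + e^(-y))."""
    return 1.0 / (1.0 + math.exp(-y))


def ratio_to_exponential(y: float) -> float:
    """P(y) / e^y, which should approach 1 as y becomes very negative."""
    return sigmoid(y) / math.exp(y)


def derivative_to_value_ratio(y: float) -> float:
    """P'(y) / P(y), computed as 1 - P(y) using formula (1)."""
    return 1.0 - sigmoid(y)


for y in (-2.0, -5.0, -10.0, -20.0):
    print(f"y={y:+.0f}  P/e^y={ratio_to_exponential(y):.8f}  P'/P={derivative_to_value_ratio(y):.8f}")
```

Both ratios move toward 1 as $y$ decreases, which is the sense in which the logistic curve initially tracks plain exponential growth.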
