I know this has already been asked, but I'm just curious why the sigmoid function
P(y) = 1 / ( 1 + e ^ -y )
uses e and not, for instance, pi or 1. Does it have a better shape (or some special characteristics) with e than with another constant, or did the first person who found the formula just decide arbitrarily to use e?
Update
Thanks for all your answers. It makes sense to choose e then. I also made some graphs for anyone interested in how the function looks with other numbers: http://lingtalfi.com/img/math/sigmoid.png
There we can clearly see that the greater the number, the more "binary" the curve looks.
Choosing a different base would just squash the graph of the function uniformly in the horizontal direction, since $$ a^x = e^{x\cdot \ln(a)}. $$
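As a quick numerical check of this identity, here is a minimal Python sketch. The helper names `sigmoid_base` and `sigmoid` are made up for illustration and are not from the question; the point is only that a sigmoid built on base $a$ should equal the usual base-$e$ sigmoid evaluated at $y\cdot\ln(a)$.

```python
import math

def sigmoid_base(y, a):
    """Logistic curve built with an arbitrary base a (assumes a > 0, a != 1)."""
    return 1.0 / (1.0 + a ** (-y))

def sigmoid(y):
    """Standard logistic function with base e."""
    return 1.0 / (1.0 + math.exp(-y))

# Since a**(-y) == exp(-y * ln(a)), changing the base only rescales the input.
for a in (2.0, math.pi, 10.0):
    for y in (-3.0, -0.5, 0.0, 1.0, 2.5):
        lhs = sigmoid_base(y, a)
        rhs = sigmoid(y * math.log(a))
        assert math.isclose(lhs, rhs, rel_tol=1e-12)
```

A larger base means a larger factor $\ln(a)$, so the curve is compressed horizontally and looks steeper, which matches the "more binary" appearance in the graphs from the question.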
The exponential function with base $e$ is widely considered to be the simplest exponential function. It has nice properties that no other base has, mainly:
For this reason, most mathematicians will pick $e^x$ when they need an exponential function and have no particular reason to pick one base over another. The exception is computer scientists and information theorists, who sometimes prefer $2^x$.
In the case of the logistic function in particular, choosing $e$ as the base means that for large negative $y$ we have $P(y)\approx e^y$, and so the derivative of $P(y)$ is very close to $P(y)$ itself. This makes it simple to contrast logistic growth with unbounded exponential growth $y\mapsto a\cdot e^y$.
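To see this approximation numerically, here is a small Python sketch. It relies on the standard identity $P'(y) = P(y)\,(1 - P(y))$, which is not stated above but follows from differentiating $P$; the function names `P` and `dP` are just illustrative.

```python
import math

def P(y):
    """Logistic function with base e."""
    return 1.0 / (1.0 + math.exp(-y))

def dP(y):
    """Derivative of P, using the identity P'(y) = P(y) * (1 - P(y))."""
    p = P(y)
    return p * (1.0 - p)

# For large negative y, both ratios should approach 1:
#   P(y) / e^y   (P behaves like the plain exponential)
#   P'(y) / P(y) (P is nearly its own derivative)
for y in (-5.0, -10.0, -20.0):
    print(y, P(y) / math.exp(y), dP(y) / P(y))
```

Both ratios equal $1 - P(y)$ up to rounding, so they move closer to 1 as $y$ becomes more negative.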