Why is $(1-\frac{1}{m})^{kn} \approx e^{-kn/m}$? I mean, when do we know when $e$ should be raised to a negative exponent? and how do we even know it's e?
Little background info: We have a bit vector of size $m$, $n$ elements, $k$ hash functions, and $\frac{1}{m}$ is the probability that a specific bit is flipped to $1$ (assuming independence at perfect randomness). The above equation is for a Bloom filter. It tells us the probability that a specific bit in the bit vector (size=$m$), after inserting $n$ element with $k$ hash functions, is still $0$. It makes sense, that it should approach $0$ and that it wouldn't make sense if it was raised to a non-negative power, but I still need to know how mathematicians come up with such conclusions? And how to they even know it is e?
SOURCE: scribe notes from CS 6550 – Design and Analysis of Algorithms, Dana Randall on Bloom Filters.
Let m be a positive integer having nothing to do with your question. By binomial expansion, $$(1-\dfrac1m)^m=1-m(\dfrac1m)+\dfrac{m(m-1)}2(\dfrac1m)^2-\dfrac{m(m-1)(m-2)}6(\dfrac1m)^3+...+(-1)^m\dfrac{m!}{m!}(\dfrac1m)^m$$ Each term may written as $(-1)^k\dfrac{m(m-1)\cdots(m-k+1)}{k!}(\dfrac1m)^k$, so if we let $m\to\infty$, the expression becomes $$1-1+\dfrac12-\dfrac16+...+(-1)^m\dfrac1{m!}$$
Let the j-th term be $t_j$, then $|t_j|$ decreases monotinally and $\lim_{j\to\infty}t_j=0$. Therefore, by the altering series test, $1-1+\dfrac12-\dfrac16+...+(-1)^m\dfrac1{m!}$ converges. In fact, the series formed by $|t_j|$ is $$(1+\dfrac1m)^m=1+m(\dfrac1m)+\dfrac{m(m-1)}2(\dfrac1m)^2+\dfrac{m(m-1)(m-2)}6(\dfrac1m)^3+...+\dfrac{m!}{m!}(\dfrac1m)^m,$$ letting $m\to\infty$, this expression tends to $e$. Notice that $$(1-\dfrac1m)^m(1+\dfrac1m)^m=(1-\dfrac1{m^2})^m=\sum_{j=0}^m(-1)^j\dfrac{m(m-1)\cdots(m-j+1)}{j!}(\dfrac1m)^{j}(\dfrac1m)^{j},$$ which tends to $1$ as $m\to\infty$. Hence $\lim_{t\to\infty}(1-\dfrac1m)^m=\dfrac1e$. By the definition of the limit, this means that for any real number $\varepsilon\gt 0$, there exist a positive integer $N$ such that $|(1-\dfrac1m)^m-\dfrac1e|\lt\varepsilon$ whenever $m\ge N.$ In other words, if $m$ is a sufficiently large (finite) positive integer, we have $(1-\dfrac1m)^m\approx\dfrac1e$. Now that $m$ is finite, we can associate it to your question by letting $m$ be the size of your vector (taken from the background info you provided for your question).
Therefore, $(1-\dfrac1m)^{kn}=((1-\dfrac1m)^m)^{kn/m}\approx (\dfrac1e)^{kn/m}=e^{-kn/m}.$