Consider an urn with $n$ balls. Draw a ball from the urn and put it back. Repeat this until you draw a ball that you have drawn before. Let $E(n)$ denote the expected number of draws that result from this process. I have convinced myself by writing a program that $E(n)\in\Theta(√n)$ however I would like a formal proof of this. Also I would like to know the constant coefficient $c$ such that $E(n) \sim c√n$. Please note that my question is related to this question here in that it considers the same underlying problem however my question is specifically interested in a proof of the growth rate as well as the coefficient.
Growth rate of the expected number of draws with replacement from an urn with $n$ balls needed before you draw the same ball twice
113 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail AtThere are 2 best solutions below
On
Perhaps the following should really be a comment, but it's getting a bit long for that.
The answer by bof in the linked-to question Expected number of draws until a repeat draw shows $E(n) = 1 + Q(n)$, where $$Q(n) = \sum_{1 \le k \le n} \frac{n!}{(n-k)! n^k}$$
$Q(n)$ is known in probability theory as the "Ramanujan Q function" or the "birthday function".
Theorem 4.8 in An Introduction to the Analysis of Algorithms, Second Edition by Robert Sedgewick and Philippe Flajolet shows $$Q(n) = \sqrt{\pi n /2} + O(1)$$ so $$E(n) = 1 + Q(n) = \sqrt{\pi n /2} + O(1)$$
EDIT
An alternative reference is The Art of Computer Programming, Second Edition, Volume 1: Fundamental Algorithms by Donald Knuth,section 1.2.11.3, which gives the more precise estimate $$Q(n) = \sqrt{\frac{\pi n}{2}} - \frac{1}{3} + \frac{1}{12} \sqrt{\frac{\pi}{2n}} - \frac{4}{135n} + \frac{1}{288}\sqrt{\frac{\pi}{2n^3}} + O(n^{-2}) $$
Even though the accepted answer is probably perfectly satisfying, let me just give some self-contained proof for future reference. I'm not sure how Flajolet and Sedgewick prove it, but I claim it is a pretty standard application of the Laplace method. First note that:
$$\sum_{k=0}^n \frac{n(n-1)\ldots(n-k+1)}{n^k}=\sum_{k=0}^n \binom{n}{k}\frac{1}{n^k}\int_0^\infty u^k e^{-u} du=\int_0^\infty (1+u/n)^n e^{-u}du$$
To get at the right scaling, let $t=u/\sqrt{n}$, so that:
$$\frac{E(n)}{\sqrt{n}}=\int_0^\infty \exp\left(n\log\left(1+\frac{t}{\sqrt{n}}\right)-t\sqrt{n}\right)dt$$
Now:
$$\int_0^{n^{1/100}} e^{-t^2/2} dt=\sqrt{\pi/2}+O(e^{-(n^{1/100})^2/2})$$
Hence $E(n)\sim \sqrt{n \pi/2}$