Here is the problem:
Suppose that there are $k$ people. Each of them independently picks a uniformly random number from the set $\{1, 2,...,n\}$. We say that a collision happens if there exist two people picking the same number. Let $k = \lceil n^\beta\rceil$, where $\beta$ is some constant that does not change with $n$. Prove that there is a constant $\beta_0\in(0,1)$ such that if $\beta>\beta_0$, then a collision happens with probability $1$ when $n\to\infty$; and if $\beta<\beta_0$, then a collision happens with probability $0$ when $n\to\infty$. Also find the value of $\beta_0$.
Here is my attempt:
The sample space is $\Omega=\{1,2,\ldots,n\}^k$. Let $A$ be the event that a collision happens. Then,
$$ \begin{align} \mathbf P(A)&=1-\frac{n(n-1)(n-2)\ldots(n-k+1)}{n^k}\\ &=1-\frac{n-1}n\frac{n-2}n\ldots\frac{n-k+1}n\\ &=1-\left(1-\frac1n\right)\left(1-\frac2n\right)\ldots\left(1-\frac{k-1}n\right)\\ &=1-\prod_{i=1}^{k-1}\left(1-\frac in\right)\\ &\ge1-\prod_{i=1}^{k-1}e^{-\frac in}\\ &=1-e^{-\sum_{i=1}^{k-1}\frac in}\\ &=1-e^{-\frac{k(k-1)}{2n}}\\ &=1-e^{-\frac{k^2}{2n}}e^{\frac k{2n}} \end{align} $$
We used the inequality $1-x\le e^{-x}$. Thus, we have obtained a lower bound of $\mathbf P(A)$. Let $k=n^\beta$. Then, $\mathbf P(A)\ge1-e^{-\frac{n^{2\beta}}{2n}}e^{\frac {n^\beta}{2n}}$. For any $\beta\in(0,1)$, we have $\lim_{n\to\infty}e^\frac {n^\beta}{2n}=1$. If $\beta>\frac12$, then $\lim_{n\to\infty}e^{-\frac{n^{2\beta}}{2n}}=0$, so the lower bound goes to $1$, and $\lim_{n\to\infty}\mathbf P(A)=1$. If $\beta<\frac12$, then $\lim_{n\to\infty}e^{-\frac{n^{2\beta}}{2n}}=1$, so $\lim_{n\to\infty}\mathbf P(A)\ge0$.
I guess that I have to find a upper bound for the probability which goes to zero if $\beta<\frac12$, but I have been unable to find it. Here are my questions:
- Does such an upper bound exist? If yes, how can I find it?
- For simplicity, I used $k=n^\beta$, which is not valid because $k$ must be a positive integer. What happens if I use $k=\lceil n^\beta\rceil$?
- When taking the limit, do we have to rely on the continuity property of probabilities? We cannot define a sequence of events $A_1,A_2,\ldots,A_n$ on the same probability space (although we could if it were $A_1,A_2,\ldots,A_k$).
An upper bound can be obtained using the inequality $1-x\geq 4^{-x}$ for $x\in[0,1/2]$ (an easy way to see this holds is that $4^{-x}$ is convex and equality holds for $x=0, 1/2$). The relevant part of the computation then looks like
$$ \begin{align} \mathbf P(A)&\le1-\prod_{i=1}^{k-1}4^{-\frac in}\\ &=1-4^{-\sum_{i=1}^{k-1}\frac in}\\ &=1-4^{-\frac{k(k-1)}{2n}}\\ &=1-4^{-\frac{k^2}{2n}}4^{\frac k{2n}}. \end{align} $$ The upper bound goes to 0 if $\beta<1/2$.
For the upper and lower bounds to give the desired conclusion, all we need is that as $n\to\infty$, we have $\frac{k^2}{2n}\to 0$ if $\beta<1/2$ and $\frac{k^2}{2n}\to \infty$ if $\beta>1/2$. These limits still hold for $k=\lceil n^{\beta}\rceil$.