Let $D$ be the amount of distinct numbers in a sequence of $k$ random numbers in range $[1,\ldots,n]$ (n>k). I want to show that: $D=\Omega(k)$ with exponential high probability. I'm interested in the case where $k/n \rightarrow 0$, e.g., $k=\sqrt{n}$.
I tried the following approach:
$D$ is the amount of nonempty bins when throwing $k$ balls into $n$ bins. $Y$ is the amount of empty bins when throwing $k$ balls into $n$ bins. So, $D=n-Y$.
Now, using Chernoff bound and the fact that $\text{E}[Y]=ne^{\frac{-k}{n}}$:
$$\begin{align}\Pr(Y>(1+\delta)\text{E}[Y])\le 2^{-\delta \text{E}[Y]}\\ \Pr(D<n-(1+\delta)\text{E}[Y])\le 2^{-\delta \text{E}[Y]}\end{align}$$
where $\delta>2e-1$.
I want to find $\Pr(D<\beta k)$, but to do so I have to take $\delta=\frac{n-\beta k-\text{E}[Y]}{\text{E}[Y]}$ which is negative since $\text{E}[Y]\rightarrow n$.
What I'm doing wrong, or there is some other way to show that $D=\Omega(k)$ with exponential high probability? Or maybe this statement is wrong?
Thanks!
Not quite...
Edit: One can control the distribution of $D$ as follows. Consider the event $[D\leqslant k-i]$ and choose the $k$ numbers sequentially. Then $D\leqslant k-i$ means that at least $i$ times, one chooses an already chosen number. This happens each time with probability $\leqslant\frac{k}n$ and there are $\binom{k}i\leqslant2^k$ ways of choosing these $i$ moments, hence $\mathbb P(D\leqslant k-i)\leqslant2^k\left(\frac{k}n\right)^i$ $(\ast)$.
For example, $\mathbb P(D\leqslant\frac12k)\leqslant\mathrm e^{-\Lambda(n,k)}$ with $\Lambda(n,k)=-\frac12k\log\left(\frac{k}n\right)-k\log2$. One sees that, if $1\ll k\ll n$, then $\Lambda(n,k)\gg k$ hence the control on $\mathbb P(D\leqslant\frac12k)$ is at least exponential in $k$.
Edit 2: To see that $(\ast)$ holds, for every subset $I$ of $\{1,2,\ldots,k\}$ of size $i$, call $A_I$ the event that at every time in $I$ one chooses an already chosen number and at every time not in $I$ but earlier than some element in $I$ one chooses a new number (at every time not in $I$ later than the whole set $I$, the choice is free). Then the events $A_I$ are disjoint and their union over every $I$ of size $i$ included in $\{1,2,\ldots,k\}$ is $[D\leqslant k-i]$. There are exactly ${k\choose i}\leqslant2^k$ such sets $I$, and for every such set $I$, $\mathbb P(A_I)\leqslant\left(\frac{k}n\right)^i$. The upper bound of $\mathbb P(D\leqslant k-i)$ follows.