I am familiar with the Negative Binomial distribution $NB(p, k)$, which gives the number of failures before $k$ successes occur in a Bernoulli process with parameter $p$. I am wondering, however, whether the distribution of the number of failures before obtaining $k$ *distinct* successes is also a well-known distribution.
More precisely, suppose we have some set of $N$ elements; without loss of generality, we may assume this set is $[N] = \{1, \cdots, N\}$. We have a subset $S \subseteq [N]$ which contains the successes, while every element of $[N] \setminus S$ is a failure. In our process, we select an element uniformly at random from $[N]$ (with replacement), and keep doing this until we have attained $k \leq |S|$ distinct successes. Let $X$ be the total number of trials performed before we stop the process. What is the distribution of $X$, in terms of $k$, $N$, and $p = |S| / N$? If we can't concisely describe this distribution, can we at least come up with some (relatively tight) tail bounds? And how valid is the approximation $X \approx NB(p, k)$?
I know this problem reduces to the coupon collector's problem when $S = [N]$ and $k = N$, but I'm more interested in the case where $0 < p < 1$ and $k \ll |S| = pN$.
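For concreteness, the process can be simulated directly (function and variable names here are my own, not standard). Note that after $i$ distinct successes have been collected, each draw yields a new success with probability $(|S|-i)/N$, so the expected number of trials is $\sum_{i=0}^{k-1} N/(|S|-i)$; the sketch below checks the empirical mean against that value.

```python
import random

def sample_until_k_distinct(N, S, k, rng=random):
    """Draw uniformly from {0, ..., N-1} with replacement until
    k distinct elements of S have appeared; return the trial count."""
    seen = set()
    trials = 0
    while len(seen) < k:
        trials += 1
        x = rng.randrange(N)
        if x in S:
            seen.add(x)  # duplicates of earlier successes are not counted again
    return trials

if __name__ == "__main__":
    rng = random.Random(0)
    N, k = 100, 5
    S = set(range(50))          # p = 0.5
    runs = [sample_until_k_distinct(N, S, k, rng) for _ in range(20000)]
    emp_mean = sum(runs) / len(runs)
    theo_mean = sum(N / (len(S) - i) for i in range(k))
    print(emp_mean, theo_mean)  # the two should agree closely
```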
$X$ can be written as a sum of $k$ independent geometric variables, $X=\sum_{i=0}^{k-1} G_i$, where $G_i$ is geometric (counting trials, with support $1, 2, \ldots$) with success probability $\frac{|S|-i}{N}$: once $i$ distinct successes have been collected, each draw produces a new one with that probability. This representation makes it easy to calculate the mean, variance and moment generating function of $X$. In particular, for $k$ large, $X$ is quite concentrated near its mean; if you subtract the mean and normalize by the standard deviation, it is asymptotically Gaussian as $k \to \infty$. If $k=o(N)$ as $N \to \infty$ (with $p$ fixed), then the law of $X$ is well approximated by a negative binomial law, since all the success probabilities $\frac{|S|-i}{N}$ stay close to $p$.
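As a quick numerical check on the geometric-sum representation, here is a small helper (the name is mine, not standard) that computes the exact mean and variance of $X$ and compares the mean with the trials-counting negative binomial value $k/p$; the failure-counting $NB(p,k)$ of the question differs from this only by the additive shift $k$. For $k \ll |S|$ the two means are close.

```python
def mean_var_X(N, s, k):
    """Exact mean and variance of X, the number of trials, via the
    sum of independent geometrics G_i with success probability
    p_i = (s - i) / N, where s = |S| (each G_i has support 1, 2, ...)."""
    ps = [(s - i) / N for i in range(k)]
    mean = sum(1 / p for p in ps)                 # E[G_i] = 1 / p_i
    var = sum((1 - p) / p ** 2 for p in ps)       # Var[G_i] = (1 - p_i) / p_i^2
    return mean, var

if __name__ == "__main__":
    N, s, k = 1000, 100, 5    # p = 0.1, k much smaller than |S|
    m, v = mean_var_X(N, s, k)
    nb_mean = k / (s / N)     # trials-counting negative binomial mean, k / p
    print(m, v, nb_mean)      # m and nb_mean differ by only a few percent here
```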