Classical Occupancy Problem. An urn contains $n$ distinct labeled balls. $k$ of them are selected uniformly at random with replacement. What is the probability that the sample contains at least one ball of each kind?
Let $(M_1,\ldots, M_n)$ be the vector of counts of each kind in the sample. The vector $(M_1,\ldots, M_n)$ follows the multinomial distribution $\operatorname{Mult}\left(k, \left(\frac{1}{n},\ldots, \frac{1}{n}\right)\right)$, and the probability in question is $$ p_{k,n} = \Pr(M_1 > 0, \ldots, M_n >0) \tag{1} $$ It can be computed explicitly either using the inclusion-exclusion principle, or by building a recurrence relation for $p_{k,n}$: $$ p_{k,n} = \sum_{m=0}^n (-1)^m \binom{n}{m} \left(1-\frac{m}{n}\right)^k = \frac{n!}{n^k} \mathcal{S}_2\left(k, n\right) \tag{2} $$ where $\mathcal{S}_2\left(k, n\right)$ denotes the Stirling number of the second kind.
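As a sanity check, both closed forms in $(2)$ can be evaluated directly; a minimal Python sketch (function names are mine):

```python
from math import comb, factorial

def p_exact(k, n):
    """Occupancy probability via the inclusion-exclusion sum in (2)."""
    return sum((-1)**m * comb(n, m) * (1 - m/n)**k for m in range(n + 1))

def stirling2(k, n):
    """Stirling number of the second kind, via its explicit sum."""
    return sum((-1)**m * comb(n, m) * (n - m)**k for m in range(n + 1)) // factorial(n)

def p_stirling(k, n):
    """Occupancy probability via the Stirling-number form in (2)."""
    return factorial(n) * stirling2(k, n) / n**k
```

For instance, with $n=2$ kinds and $k=4$ draws, both forms give $1 - 2\,(1/2)^4 = 7/8$.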
See Brian Gladman's notebook from mersenneforum.org or section 5.6 of *Problems and Snapshots from the World of Probability* for more details.
For $k \gg n$, the probability $p_{k,n}$ is very close to 1 (see the DLMF for the asymptotic behavior of the Stirling number). I am interested in obtaining a more precise asymptotic expansion, perhaps by using the CLT to evaluate $(1)$.
References or explicit solutions are appreciated.
First: the problem is more usually formulated as throwing $k$ balls into $n$ urns and computing the probability that no urn is empty.
An asymptotic approximation can be obtained by a Poissonization argument: for large $n,k$ the process is asymptotically equivalent to drawing $n$ iid Poisson variables with mean $\lambda = k/n$ [*]. The probability that all are strictly positive is, in this approximation:
$$P(n,k) \approx (1- e^{-\lambda})^n =\left(1-\exp\left(-\frac{k}{n}\right)\right)^n$$
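A quick numerical comparison of this approximation against the exact inclusion-exclusion formula (a small Python sketch; function names are mine):

```python
from math import comb, exp

def p_exact(k, n):
    """Exact occupancy probability via inclusion-exclusion."""
    return sum((-1)**m * comb(n, m) * (1 - m/n)**k for m in range(n + 1))

def p_poisson(k, n):
    """First-order Poissonization approximation (1 - e^{-lambda})^n."""
    lam = k / n
    return (1 - exp(-lam))**n

# e.g. k = 100 balls into n = 10 urns: both values are close to 1,
# and the approximation agrees with the exact value to a few decimals
print(p_exact(100, 10), p_poisson(100, 10))
```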
[*] More precisely: if $X=(x_1, x_2, \ldots, x_n)$ is a symmetric multinomial with $\sum x_i=k$, and $Y=(y_1, y_2, \ldots, y_n)$ is iid Poisson with $E(y_i)=\lambda = k/n$, then $P(Y \mid \sum y_i = k ) = P(X)$.
Some values of $1-P$
The additional column (approx2) is a second-order approximation, obtained via a CLT-like correction to the above formula through Bayes' rule:
$$P(Y>0 \mid \textstyle\sum y_i=k)=\frac{P(\sum y_i=k\mid Y>0)}{P(\sum y_i=k)}\, P(Y>0)$$
where $Y>0$ is shorthand for $y_i>0$ for all $i$.
The first approximation takes the fraction to be one (by the law of large numbers, one expects the conditioning to be asymptotically irrelevant), while the second approximation computes the fraction using a zero-order CLT (normal) approximation, with a Poisson in the denominator and a zero-truncated Poisson in the numerator. Higher-order approximations could be obtained via Edgeworth expansions, but they would quickly become unwieldy, as higher moments of the truncated Poisson are involved.
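A sketch of how that correction can be computed, under the assumptions described above: both densities in the fraction are replaced by normal densities at the observed total $k$, using the zero-truncated Poisson mean $\mu = \lambda/(1-e^{-\lambda})$ and variance $\mu(1+\lambda-\mu)$ for the numerator, and the $\operatorname{Poisson}(n\lambda)$ variance $n\lambda$ for the denominator (all function names are mine, not from the original table):

```python
from math import comb, exp, sqrt, pi

def p_exact(k, n):
    """Exact occupancy probability via inclusion-exclusion."""
    return sum((-1)**m * comb(n, m) * (1 - m/n)**k for m in range(n + 1))

def p_approx2(k, n):
    """Second-order approximation: Poissonization times a CLT correction."""
    lam = k / n
    p0 = (1 - exp(-lam))**n           # first-order term P(Y > 0)
    mu = lam / (1 - exp(-lam))        # zero-truncated Poisson mean
    var = mu * (1 + lam - mu)         # zero-truncated Poisson variance
    # normal density of sum of n truncated Poissons, evaluated at k
    num = exp(-(k - n*mu)**2 / (2*n*var)) / sqrt(2*pi*n*var)
    # normal density of Poisson(n*lam) evaluated at its mean k
    den = 1 / sqrt(2*pi*n*lam)
    return p0 * num / den
```

On moderate values such as $k=100$, $n=10$, this correction brings the approximation noticeably closer to the exact value than the first-order formula alone.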