I am interested in approximating the negative-hypergeometric distribution. Imagine we have an urn containing $(n+m)$ balls, $n$ of which are considered successes and $m$ failures. We sample from the urn without replacement until we have drawn $r\leq m$ failures. Then, the number of successes $S$ follows a negative hypergeometric distribution with parameters $(n,m,r)$, i.e., it has probability mass function $$P(S = i) = \frac{{{i+r-1}\choose{i}}{{n+m-r-i}\choose{n-i}}}{n+m \choose n},\quad i=0,\ldots,n.$$ Simple caluclations show $\operatorname{\mathbb E} S = \frac{rn}{m+1}$ and $\operatorname{Var}S=\frac{rn(n+m+1)(m-r-1)}{(m+1)^2(m+2)}.$
I am aware that, if $n,m\to\infty$ such that $n/m \to p\in(0,1)$ and $r$ is fixed, then $S$ will converge in distribution to the negative binomial distribution. However, here I am interested in the case, where also $r\to\infty$.
I have read here that, if $n,m,r$ are all large and $\frac{rn}{m+1}\ll n+m$, then one can approximate the negative hypergeometric distribution by a normal distribution with mean and variance as above. In the link provided this is justified by invoking the classical CLT for iid random variables, which is problematic because $B$ is not the sum of iid random variables. The intuition however is that $B$ is under the above conditions "approximately" distributed like the sum of $r$ iid negative hypergeometric distributed random variables with parameters (n,m,1) and then one can rely on the classical CLT for $r\to\infty$.
How would a rigorous justification of the approximation look like?
A related unanswered question is Approximation to the negative-binomial and negative-hypergeometric distributions.