Convergence of Hypergeometric Distribution to Binomial

859 Views Asked by At

Assuming that we have a Hypergeometric distribution:

$$f(x | N, M, K) = \frac{{M \choose x}{N - M \choose K- x}}{{N \choose K}}$$

where $x \in \{0, 1, ..., M\}$ and assume $N, M \rightarrow \infty, \frac{M}{N} \rightarrow p$, then

$f(x | N, M, K) \rightarrow {K \choose x}p^{x}(1 - p)^{K - x}$

In short, as the population $N$ and targets $M$ grow, Hypergeometric converges to Binomial(K, p).

The book suggests solving this by utilising Stirling approximation. I have no question about that. By rewriting the binomial coefficients:

$$\frac{{M \choose x}{N - M \choose K- x}}{{N \choose K}} = \frac{M!(N-M)!K!(N-K)!}{x!(M-x)!(N-M-K+x)!(K-x)!N!} \sim \frac{\sqrt{2\pi}M^{M+1/2}e^{-M}...\sqrt{2\pi}(N-K)^{N-K+1/2}e^{K-N}}{\sqrt{2\pi}x^{x + 1/2}e^{-x}...\sqrt{2\pi}N^{N+1/2}e^{-N}}$$

Then using some algebra we take the limit of this, and arrive at the desired result.

BUT

Is it actually justified? I feel like the author (Casella & Berger, exercise 3.11) intends this to be "the way" to solve this task, hence the "hint" about Stirling formula.

But from googling, Stirling formula is a relative approximation! In short, it is $$\lim\limits_{N \rightarrow \infty}\frac{N!}{\sqrt{2\pi}N^{N + 1/2}e^{-N}} = 1$$

In fact, the factorial in the numerator and Stirling function might not "converge" at all in a traditional sense of this word: there is no $\epsilon$, s.t. $|N! - \text{Stirling}(N)| < \epsilon$ for all $N > \text{some number}$.

Thus, even though I establish the limit for the "Stirling"-representation of the Hypergeometric distribution, I actually cannot establish the transitive relation to the original formula in terms of factorials. Thus, I can find $\epsilon > 0$, s.t. "Stirling representation" is as close to Binomial as I want pointwise, I cannot find any $\epsilon_2$, s.t. Hypergeometric is arbitrarily close to its Stirling-representation pointwise.

Is my logic correct?

If it is, is there a way to show this differently? Because it would mean that the derivation of convergence is incorrect.

2

There are 2 best solutions below

3
On BEST ANSWER

Suppose you have the following relative approximations $\frac{f_1(n)}{g_1(n)} \to 1$ and $\frac{f_2(n)}{g_2(n)} \to 1$.

Then you can use this to help compute the limit of $f_1/f_2$ via $$\lim_{n \to \infty} \frac{f_1(n)}{f_2(n)} = \left(\lim_{n \to \infty} \frac{f_1(n)}{f_2(n)}\right) \frac{\lim_{n \to \infty} g_1(n)/f_1(n)}{\lim_{n \to \infty} g_2(n)/f_2(n)} = \lim_{n \to \infty} \frac{f_1(n)}{f_2(n)} \frac{g_1(n)/f_1(n)}{g_2(n)/f_2(n)} = \lim_{n \to \infty} \frac{g_1(n)}{g_2(n)}.$$

This is what is being hidden in the "$\sim$" step where the factorials are replaced using Stirling's approximation.


Response to comment:

My answer is not talking about $\frac{\text{Hypergeometric PMF}}{\text{Binomial PMF}} \to 1$. You are correct that the desired result is $\text{Hypergeometric PMF} \to \text{Binomial PMF}$.

My answer is talking about why you can replace the factorials in $\frac{M! (N-M)! \cdots}{x! \cdots}$ individually with Stirling's approximation (the "$\sim$" step that your question is asking about).

Specifically, the authors really are doing \begin{align} \lim_{M,N \to \infty} \frac{{M \choose x}{N - M \choose K- x}}{{N \choose K}} &= \lim_{M,N \to \infty} \frac{M!(N-M)!K!(N-K)!}{x!(M-x)!(N-M-K+x)!(K-x)!N!} \\ &= \lim_{M,N \to \infty} \frac{\sqrt{2\pi}M^{M+1/2}e^{-M}...\sqrt{2\pi}(N-K)^{N-K+1/2}e^{K-N}}{\sqrt{2\pi}x^{x + 1/2}e^{-x}...\sqrt{2\pi}N^{N+1/2}e^{-N}} \end{align} and the my answer above is justifying the last equality. Dropping the limits and using "$\sim$" is a common shorthand.

0
On

You ask for a way to show this differently.

I leave as an exercise for the reader the fussy correction for $M = [nP-1/2,nP+1/2) \cap \Bbb{Z}$, that is letting $M$ always be the integer closest to $pN$ as $N$ varies. (Alternatively, replace all the factorials with Gammas and let $M$ and $N$ range smoothly through real numbers instead of through integers.) Here, we proceed by approximating $M$ with $nP$ immediately.

You want \begin{align*} L &= \lim_{N \rightarrow \infty} \frac{(pN)!(N-pN)!K!(N-K)!}{x! (pN - x)! (K-x)!(N-pN-K+x)!N! } \\ &= \frac{K!}{x! (K-x)!} \cdot \lim_{N \rightarrow \infty} \frac{(N-K)!}{K!} \cdot \frac{(pN)!}{(pN-x)!} \\ &\qquad \cdot \frac{(N-pN)!}{(N-pN-K+x)!} \text{.} \end{align*}

This really is just an application of "factor out the big part". There are three cases. I show the $x > K$ case. The $x = K$ case is similar, but a little easier since we have $\frac{1}{N^0(1-p)^0}$ in the second step. The $x < K$ case is analogous to the one shown. So, suppose $x > K$ ...

\begin{align*} L &= \binom{K}{x} \lim_{N \rightarrow \infty} \frac{1}{N(N-1)(N-2) \cdots (N-K+1)} \\ &\qquad \cdot \frac{(pN)(pN-1) \cdots (pN-x+1)}{1} \\ &\qquad \cdot \frac{1}{(N-pN-K+x) \cdots (N-pN+1)} \\ &= \binom{K}{x} \lim_{N \rightarrow \infty} \frac{1}{N^K}\frac{1}{1 \left( 1-\frac{1}{N} \right) \left( 1-\frac{2}{N} \right) \cdots \left( 1- \frac{K-1}{N} \right)} \\ &\qquad \cdot p^xN^x \frac{(1) \left( 1- \frac{1}{pN} \right) \cdots \left( 1-\frac{x-1}{pN} \right)}{1} \\ &\qquad \cdot \frac{1}{N^{x-K}(1-p)^{x-K}}\frac{1}{\left(1+ \frac{-K+x}{N-pN} \right) \cdots \left( 1+ \frac{1}{N-pN} \right)} \\ &= \binom{K}{x} p^x \frac{1}{(1-p)^{x-K}} \\ &\qquad \cdot \lim_{N \rightarrow \infty} \frac{1}{1 \left( 1-\frac{1}{N} \right) \left( 1-\frac{2}{N} \right) \cdots \left( 1- \frac{K-1}{N} \right)} \\ &\qquad \cdot \frac{(1) \left( 1- \frac{1}{pN} \right) \cdots \left( 1-\frac{x-1}{pN} \right)}{1} \\ &\qquad \cdot \frac{1}{\left(1+ \frac{-K+x}{N-pN} \right) \cdots \left( 1+ \frac{1}{N-pN} \right)} \\ &= \binom{K}{x} p^x (1-p)^{K-x} \\ &\qquad \cdot \lim_{N \rightarrow \infty} \frac{1}{1 \left( 1-\frac{1}{N} \right) \left( 1-\frac{2}{N} \right) \cdots \left( 1- \frac{K-1}{N} \right)} \\ &\qquad \cdot \lim_{N \rightarrow \infty} \frac{(1) \left( 1- \frac{1}{pN} \right) \cdots \left( 1-\frac{x-1}{pN} \right)}{1} \\ &\qquad \cdot \lim_{N \rightarrow \infty} \frac{1}{\left(1+ \frac{-K+x}{N-pN} \right) \cdots \left( 1+ \frac{1}{N-pN} \right)} \\ &= \binom{K}{x} p^x (1-p)^{K-x} \cdot 1 \cdot 1 \cdot 1 \text{.} \end{align*}

(For evaluating the limits involving products of terms of the form $\displaystyle \left( 1 \pm \frac{A(K,p,x)}{N} \right)$, multiply them out completely, obtaining $1 \pm ($ terms involving $K,p,x$ and various negative powers of $N)$. Of course as $N \rightarrow \infty$, the terms involving negative powers of $N$ go to $0$, leaving only the leading $1$.