Bounding ratio of joint and marginal pmfs for urn model

Setup

Suppose I have $N=mn$ balls in an urn, with $N_1$ red and $N_0$ black ($N=N_1+N_0$). Now suppose that I perform $m$ draws of $n$ balls ($n\ll N$), each time without replacement.

Let $Y_1,\dots,Y_m$ be random variables giving the number of red balls in each of the $m$ draws, and let $y_1,\dots,y_m$ be their respective observed values. The joint probability of observing $y_1,\dots,y_m$ is given by:

$$ P(Y_1=y_1,\dots,Y_m=y_m)=\frac{\binom{N_1}{y_1,\dots,y_m}\binom{N_0}{n-y_1,\dots,n-y_m}}{\binom{N}{\underbrace{n,\dots,n}_\text{$m$ times}}} $$

Compare this to the product of the marginal probabilities:

$$ \prod_{k=1}^m P(Y_k=y_k)=\prod_{k=1}^m\frac{\binom{N_1}{y_k}\binom{N_0}{n-y_k}}{\binom{N}{n}} $$
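
For small urns, the two expressions above can be evaluated exactly, which makes it easy to experiment with the ratio. The following is only a sketch: the helper names `multinomial`, `joint_pmf` and `marginal_product` are made up for illustration, and exact rational arithmetic is an assumed design choice to avoid rounding.

```python
from fractions import Fraction
from math import comb, factorial


def multinomial(total, parts):
    """Multinomial coefficient total! / (parts[0]! * parts[1]! * ...).

    Returns 0 when the parts are negative or do not sum to total.
    """
    if sum(parts) != total or any(p < 0 for p in parts):
        return 0
    result = factorial(total)
    for p in parts:
        result //= factorial(p)  # every partial quotient is an integer
    return result


def joint_pmf(ys, n, N1, N0):
    """Joint probability P(Y_1 = y_1, ..., Y_m = y_m) from the formula above."""
    m = len(ys)
    N = N1 + N0
    assert N == m * n, "the m draws of n balls must exhaust the urn"
    numerator = multinomial(N1, ys) * multinomial(N0, [n - y for y in ys])
    return Fraction(numerator, multinomial(N, [n] * m))


def marginal_product(ys, n, N1, N0):
    """Product of the hypergeometric marginals P(Y_k = y_k)."""
    N = N1 + N0
    product = Fraction(1)
    for y in ys:
        product *= Fraction(comb(N1, y) * comb(N0, n - y), comb(N, n))
    return product


# Tiny example: 3 red and 3 black balls, m = 3 draws of n = 2 balls each.
ys = [1, 1, 1]
ratio = joint_pmf(ys, 2, 3, 3) / marginal_product(ys, 2, 3, 3)
print(ratio)
```

The quantity printed at the end is the joint-to-marginals ratio studied in the rest of the question, here for one small configuration.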

Claim

Now I claim that for sufficiently large $N$ and some relationship between $n$ and $N$, the quantity

$$ \frac{P(Y_1=y_1,\dots,Y_m=y_m)}{\prod_{k=1}^m P(Y_k=y_k)} $$

can be made arbitrarily close to 1. When the number of draws $m$ is large and the total number of balls in the urn, $N$, is also very large, a single draw of $n\ll N$ balls is unlikely to affect the probability of getting a red ball in any other combination of $n$ draws. Hence, I claim that the joint can be approximated by the product of marginal probabilities in the asymptotic sense. Roughly speaking, the random variables $Y_1,\dots,Y_m$ are "asymptotically independent".

Question

Are there results in the literature on urn models that address this type of problem? I worked out the algebra and found that I can bound the ratio as:

$$ \frac{P(Y_1=y_1,\dots,Y_m=y_m)}{\prod_{k=1}^m P(Y_k=y_k)}\leq \left(\frac{N!}{(N-n)!}\right)^{m-1}\frac{N_1!N_0!}{(N-n)!} $$

but this bound is not tight: it goes to infinity as $n,m,N\to\infty$, for example when $n\approx N^{1/3}$ (note also that $m=N/n$). Is convergence of this ratio even possible?

Answer

Not exactly an answer / too long for a comment.

I don't know of any literature. However, based on your intuitive reasoning for the $n \ll N$ case, it would seem that if the ratio is $\approx 1$ at all, it should be when $n=1, m=N$, right? But it isn't. This casts doubt on your original claim that the ratio should be close to $1$.

For the $n=1$ case, each $y_k$ is $0$ or $1$, exactly $N_1$ of the $y_k$'s equal $1$ and the rest equal $0$. All ${N \choose N_1}$ such arrangements are equally likely, so the joint probability is

$$P(Y_1 = y_1, \dots, Y_N = y_N) = {1 \over {N \choose N_1}} = {N_1! ~ N_0! \over N!}$$

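Meanwhile each marginal is $P(Y_k = 1) = {N_1 \over N}$ and $P(Y_k = 0) = {N_0 \over N}$, so the product of the marginals is

$$\prod_{k=1}^N P(Y_k = y_k) = \left({N_1 \over N}\right)^{N_1} \times \left({N_0 \over N}\right)^{N_0} = {N_1^{N_1} ~ N_0^{N_0} \over N^N}$$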

Now Stirling's approximation is

$$N! \approx \sqrt{2 \pi N} ({N \over e})^N \iff {N! \over N^N} \approx {\sqrt{2 \pi N} \over e^N}$$

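Applying this to $N_1!$, $N_0!$ and $N!$, the exponential factors cancel because $N_1 + N_0 = N$, so your ratio becomes

$${P(Y_k = y_k \forall k) \over \prod P(Y_k = y_k)} \approx {\sqrt{2 \pi N_1} \sqrt{2 \pi N_0} \over \sqrt{2 \pi N}} = \sqrt{2 \pi {N_1 N_0 \over N}}$$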

which is not close to $1$. In fact, if $N \to \infty$ while the fraction of red balls $f_{red} = {N_1 \over N}$ is held constant, then ${N_1 N_0 \over N} = f_{red}(1 - f_{red}) N$, so the RHS grows like $\sqrt{N}$ and $\to \infty$ also.
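
A quick numerical check of the $n=1$ case is sketched below. It compares the exact ratio, computed on the log scale, with the Stirling estimate $\sqrt{2 \pi N_1 N_0 / N}$. The function names are hypothetical, and the choice $N_1 = N_0 = N/2$ in the loop is just an assumption for illustration.

```python
from math import exp, lgamma, log, pi, sqrt


def log_ratio_n1(N1, N0):
    """Log of joint / product-of-marginals for n = 1 (needs N1, N0 >= 1)."""
    N = N1 + N0
    # log( N1! N0! / N! ), using lgamma(k + 1) = log(k!)
    log_joint = lgamma(N1 + 1) + lgamma(N0 + 1) - lgamma(N + 1)
    # log( (N1/N)^N1 * (N0/N)^N0 )
    log_marginals = N1 * log(N1 / N) + N0 * log(N0 / N)
    return log_joint - log_marginals


def stirling_estimate(N1, N0):
    """The approximation sqrt(2 * pi * N1 * N0 / N) derived above."""
    return sqrt(2 * pi * N1 * N0 / (N1 + N0))


# Half the balls red (an arbitrary choice); the ratio keeps growing with N.
for N in (10, 100, 1000, 10000):
    N1 = N // 2
    N0 = N - N1
    print(N, exp(log_ratio_n1(N1, N0)), stirling_estimate(N1, N0))
```

Working with `lgamma` rather than raw factorials keeps the computation within floating-point range for large $N$.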


Further thoughts: you said

"When the number of draws $m$ is large and the total number of balls in the urn, $N$, is also very large, a single draw of $n \ll N$ balls is unlikely to affect the probability of getting a red ball in any other combination of $n$ draws."

This is of course true, but this isn't what the ratio is about. The ratio involves not just one draw, but all $m$ draws. So while one draw's effect on another is negligible (if $n \ll N$), that does not imply that the combined effect of all the draws on each other is still negligible.
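
To illustrate this distinction for $n=1$, the sketch below compares the ratio for just two draws (both red) with the ratio for all $N$ draws, using exact fractions. The function names and the particular urn sizes are assumptions chosen for illustration only.

```python
from fractions import Fraction
from math import comb


def pair_ratio_both_red(N1, N0):
    """P(Y_1 = 1, Y_2 = 1) / (P(Y_1 = 1) P(Y_2 = 1)) when n = 1."""
    N = N1 + N0
    joint = Fraction(N1 * (N1 - 1), N * (N - 1))
    return joint / Fraction(N1, N) ** 2


def full_ratio(N1, N0):
    """Joint over all N draws divided by the product of all N marginals (n = 1)."""
    N = N1 + N0
    joint = Fraction(1, comb(N, N1))
    marginals = Fraction(N1, N) ** N1 * Fraction(N0, N) ** N0
    return joint / marginals


# Equal numbers of red and black balls (an arbitrary choice).
for half in (5, 50, 500):
    print(2 * half,
          float(pair_ratio_both_red(half, half)),
          float(full_ratio(half, half)))
```

The first printed ratio approaches $1$ as the urn grows, while the second keeps increasing, which is the gap between "one draw barely affects another" and "the joint factorizes".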