Probability Distribution of Number of BW/WB Pairs picked from a sample of Black and White Balls

I have the following problem.

Assume I have $N$ boxes in a row and that $N_{W}$ of the boxes contain a white ball and $N_{B}$ boxes contain a black ball. Clearly $N_{W} + N_{B} = N$.

I do not know what colour ball is in each box.

I then pick $k$ unique pairs of these boxes. As the total number of unique pairs is $N(N-1)/2$ we have that $k$ is clearly bounded as $ 0 \leq k \leq N(N-1)/2$.

I am interested in working out the statistics of the number $X$ of opposing pairs of balls I draw from the boxes: a pair of balls being opposing if it is WB or BW.

I believe that the expectation value $E_{k}(X)$ is just $kp$, where $p = 1-\frac{N_{B}(N_{B}-1) + N_{W}(N_{W}-1)}{N(N-1)} = \frac{2N_{B}N_{W}}{N(N-1)}$ is the probability of a random pair being BW/WB. This follows from linearity of expectation: each of the $k$ pairs is opposing with probability $p$, whether or not the pairs are independent.
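As a quick sanity check on the formula for $p$, here is a minimal sketch that compares it against a brute-force count over all unordered pairs of boxes. The function names and the example values $N_{W} = 3$, $N_{B} = 2$ are my own choices for illustration, and it assumes $N \geq 2$.

```python
from fractions import Fraction
from itertools import combinations

def opposing_pair_probability(n_white, n_black):
    """Closed form from the question: p = 1 - [N_B(N_B-1) + N_W(N_W-1)] / [N(N-1)]."""
    n = n_white + n_black
    same_colour = n_black * (n_black - 1) + n_white * (n_white - 1)
    return 1 - Fraction(same_colour, n * (n - 1))

def opposing_pair_fraction(n_white, n_black):
    """Brute force: fraction of all unordered box pairs holding different colours."""
    boxes = ["W"] * n_white + ["B"] * n_black
    pairs = list(combinations(range(len(boxes)), 2))
    opposing = sum(boxes[i] != boxes[j] for i, j in pairs)
    return Fraction(opposing, len(pairs))

# Illustrative values only: 3 white and 2 black balls.
p_formula = opposing_pair_probability(3, 2)
p_brute = opposing_pair_fraction(3, 2)
print(p_formula, p_brute)  # prints "3/5 3/5"
```

Exact fractions are used so the two values can be compared without floating-point error.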

My first question is on the variance of $X$, i.e. ${\rm Var}_{k}(X) = E_{k}(X^{2})-E_{k}(X)^{2}$. For the first extreme, $k = 1$, it follows that ${\rm Var}_{1}(X) = p(1-p)$. For the second extreme, $k = N(N-1)/2$, it is clear that ${\rm Var}_{N(N-1)/2}(X) = 0$: we have selected all possible pairs, so we will always find that $kp$ of them are opposing and $k(1-p)$ are not.

What is the variance for general $k$? I would expect it to be monotonically decreasing for $k = 1, \dots, N(N-1)/2$, but I cannot work it out. I am also wondering whether it is actually a bit more complicated and depends on how many 'overlapping' vs 'non-overlapping' pairs you draw, where two pairs are overlapping if they share a box.

Secondly, what is the probability distribution $P_{k}(X)$, i.e. the probability of finding $X$ opposing pairs given $k$? Numerical calculations I have done (repeating the experiment a large number of times) for $N_{B} \approx N_{W}$ suggest that $P_{k}(X)$ is well described by a Gaussian for most $k$, but I cannot see why this should be so in general (outside of the two extremes).
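For reference, the numerical experiment can be sketched roughly as follows. This is a minimal sketch rather than my exact code, and it assumes the $k$ pairs are drawn uniformly at random, without replacement, from all $N(N-1)/2$ unordered pairs; under that assumption the order of the balls in the row does not matter. The function name, the trial count and the example parameters are illustrative only.

```python
import random
from itertools import combinations
from statistics import mean, pvariance

def simulate_opposing_counts(n_white, n_black, k, trials=20000, seed=0):
    """Return one sample of X (number of opposing pairs) per trial.

    Assumption: the k unique pairs are chosen uniformly without replacement
    from all unordered pairs of boxes.
    """
    rng = random.Random(seed)
    boxes = ["W"] * n_white + ["B"] * n_black
    all_pairs = list(combinations(range(len(boxes)), 2))
    counts = []
    for _ in range(trials):
        chosen = rng.sample(all_pairs, k)  # k distinct pairs
        counts.append(sum(boxes[i] != boxes[j] for i, j in chosen))
    return counts

# Illustrative parameters: N_W = N_B = 5, so there are 45 possible pairs.
counts = simulate_opposing_counts(n_white=5, n_black=5, k=10)
print("empirical mean:", mean(counts))
print("empirical variance:", pvariance(counts))
```

The empirical mean can be compared against $kp$, and a histogram of `counts` gives an estimate of $P_{k}(X)$ for a chosen $k$.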