Proving a combinatorial identity without double counting

148 Views Asked by At

Consider the following problem:

We have a collection of m white and n black balls. We pick k balls (km+n) without replacement. What is the expected number of white balls in the sample?

Counting method 1: The ratio of white balls in the collection is m/(m+n). If we repeatedly sample the collection, we expect that on average, our sample has the same ratio. Therefore, the expected number of white balls in a sample of size k is km/(m+n).

As a side note, I think the above argument somehow depends on the unbiasedness of 
the mean estimator. Am I right?

Counting method 2: We directly compute the expected number of white balls in the sample:

$$ \frac{\sum_{i=0}^{k} i\binom{m}{i}\binom{n}{k-i}}{\binom{m+n}{k}} \enspace.$$


So, we reach at the identity: $$ \sum_{i=0}^{k} i\binom{m}{i}\binom{n}{k-i} = \frac{km}{m+n}\binom{m+n}{k} \enspace.$$

How to prove the above result without double counting?

It seems to be provable using Vandermonde's identity as well as $i\binom{m}{i} = m\binom{m-1}{i-1}$. But is there a simpler approach (which does not depend on extra identities such as Vandermonde's)?

2

There are 2 best solutions below

1
On

If we are looking to evaluate

$$\sum_{q=0}^k q{m\choose q}{n\choose k-q} = \sum_{q=1}^k q{m\choose q}{n\choose k-q} = m \sum_{q=1}^k {m-1\choose q-1}{n\choose k-q}$$

there is a very simple proof using the integral representation

$${n\choose k-q} = \frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k-q+1}} (1+z)^n \; dz.$$

This is zero when $q\gt k$ so we may extend the sum to infinity to get

$$\frac{m}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} (1+z)^n \sum_{q\ge 1} {m-1\choose q-1} z^q \; dz \\ = \frac{m}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} (1+z)^n z (1+z)^{m-1} \; dz \\ = \frac{m}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k}} (1+z)^{m+n-1} \; dz.$$

This is

$$m{m+n-1\choose k-1} = \frac{km}{m+n} {m+n\choose k}.$$

3
On

For $i=1$ to $k$, define the indicator random variable $X_i$ by $X_i=1$ if the $i$-th ball picked is white, and by $X_i=0$ otherwise. Then the number $W$ of white balls picked is given by $$W=X_1+X_2+\cdots+X_k.$$ By the linearity of expectation, we have $$E(W)=E(X_1)+E(X_2)+\cdots+E(X_k).\tag{1}$$ For any $i$, the probability that the $i$-th ball picked is white is $\frac{m}{m+n}$. This is because any ball is just as likely as any other to be the obe picked $i$-th.

Thus $\Pr(X_i=1)=\frac{m}{n}$, and therefore $E(X_i)=\frac{m}{m+n}$. It follows from (1) that $E(W)=k\cdot \frac{m}{m+n}$.