Let X be the smallest value obtained when k numbers are randomly chosen from the set 1,...,n. Find E[X] by interpreting X as a negative hypergeometric random variable.
This is Self Test Exercise 7.7 of Sheldon's A First Course in Probability.
The way I approached this is to first consider (for example) P(X = 1). Using the suggestion to take X as a negative hypergeometric random variable, we have
$$P(X = 1) = \frac{\binom{1}{1}\binom{n - 1}{k - 1}}{\binom{n}{k}}$$
My idea was that we have one element 1 and the rest $k - 1$ elements from ... the remaining $n - 1$ elements. Similarly, for $P(X = 2)$, we have one way of getting the minimum element $2$, and $\binom{n - 2}{k - 1}$ ways of getting the other elements (well, except 1).
Similarly, we can deduce that
$$P(X = i) = \frac{\binom{n - i}{k - 1}}{\binom{n}{k}}$$
As a result, I got
$$E(X) = \sum_{i = 1}^n i \times \frac{\binom{n - i}{k - 1}}{\binom{n}{k}}$$
The answer in the book is $\frac{n + 1}{k + 1}$, which is just so much more elegant than what I came up with, and I'm not seeing how my answer reduces to theirs.
My questions are:
- Is my approach correct? If not, what mistake(s) did I make in my analysis?
- If my answer is correct, how do I get to the expected result?
An easier way is to use the complementary CDF: $$P(X > i)=1-F_X(x),$$ to compute the expectation as follows:
$$\mathbb E(X) = \sum_{i = 0}^{n-k}P(X > i),$$
which can be used for any non-negative random variable $X$; see here for more details. Indeed,
$$P(X >i) = \frac{\binom{n - i}{k}}{\binom{n}{k}}.$$
Hence,
$$\mathbb E(X) =\sum_{i = 0}^{n-k}\frac{\binom{n - i}{k}}{\binom{n}{k}}=\frac{\binom{n +1}{k+1}}{\binom{n}{k}}=\frac{n +1}{k+1}.$$
To compute the summation, I used the hockey-stick identity.
Direct method:
You could directly calculate the expectation using the PMF, $P(X= i)$:
$$\mathbb E(X) = \sum_{i = 1}^{n-k+1} i \times P(X = i)=\sum_{i = 1}^{n-k+1} i \times \frac{\binom{n - i}{k-1}}{\binom{n}{k}}.$$
To manage this, use the following
$$\sum_{i = 1}^{n-k+1} i \times \binom{n - i}{k-1}=(n+1)\sum_{i = 1}^{n-k+1} \binom{n - i}{k-1} - \sum_{i = 1}^{n-k+1}(n+1-i) \times \binom{n - i}{k-1}$$
and then use the hockey-stick identity for each summation, but after applying the following to the second:
$$(n+1-i) \times \binom{n - i}{k-1}=k \times \binom{n - i+1}{k}.$$