Sampling into two sets

100 Views Asked by At

Let $N$ be the set of integers $1,\cdots,n$ and let $A$ be a set of numbers sampled independently from $N$ such that each element of $N$ has probability $p=0.5$ to be selected. I am trying to answer these progressively more difficult questions.

(1) What is the expected value of the size of $A$?

  • This is easy - the size has a binomial distribution $B(n,p)$ and its expectation is $np = \frac{n}{2}$.

(2) What is the expected value of the smallest value in $A$?

  • We can look at the sampling process as a sequence of Bernoulli trials, starting at 1. The smallest number of $A$ is exactly the number of trials needed to get one success. It has a geometric distribution and its expectation is $\frac{1}{p} = 2$.

(3) Order the elements of $A$ in increasing order and remove the first (smallest) $r-1$ elements. What is the expected value of the smallest remaining element?

  • This is, I think, exactly the number of trials needed to get $r$ successes. If we remove $r$ we get the number of failures, which has a negative binomian distribution $NB(r,p)$ and its mean is $\frac{(1-p)r}{p} = r$. Add $r$ again and get that the answer is $2r$.

(4) Suppose we now independently sample another set $B$ in the same way. Suppose w.l.o.g. that $B$ is the smaller set. Order the elements of $A$ in an increasing order and remove the first $|A|-|B|$ elements (such that the sets now have the same size). What is the expected value of the smallest element remaining in $A$?

Here I am stuck because of two reasons:

  • I don't know how to calculate the expected value of $r=|A|-|B|$. Note that since we assume that $A$ is the larger set, $r$ is actually the absolute value of thje difference. This seems to be related to: Difference of two binomial random variables there are formulas for general cases, but how can I the expectation?
  • Even if I can calculate the expected value of $r$, is it correct to just use it in the formula of (3)?