Approximation of minimum among many binomials

299 Views Asked by At

We choose $k$ numbers independently from the binomial distribution $B(n,1/2)$, where we can think of $n$ as large. What is the expectation of the minimum of the $k$ numbers? Is there a good way to approximate?

(We can compute the exact probability that all $k$ numbers are below some constant $c$, but that involves a lot of binomial terms.)

2

There are 2 best solutions below

5
On BEST ANSWER

The following should at least partly answer your question.

Let $X_1,\ldots,X_k$ be random variables i.i.d according to $B(n,\frac{1}{2})$. Let $$Y_k=min\{X_1,\ldots,X_k\}$$ The following is valid:

\begin{align*} E(Y_k) &= \frac{1}{2^{nk}}\sum_{t=1}^{n}\left(\sum_{j=t}^{n}\binom{n}{j}\right)^k\tag{1}\\ \\ E(Y_k) &\sim \left(1-\frac{1}{2^n}\right)^k \qquad\qquad \text{for }k\text{ large}\tag{2}\\ \end{align*}

As far as I know there is no closed formula for (1) but we can show that the asymptotic behaviour for large $k$ is according to (2).

Step 1:

In order to show (1) we note that if $X \sim B(n,p)$ we get \begin{align*} P(X=j)=\binom{n}{j}\left(\frac{1}{2}\right)^j\left(\frac{1}{2}\right)^{n-j}=\binom{n}{j}\left(\frac{1}{2}\right)^{n}\tag{3} \end{align*}

Next we observe:

The minimum random variable $Y_k$:

  • $Y_k=0$ iff at least one of the $k$ random variables $X_1,\ldots,X_k$ is $0$, or equivalently not all $X_l\geq 1 \quad (1\leq l \leq k)$.

  • $Y_k=1$ iff all random variables $X_l \geq 1 $ and not all $X_l \geq 2$.

Proceeding this way, we get:

\begin{align*} P(Y_k=0)&=1-P(Y_k\geq 1) = 1-P(X_1\geq 1)^k\\ P(Y_k=1)&=P(X_1\geq 1)^k-P(X_1\geq 2)^k\\ P(Y_k=2)&=P(X_1\geq 2)^k-P(X_1\geq 3)^k\\ &\ldots\\ P(Y_k=n)&=P(X_1\geq n)^k \end{align*}

From these equations we can deduce the expectation $E(Y_k)$:

\begin{align*} E(Y_k)&=\sum_{t=0}^ntP(Y_k=t)\\ &=\sum_{t=1}^{n-1}t\left(P(X_1\geq t)^k-P(X_1\geq t+1)^k\right)+nP(X_1\geq n)^k\\ &=\sum_{t=1}^{n}tP(X_1\geq t)^k-\sum_{t=1}^{n-1}tP(X_1\geq t+1)^k\\ &=\sum_{t=1}^{n}tP(X_1\geq t)^k-\sum_{t=2}^{n}(t-1)P(X_1\geq t)^k\\ &=\sum_{t=1}^{n}P(X_1\geq t)^k\tag{4}\\ &=\sum_{t=1}^n\left(\sum_{j=t}^{n}\frac{1}{2^n}\binom{n}{j}\right)^k\\ &=\frac{1}{2^{nk}}\sum_{t=1}^n\left(\sum_{j=t}^{n}\binom{n}{j}\right)^k \end{align*}

which shows (1).

Step 2:

In order to show the asymptotic behaviour of $Y_k$ for large $k$, we first note that for $j\geq 2$:

$$0\leq P(X_1\geq j) \leq P(X_i\geq 2)$$

We also see according to (3):

\begin{align*} P(X_1 \geq t)&=\frac{1}{2^n}\sum_{j=t}^{n}\binom{n}{j}\\ P(X_1\geq 1)&=\frac{1}{2^n}\left(2^n-1\right)=1-\frac{1}{2^n}\\ P(X_1\geq 2)&=\frac{1}{2^n}\left(2^n-1-n\right)=1-\frac{1}{2^n}-\frac{n}{2^n}\tag{5}\\ &=\left(1-\frac{1}{2^n}\right)\left(1-\frac{n}{2^n-1}\right)\\ \end{align*}

Therefore we get according to (4) and (5)

\begin{align*} P(X_1\geq 1)^k\leq &\sum_{t=1}^{n}P(X_1\geq t)^k\leq P(X_1\geq 1)^k+(n-1)P(X_1 \geq 2)^k\\ \end{align*}

which can be written as

\begin{align*} \left(1-\frac{1}{2^n}\right)^k\leq E(Y_k)&\leq\left(1-\frac{1}{2^n}\right)^k+(n-1)\left(1-\frac{1}{2^n}\right)\left(1-\frac{n}{2^n-1}\right)^k\\ &\leq\left(1-\frac{1}{2^n}\right)^k\left(1+(n-1)\left(1-\frac{n}{2^n-1}\right)^k\right)\\ \end{align*}

It follows

\begin{align*} 1\leq \frac{E(Y_k)}{\left(1-\frac{1}{2^n}\right)^k}\leq1+(n-1)\left(1-\frac{n}{2^n-1}\right)^k\\ \end{align*}

We conclude

\begin{align*} \lim_{k\rightarrow\infty}\frac{E(Y_k)}{\left(1-\frac{1}{2^n}\right)^k}=1 \end{align*} and (2) follows.


Note:

  • (1) is just an adaptation of the answer from @AndréNicolas to this question and

  • The essence for (2) comes from @Did who answered this question

0
On

Just a hint:

Let $X\sim B(n,1/2)$.

  • Define $Y=-X$.
  • Write the cdf of $Y$. At this point the normal approximation can be used.
  • Apply the method described here for maximum of $k$ random trials of $Y$ to get the corresponding cdf.
  • Differentiate the cdf to get pdf for maximum of $k$ trials of $Y$.
  • Transform this pdf to pdf of minimum of $k$ trials of $X$.
  • Calculate the expectated value.