A problem about putting balls into bins and expected value - possibly wrong answer


Please consider the problem below. Is my answer correct? If it is not, where did I go wrong?
Problem:
You keep tossing balls into $n$ bins until one of the bins has two balls. For each toss there is a $\frac{1}{n}$ probability that the ball you toss lands in any one of the bins. What is the expected number of tosses?
Answer:
Let $p_i$ be the probability that after $i$ tosses we have at least one bin with two balls. \begin{eqnarray*} p_1 &=& 0 \\ p_2 &=& 1 - ( 1 - \frac{1}{n} ) = \frac{1}{n} \\ p_3 &=& 1 - (1 - \frac{1}{n})(1 - \frac{1}{n-1}) \\ p_3 &=& 1 - (\frac{n-1}{n})(\frac{n-1-1}{n-1}) \\ p_3 &=& 1 - (\frac{n-2}{n}) = \frac{2}{n} \\ p_4 &=& 1 - (1 - \frac{1}{n})(1 - \frac{1}{n-1})(1 - \frac{1}{n-2}) \\ p_4 &=& 1 - ( \frac{n-1}{n} )( \frac{n-2}{n-1} )( \frac{n - 2 -1}{n - 2} ) \\ p_4 &=& 1 - \frac{n-3}{n} = \frac{3}{n} \\ \end{eqnarray*}
Now for $1 \le i \le n$ we have $p_i = \frac{i-1}{n}$. \begin{eqnarray*} E &=& 2p_2 + 3p_3 + 4p_4 + \cdots + (n+1)p_{n+1} \\ E &=& \sum_{i = 1}^{n} \frac{i(i+1)}{n} = \frac{1}{n} \sum_{i=1}^{n} \left( i^2 + i \right) \\ E &=& \frac{1}{n}\left(\frac{n(n+1)(2n+1)}{6} + \frac{n(n+1)}{2} \right) \\ E &=& (n+1) \left( \frac{2n+1}{6} + \frac{1}{2} \right) = \frac{(n+1)(n+2)}{3} \\ \end{eqnarray*}

Here is an update to my answer:

Let $p_i$ be the probability that after $i$ tosses we have at least one bin with two balls. \begin{eqnarray*} p_1 &=& 0 \\ p_2 &=& 1 - ( 1 - \frac{1}{n} ) = \frac{1}{n} \\ p_3 &=& 1 - (\frac{n-1}{n})( \frac{n-2}{n}) \\ p_3 &=& 1 - \frac{(n-1)(n-2)}{n^2} = \frac{n^2 - (n^2 - 3n + 2)}{n^2} \\ p_3 &=& \frac{3n-2}{n^2} \\ p_4 &=& 1 - (\frac{n-1}{n})(\frac{n-2}{n})(\frac{n-3}{n}) \\ p_4 &=& 1 - \frac{(n^2-3n+2)(n-3)}{n^3}\\ p_4 &=& 1 - \frac{n^3-6n^2+11n - 6}{n^3}\\ p_4 &=& \frac{6n^2 - 11n + 6}{n^3}\\ \end{eqnarray*}

\begin{eqnarray*} E &=& 2p_2 + 3p_3 + 4p_4 + \cdots + (n+1)p_{n+1} \\ \end{eqnarray*} Now, am I on the right track? That is, is what I have so far correct?
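Whatever the final formula turns out to be, it can be sanity-checked by direct simulation (a minimal Python sketch; the function names are mine):

```python
import random

def tosses_until_collision(n):
    """Toss balls uniformly into n bins until some bin receives its
    second ball; return the number of tosses used."""
    occupied = set()
    tosses = 0
    while True:
        tosses += 1
        b = random.randrange(n)
        if b in occupied:
            return tosses
        occupied.add(b)

def estimate_expected_tosses(n, trials=100_000):
    """Monte Carlo estimate of the expected number of tosses."""
    return sum(tosses_until_collision(n) for _ in range(trials)) / trials
```

For $n=365$ this estimate comes out near $24.6$, the familiar birthday-problem value.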
Thanks,
Bob


There are 5 answers below.

BEST ANSWER

The number $n$ of bins is given, and fixed. Denote by $E(j)$ $(0\leq j\leq n)$ the expected number of additional tosses when there are $j$ empty bins and the game is not yet over. Then we have the following recursion: $$E(j)=1+{j\over n}E(j-1)\quad(n\geq j\geq 1),\qquad E(0)=1\ .$$ This gives $$\eqalign{ E(n)&=1+{n\over n}E(n-1)\cr &=1+{n\over n}\left(1+{n-1\over n}E(n-2)\right)\cr &=1+{n\over n}\left(1+{n-1\over n}\left(1+{n-2\over n}E(n-3)\right)\right)\cr &=\ldots\cr}$$ and leads to $$E(n)=1+\sum_{k=1}^n{n(n-1)\cdots\bigl(n-(k-1)\bigr)\over n^k}\ ,$$ as in Joseph Eck's answer.
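A quick numerical check of this recursion against the closed-form sum (a Python sketch; the function names are mine):

```python
def expected_recursive(n):
    """Unroll E(j) = 1 + (j/n) * E(j-1) with E(0) = 1; returns E(n)."""
    E = 1.0                        # E(0)
    for j in range(1, n + 1):
        E = 1.0 + (j / n) * E
    return E

def expected_closed_form(n):
    """E(n) = 1 + sum_{k=1}^{n} n(n-1)...(n-k+1) / n^k."""
    total, term = 1.0, 1.0
    for k in range(1, n + 1):
        term *= (n - k + 1) / n    # term = n(n-1)...(n-k+1) / n^k
        total += term
    return total
```

Both give, e.g., $E(365)\approx 24.62$ in the birthday setting.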

ANSWER

Use the pigeonhole principle.

If the number of tosses $n$ is at least $b+1$, then at least one bin must contain $2$ balls. If the number of tosses satisfies $0 < n \le b$, the probability that every ball goes into a different bin is $$\frac{b \times (b-1) \times \cdots \times (b-n+1)}{b^n} = \frac{b!}{(b-n)!\,b^n}$$

So the probability that some bin contains more than one ball after $n$ tosses is $$1-\frac{b!}{(b-n)!\,b^n}$$

So the expected number of tosses can be found with $$E(b) = 1+ \sum_{k=1}^{b} \frac{b!}{\left(b-k\right)!b^k}$$

Visualized, this looks like:

[plot of $E(b)$ omitted]
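With $b=365$ bins this is the classical birthday problem, and the no-collision formula above is easy to evaluate numerically (a Python sketch; `collision_prob` is my name for it):

```python
def collision_prob(n, b):
    """Probability that some bin holds at least two balls after
    n tosses into b bins: 1 - b(b-1)...(b-n+1) / b^n."""
    no_collision = 1.0
    for j in range(n):
        no_collision *= (b - j) / b
    return 1.0 - no_collision
```

With the familiar birthday numbers, `collision_prob(23, 365)` is just above $1/2$, while `collision_prob(22, 365)` is just below it.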

ANSWER

Let $T$ be the number of tosses it takes to get $2$ balls in one of the bins. If you are still tossing after $i$ tosses, it is because no bin has more than $1$ ball, so $i$ bins contain exactly $1$ ball each; hence $$ \Pr(T=i+1\mid T>i) = \frac{i}{n} .$$

Next, since $\{T=i+1\}\subseteq\{T>i\}$, the definition of conditional probability gives

$$ \Pr(T=i+1|T>i) = \frac{\Pr(T=i+1)}{\Pr(T>i)} ,$$

which implies that

$$ \Pr(T=i+1) = \Pr(T>i)\Pr(T=i+1|T>i) .$$

Since

$$ \Pr(T>i) = 1 - \sum_{k=1}^i\Pr(T=k) ,$$

and letting $p_i =\Pr(T=i)$, it follows that

$$p_2 = \frac{1}{n} \quad\text{and}\quad p_{i+1}=\frac{(n+1-i)\,i}{n\,(i-1)}\,p_{i} \quad\text{for}\; i\ge 2. $$

Then, $$\mathbb E[T] = \sum_{i=2}^{n+1} i\,p_i,$$ which I have not been able to evaluate in closed form, but I have written the following Matlab code to compute it:

N = 100;                       % largest number of bins considered
E = zeros(N,1);                % E(n) will hold the expected number of tosses
for n = 2:N
    p = zeros(n+1,1);          % p(i) = Pr(T = i)
    p(2) = 1/n;                % earliest possible stopping time is the 2nd toss
    for i = 2:n
        p(i+1) = p(i)*(n+1-i)*i/(n*(i-1));   % recurrence for Pr(T = i+1)
    end
    E(n) = p'*(1:n+1)';        % E[T] = sum_i i*Pr(T = i)
end
plot(2:N, E(2:N,1))

The code generates the following plot, with $n$ on the $x$-axis and $\mathbb E[T]$ on the $y$-axis:

[plot omitted]
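As a cross-check (in Python, mirroring the Matlab above), the recurrence can be compared with the direct formula $\Pr(T=k)=(k-1)\,n(n-1)\cdots(n-k+2)/n^{k}$, which one gets by requiring $k-1$ distinct bins followed by one repeat:

```python
def pmf_recurrence(n):
    """p[i] = Pr(T = i) via p_2 = 1/n and the recurrence above."""
    p = [0.0] * (n + 2)           # valid indices 0 .. n+1
    p[2] = 1.0 / n
    for i in range(2, n + 1):
        p[i + 1] = p[i] * (n + 1 - i) * i / (n * (i - 1))
    return p

def pmf_direct(n):
    """p[k] = Pr(T = k) = (k-1) * n(n-1)...(n-k+2) / n^k."""
    p = [0.0] * (n + 2)
    falling = 1.0                 # n(n-1)...(n-k+2) / n^(k-1)
    for k in range(2, n + 2):
        falling *= (n - k + 2) / n
        p[k] = (k - 1) * falling / n
    return p
```

The two agree term by term, sum to $1$, and give the same $\mathbb E[T]$.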

ANSWER

Let $p_i$ denote the number of ways the game can take exactly $i$ tosses. Clearly $2\le i\le n+1$.

To find $p_i$:

1. First count the number of ways to put $(i-1)$ balls into the bins so that no two go into the same bin:

Number of ways $= {}^{n}P_{i-1}$

2. Then put the next ball into one of the bins that already has a ball:

Number of ways $= i-1$

$\therefore p_i=(i-1)\,{}^{n}P_{i-1}$


Since a particular sequence of $i$ tosses occurs with probability $1/n^{i}$ (so sequences of different lengths are not equiprobable), weighting each count by its probability gives $$\Pr(T=i)=\frac{i-1}{n^{i}}\,{}^{n}P_{i-1}$$ and hence the expected number of tosses $$\sum^{n+1}_{i=2}\frac{i\,(i-1)}{n^{i}}\,{}^{n}P_{i-1}\ .$$
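Weighting the count for length $i$ by the probability $1/n^{i}$ of each such sequence turns it into $\Pr(T=i)$; a quick Python check (`math.perm` computes ${}^{n}P_{k}$, available since Python 3.8):

```python
from math import perm

def pmf_from_counts(n):
    """Pr(T = i) = (i - 1) * P(n, i-1) / n**i for 2 <= i <= n + 1."""
    return {i: (i - 1) * perm(n, i - 1) / n**i for i in range(2, n + 2)}
```

The values sum to $1$, and $\sum_i i\,\Pr(T=i)$ matches the recursion-based answers.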

ANSWER

Various answers have already been provided, but let me show a different approach.

At the first toss you have $p_1=0$, and you end up for sure with one one-ball bin.

At the second toss, either you hit the one-ball bin (prob. $p_2=1/n=(1-p_1)\cdot 1/n$) or you end up with two one-ball bins (prob. $q_2=1-p_2=(n-1)/n$).

At this point, pay attention to the fact that $$ \bbox[lightyellow] { \pi _{\,k} = {{k - 1} \over n}\quad \quad \mu _{\,k} = 1 - \pi _{\,k} = {{n - \left( {k - 1} \right)} \over n} }$$

are transition probabilities, i.e. conditional probabilities; precisely:
- $\pi _{\,k}$ is the prob. of having a "hit" at the $k$-th toss, given that you did not have any hit before;
- $\mu _{\,k}$ is the prob. of not having a "hit" at the $k$-th toss, given that you did not have any hit before.
while $p_k$ and $q_k$ are the overall probabilities that, starting with all bins empty,
- we have a hit at (exactly) the $k$-th toss ($p_k$);
- we have (exactly) $k$ one-ball bins at the $k$-th toss, i.e. no hit up to that toss ($q_k$);

Thus note that
- $\pi_k \ne p_k$ and $\mu_k \ne q_k$, even though they coincide for the first two tosses;
- $q_k \ne 1-p_k$: $1-p_k$ is the probability of a hit at some toss other than the $k$-th, while $1-q_k$ is the probability of a hit at or before the $k$-th toss.

So $$ \left\{ \matrix{ q_{\,3} = \mu _{\,1} \mu _{\,2} \mu _{\,3} = \left( {1 - \pi _{\,1} } \right)\left( {1 - \pi _{\,2} } \right)\left( {1 - \pi _{\,3} } \right) = q_{\,2} \left( {1 - \pi _{\,3} } \right) = \hfill \cr = {{n\left( {n - 1} \right)\left( {n - 2} \right)} \over {n^{\,3} }} = {{n^{\,\underline {\,3\,} } } \over {n^{\,3} }} \hfill \cr p_{\,3} = \left( {1 - \pi _{\,2} } \right)\pi _{\,3} = \left( {1 - \pi _{\,1} } \right)\left( {1 - \pi _{\,2} } \right)\pi _{\,3} = q_{\,2} \pi _{\,3} = \hfill \cr = q_{\,2} - q_{\,3} = {{n - 1} \over n}{2 \over n} \hfill \cr p_{\,3} \quad \ne \quad 1 - q_{\,3} \hfill \cr} \right. $$ and in general $$ \bbox[lightyellow] { \left\{ \matrix{ q_{\,k} = {{n^{\,\underline {\,k\,} } } \over {n^{\,k} }}\quad \left| {\;0 \le k} \right. \hfill \cr p_{\,k} = q_{\,k - 1} - q_{\,k} = q_{\,k - 1} {{k - 1} \over n} = {{\left( {k - 1} \right)\,n^{\,\underline {\,k - 1\,} } } \over {n^{\,k} }}\quad \left| {\;1 \le k} \right. \hfill \cr} \right. }$$ where $x^{\,\underline {\,k\,} }$ represents the Falling Factorial

From that we obtain $$ \left\{ \matrix{ q_{\,0} = q_{\,1} = 1 \hfill \cr q_{\,m} = 0\quad \left| {\;n < m} \right. \hfill \cr \sum\limits_{1\, \le \;k} {p_{\,k} } = \sum\limits_{1\, \le \;k\; \le \,n + 1} {p_{\,k} } = \sum\limits_{1\, \le \;k\; \le \,n + 1} {\left( {q_{\,k - 1} - q_{\,k} } \right)} = q_{\,0} - q_{\,n + 1} = 1 \hfill \cr} \right. $$ since it is clear that at toss $n+1$ we have a hit for sure, if we did not have before ($\pi_{n+1}=1$).

Coming to the computation of the expected value, we have $$ \bbox[lightyellow] { \eqalign{ & E(k;\,n) = \sum\limits_{1\, \le \;k\; \le \,n + 1} {k\,p(k,n)} = \sum\limits_{1\, \le \;k\; \le \,n + 1} {\,{{k\left( {k - 1} \right)\,n^{\,\underline {\,k - 1\,} } } \over {n^{\,k} }}} = \cr & = \sum\limits_{1\, \le \;k\; \le \,n + 1} {\left( {k\,q(k - 1,n) - k\,q(k,n)} \right)} = \cr & = \sum\limits_{1\, \le \;k\; \le \,n + 1} {\left( {\left( {k - 1} \right)\,q(k - 1,n) - k\,q(k,n) + q(k - 1,n)} \right)} = \cr & = - \left( {n + 1} \right)q(n + 1,n) + \sum\limits_{1\, \le \;k\; \le \,n + 1} {q(k - 1,n)} = \sum\limits_{0\, \le \;k\; \le \,n} {q(k,n)} \cr} }$$ where the change of notation is obvious.
It coincides with the answer already given by C. Blatter.

Now, $q(k,n)$ can be written as $$ q(k,n) = {{\,n^{\,\underline {\,k\,} } } \over {n^{\,k} }} = k!\left( \matrix{ n \cr k \cr} \right)\left( {{1 \over n}} \right)^{\,k} = \prod\limits_{0\, \le \;j\; \le \,k - 1} {\left( {1 - {j \over n}} \right)} = {{\,\Gamma (n + 1)} \over {n^{\,k} \,\Gamma (n - k + 1)}} $$ but unfortunately I could not find (and doubt there is) a closed formula for the sum.

An asymptotic approximation for large $n$ can, however, be constructed by various approaches, depending on your goals.
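One such approach: the sum is $1+Q(n)$, where $Q$ is Ramanujan's $Q$-function, whose standard expansion $Q(n)\sim\sqrt{\pi n/2}-1/3$ gives $E\approx\sqrt{\pi n/2}+2/3$ for large $n$. A Python sketch comparing exact and asymptotic values:

```python
from math import pi, sqrt

def expected_exact(n):
    """E = sum_{k=0}^{n} q(k, n), with q(k, n) = n(n-1)...(n-k+1) / n^k."""
    total, q = 0.0, 1.0
    for k in range(n + 1):
        total += q            # add q(k, n)
        q *= (n - k) / n      # advance to q(k+1, n)
    return total

def expected_asymptotic(n):
    """Leading-order approximation sqrt(pi*n/2) + 2/3."""
    return sqrt(pi * n / 2) + 2.0 / 3.0
```

Already at $n=365$ the two agree to within about $0.01$.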