How to square equations involving random variables.

139 Views Asked by At

There are $n$ bottomless slots, meaning they can each hold as many balls as needed. I throw balls in a way that each ball is equally likely to land in any of them. How many throws will I need to have at least one ball in all the slots?

Now, let's say the random variable describing the throws needed is $T_m$ when $m$ slots already have balls and $n-m$ are empty. It's easy to see that $T_m$ satisfies the following recurrence:

$$T_m = 1 + I\left(\frac{m}{n}\right)T_m+\left(1-I\left(\frac{m}{n}\right)\right)T_{m+1} \tag{1}$$

Here, $I(q)$ is a Bernoulli random variable.

This simplifies to:

$$T_m\left(1-I\left(\frac{m}{n}\right)\right)=1+T_{m+1}\left(1-I\left(\frac{m}{n}\right)\right) \tag{1}$$

We are interested in $T_0$ and it's not hard to see $E(T_0)=n\sum\frac{1}{k}$.

But I'm more interested in $T_{n-1}$. This is a geometric random variable with $p=\frac{1}{n}$. So, we get:

$$E(T_{n-1}) = \frac{1}{p} = n \tag{2}$$

Substitute $m=n-1$ in equation (1) and using the fact that $T_n=0$, we get: $$E(T_{n-1})=n$$

and this is the same as equation (2). So far, so good. The problem is when I try to find the variance of $T_{n-1}$. For this, I square equation (1). After some algebra and taking expectations, I get:

$$E(T_m^2) = \frac{n}{n-m}+E(T_{m+1}^2)+2E(T_{m+1})$$

Now substitute $m=n-1$ as before.

$$E(T_{n-1}^2) = n + E(T_n^2) + 2E(T_{n})$$

But we know that $T_n=0$. So this gives us:

$$E(T_{n-1}^2)=n$$

But this can't be right since $E(T_{n-1})=n$ as well and the two equations above would make the variance negative.

What am I missing here?

2

There are 2 best solutions below

6
On BEST ANSWER

For simplicity let $S=T_{n-1}$. The correct results are as follows, where $S'$ and $S$ are identically distributed.

$$S=1.\frac{1}{n}+(1+S')\frac{n-1}{n}$$

$$S^2=1.\frac{1}{n}+(1+S')^2\frac{n-1}{n}$$

These give $E(S)=n$ and $E(S^2)=2n^2-n$.

So, what has gone wrong in your derivation? In essence, you cannot square the equation in the way you have done. Just to pick up one problem, the $T_{n-1}$s on each side of the equation are not equal; they are different random variables with the same expectation. Therefore you can only manipulate them in the way you appear to have done once you have taken the expectation.

0
On

The answer by @S. Dolan was very helpful. But I think it might be possible to square equations involving random variables after all, provided we are careful which of terms are clones (i.i.d) and which are mirror images (literally the same thing). Take equation (1) from the question. We have:

$$T_{m}-1 = I\left(\frac{m}{n}\right)T_m'+\left(1-I\left(\frac{m}{n}\right)\right)T_{m+1}$$

Here, $T_m$ and $T_{m}'$ are clones (i.i.d random variables) as S Dolan pointed out. But, the two $I(\frac{m}{n})$ terms are mirror images, completely identical. This means that $I(\frac{m}{n})(1-I(\frac{m}{n}))=0$. Once we take note of this, all cross terms will cancel out. Also, squaring a Bernoulli results in the same Bernoulli. This leads us to the correct recurrence:

$$(T_{m}-1)^2 = I\left(\frac{m}{n}\right)T_m'^2+\left(1-I\left(\frac{m}{n}\right)\right)T_{m+1}^2$$