Problem of probability distribution

440 Views Asked by At

A box contains $N$ tickets numbered $1, 2, 3,...., N$. If $m$ tickets are drawn one by one from the box without replacement, then find the mean of the sum of the numbers obtained on the tickets drawn.

I have approached the sum as below.

Let $X_i$ denote the number on the $i$th ticket drawn, where $i= 1, 2,..., m$.

The sum of the numbers obtained on the tickets drawn is $S= \sum_{i=1}^{m}X_i$

Hence, the required mean $= E(S) =E(\sum_{i=1}^{m}X_i) =\sum_{i=1}^{m}E(X_i).$

Each $X_i$ can take the values $1, 2,..., N$ with probability $\frac{1}{N}$. Then $E(X_i)= \frac{N+1}{2}$

So, $E(S)= \frac{m(N+1)}{2}$

But my doubt is in the above line 'Each $X_i$ can take the values $1, 2,..., N$ with probability $\frac{1}{N}$.' Because when the drawing is done without replacement, after each draw, the number of tickets remaining decreases by 1. So the number of values left for $X_2$ is $N-1$, not $N$. So how can the probability be $\frac{1}{N}$?

Will anyone please explain where is the mistake? Thanks in advance.

2

There are 2 best solutions below

0
On BEST ANSWER

So how can the probability be $\frac1N$?

The probability is indeed not $\frac1N$ if you have drawn at least one ticket. You have to look at the probability to draw a combination. It is a good idea is to look on a small example.

Let´s say a box contains $3$ tickets numbered $1,2,3$. The probabilities to draw number $i$ at the first draw are $P(X_1=1)=P(X_1=2)=P(X_1=3)=\frac13$.

Thus $\mathbb E(X_1)=\sum\limits_{i=1}^3 p(x_i)\cdot x_i=\frac13\cdot 1+\frac13\cdot 2+\frac13\cdot 3=2$

Now what is $\mathbb E(X_2)$?

The probability to draw the combinations $(i,j)$ are

$P(X_2=(1,2))=\frac13\cdot \frac12\cdot 2=\frac13$, $P(X_2=(1,3))=\frac13\cdot \frac12\cdot 2=\frac13$, $P(X_2=(2,3))=\frac13\cdot \frac12\cdot 2=\frac13$

The corresponding means are $1.5,2,2.5$. It follows that the expected value is

$\mathbb E(X_2)=\frac13\cdot (1.5+2+2.5)=2$

It is much more difficult to make a proof for an arbitrary $N$. But you can use the Linearity of expectation.

For a uniform, discrete random variable we have $\mathbb E(X_1)=\frac{N+1}2$. Thus $\mathbb E(X_i)=\frac{N+1}2 \ \ \forall \ 1\leq i\leq N$

0
On

Your calculation is actually correct, though it is not trivial to see it.

On one hand your argument is right, after the first number is drawn, the second number ($X_2$) only has one of $N-1$ possible values. Even worse, if $X_1=1$, then the expected value of $X_2$ will be $\frac{2+3+\ldots+N}{N-1}=\frac{N^2+N-2}{2(N-1)} > \frac{N+1}2.$

OTOH, the first number could also be $X_1=N$, making the expected value for $X_2$ smaller than $\frac{N+1}2$. In the end, all of this will balance out. It should be pretty intuitive to see that if you don't consider what happens before, the value of $X_i$ can be $1,2,\ldots,N$, each with the same probability $1 \over N$. Overall, there is no higher probability to draw a $4$ than a $7$ on the third draw, for example.

The step that most people have problems with is when you wrote

$$E(\sum_{i=1}^mX_i) = \sum_{i=1}^mE(X_i)$$

This is correct, but most people's intuition is against this, because they say "the $X_i$ are dependend". It turns out, using the integral definition, that the equation is true even if the $X_i$ are dependend (which they of course are in our case).