Expectation of sampling uniformly

84 Views Asked by At

Consider the following situation, there are $m$ numbers $n_i$ in a set set $S$. At each time $t$, $k$ numbers are selected uniformly and without replacement and averaged. Then the points are put back into the set. What is the expected value of this average?

I believe it is:

$$ \frac{1}{m}\sum_{i=1}^m n_i $$

But I cannot prove it. Is anyone able to confirm this? Clearly there are $m \choose k$ possible sets each having the same probability of being chosen but I cannot get much further than that.

2

There are 2 best solutions below

0
On BEST ANSWER

Denote the numbers that are selected by $X_{1},\ldots,X_{k}$.

These are random variables with uniform distribution: $P\left\{ X_{j}=n_{i}\right\} =\frac{1}{m}$ for each $j$ and each $i$.

There is no independency but that is not relevant here.

Note that:

$\mathbb{E}X_{j}=\frac{1}{m}\sum_{i=1}^{m}n_{i}$ for each $j$.

$\overline{X}:=\frac{1}{k}\left(X_{1}+\cdots+X_{k}\right)$ denotes the average and:

$\mathbb{E}\overline{X}=\frac{1}{k}\left(\mathbb{E}X_{1}+\cdots+\mathbb{E}X_{k}\right)=\mathbb{E}X_{1}=\frac{1}{m}\sum_{i=1}^{m}n_{i}$

This because in general $\mathbb{E}\left(cY\right)=c\mathbb{E}Y$ and $\mathbb{E}\left(Y+Z\right)=\mathbb{E}Y+\mathbb{E}Z$.

0
On

Here's one way to do it:

Recall that $$ S\equiv\left\{ n_{1},n_{2},\ldots,n_{m}\right\} . $$ Consider the subsets of $S$ of size $k$ containing $n_{i}$ for some $i$. There are exactly $\left(\begin{array}{c} m-1\\ m-k \end{array}\right)=\left(\begin{array}{c} m-1\\ m-1-\left(k-1\right) \end{array}\right)=\left(\begin{array}{c} m-1\\ k-1 \end{array}\right)$ such subsets.

Now, consider a particular subset of $S$, $A$, with $\left|A\right|=k$. The mean of this subset is $$ \frac{1}{k}\sum_{a\in A}a. $$ Hence the expectation of the procedure you described is \begin{align*} E & \equiv\frac{1}{\left(\begin{array}{c} m\\ k \end{array}\right)}\sum_{\substack{A\subset S\\ \left|A\right|=k } }\frac{1}{k}\sum_{a\in A}a\\ & =\frac{1}{\left(\begin{array}{c} m\\ k \end{array}\right)k}\sum_{\substack{A\subset S\\ \left|A\right|=k } }\sum_{a\in A}a. \end{align*} We established that each $n_{i}$ appears in one of the subsets of size $k$ exactly $\left(\begin{array}{c} m-1\\ k-1 \end{array}\right)$ times. Therefore, the sum above can be simplified to \begin{align*} E & =\frac{1}{\left(\begin{array}{c} m\\ k \end{array}\right)k}\sum_{a\in S}\left(\begin{array}{c} m-1\\ k-1 \end{array}\right)a\\ & =\frac{\left(\begin{array}{c} m-1\\ k-1 \end{array}\right)}{\left(\begin{array}{c} m\\ k \end{array}\right)k}\sum_{a\in S}a \end{align*} and you can verify that $$\frac{\left(\begin{array}{c} m-1\\ k-1 \end{array}\right)}{\left(\begin{array}{c} m\\ k \end{array}\right)k}=\frac{1}{m}.$$