A combinatorics problem in statistics

56 Views Asked by At

Let $N$ be an integer greater than $100$ and let $m$ be a positive integer less than $N$.

Consider a population of $N$ numbers $X_1, X_2, \ldots, X_N$.

Randomly pick $m$ numbers from the population with replacement to form a sample. For every $2$ numbers, say $X_i$ and $X_j$ in the sample, find its product $X_iX_j$. (I know there are in total $\binom{m}{2} X_iX_j$'s found in one sample).

Repeat the process by picking another sample and find another $\binom{m}{2} X_iX_j$'s until all possible different samples are found (I know that are $N^m$ samples in total).

I am now stuck to find out, among the $(N^m)\binom{n}{2}X_iX_j$, how many of them are in fact $(X_i)^2$, i.e. $i=j$?

This question appeared when I tried to show that the unbiased variance of the sample is indeed unbiased. I don't want to use any formula of expectation but but prove the result from scratch.

I uploaded my work for your reference. Many many thanks.

enter image description here

1

There are 1 best solutions below

2
On

Pick one number. For instance $X_1$ and wonder how many samples exist in which this number is present exactly $k$ times. In such samples we find $\binom{k}2$ products of the shape $X_1^2$.

There are $\binom{m}{k}$ spots for the number $X_1$ and on the other spots another number must stand so that there are $\binom{m}{k}(N-1)^{m-k}$ samples that contain $X_1$ exactly $k$ times.

This leads to a total of $$N\sum_{k=2}^m\binom{m}{k}\binom{k}2(N-1)^{m-k}=\frac12Nm(m-1)\sum_{k=2}^m\binom{m-2}{k-2}(N-1)^{m-k}=$$$$\frac12Nm(m-1)\sum_{k=0}^{m-2}\binom{m-2}{k}(N-1)^{m-2-k}=\frac12Nm(m-1)N^{m-2}=\frac12N^{m-1}m(m-1)$$products of the form $X_i^2$.


I hope I understood well and did not make any mistakes. Further I would not be surprised if a more elegant method exists.